Getting Started with Gunz-CM
Welcome to Gunz-CM, a high-performance Python library for loading, manipulating, and analyzing genomic contact matrices from Hi-C and chromatin conformation capture experiments.
This page helps you pick the right starting point based on what you
want to do. Each workflow links to a step-by-step tutorial in
notebooks/ that you can run locally with Jupyter.
Three common starting points
1. “I have a .hic / .cool file and want a matrix”
If you’re starting from raw Hi-C data, the loaders API gives you a
single entry point for all formats. The unified load_cm_data(...)
function dispatches to the right format-specific loader based on file
extension.
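Extension-based dispatch of this kind can be pictured as a small lookup table. The sketch below is illustrative only: `pick_loader` and the loader names in `LOADERS` are hypothetical placeholders, not gunz-cm's actual internals. Only the file extensions are taken from the tutorials listed on this page.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a loader name.
# These names are placeholders, not gunz-cm functions.
LOADERS = {
    ".hic": "hic_loader",
    ".cool": "cooler_loader",
    ".mcool": "cooler_loader",
    ".csv": "csv_loader",
    ".npy": "numpy_loader",
}


def pick_loader(fpath: str) -> str:
    """Return the (hypothetical) loader name for a file, based on its suffix."""
    suffix = Path(fpath).suffix.lower()
    try:
        return LOADERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None


print(pick_loader("data/sample.mcool"))  # cooler_loader
```

Keeping the table separate from the lookup function means a new format only needs one extra entry, which is one plausible reason to expose a single entry point.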
Tutorials (in notebooks/, run with jupyter lab):
tutorial_load_hic.ipynb: load HIC/COOL files; understand region selection; covers the v2.7.0 region1-optional and v2.8.0 chrom name normalization features
tutorial_load_cooler.ipynb: multi-resolution .mcool workflows; iterate over all zoom levels
tutorial_load_csv_coo.ipynb: sparse CSV/COO text files (with the base-pair vs bin-id gotcha documented)
tutorial_load_gzcm.ipynb: GZCM v1/v2/v3 read+write
tutorial_load_pickle_npy.ipynb: pickled DataFrames and raw NumPy arrays
Quick code sketch:
from gunz_cm.loaders import load_cm_data, get_resolutions, get_chrom_infos
from gunz_cm.consts import DataStructure
# 1. Inspect the file
print(get_resolutions("data/sample.mcool")) # [1000, 5000, 10000, ...]
print(get_chrom_infos("data/sample.mcool")) # {'chr1': {...}, ...}
# 2. Load a region
cm = load_cm_data(
    fpath="data/sample.mcool",
    resolution=10_000,                 # 10 kb
    region1="chr1",
    region2="chr1",
    output_format=DataStructure.COO,   # sparse scipy matrix
)
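The CSV/COO tutorial listed above flags a base-pair vs bin-id gotcha. As a rough illustration, assuming fixed-width bins numbered from zero (an assumption; this page does not state gunz-cm's convention), the two coordinate systems relate as shown below. `bp_to_bin` and `bin_to_bp_range` are hypothetical helper names, not part of the library.

```python
def bp_to_bin(position_bp: int, resolution: int) -> int:
    """Map a 0-based base-pair position to its 0-based bin index."""
    if position_bp < 0 or resolution <= 0:
        raise ValueError("position must be >= 0 and resolution must be > 0")
    return position_bp // resolution


def bin_to_bp_range(bin_id: int, resolution: int) -> tuple[int, int]:
    """Return the half-open base-pair interval [start, end) covered by a bin."""
    start = bin_id * resolution
    return start, start + resolution


resolution = 10_000  # 10 kb, as in the sketch above
print(bp_to_bin(1_234_567, resolution))  # 123
print(bin_to_bp_range(123, resolution))  # (1230000, 1240000)
```

Mixing the two up silently shifts every contact by a factor of the resolution, so it is worth checking which one a text file uses before loading it.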
2. “I want to normalize / filter / process the matrix”
Preprocessing in gunz-cm is a pipeline of functional transformations: load → filter (remove bad bins) → balance (KR/ICE) → analyze. Each step has a dedicated module.
Tutorials (in notebooks/):
tutorial_filter_normalize.ipynb: filter empty bins, then KR-normalize
tutorial_balance_kr.ipynb: KR and ICE balancing in detail, with before/after heatmaps
tutorial_downsample.ipynb: manual binning for resolution enhancement
tutorial_multi_resolution.ipynb: compare a contact matrix at multiple resolutions
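The filter and balance steps of the pipeline can be sketched in plain NumPy. This is not gunz-cm's implementation or API: `filter_empty_bins` and `ice_balance` are hypothetical names, and the loop is a simplified form of iterative correction (ICE) applied to a small dense symmetric matrix.

```python
import numpy as np


def filter_empty_bins(matrix: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Drop rows/columns whose total contact count is zero.

    Returns the filtered matrix and the boolean mask of kept bins.
    """
    keep = matrix.sum(axis=0) > 0
    return matrix[np.ix_(keep, keep)], keep


def ice_balance(matrix: np.ndarray, n_iter: int = 200, tol: float = 1e-8) -> np.ndarray:
    """Simplified iterative correction: rescale until all row sums are equal."""
    balanced = matrix.astype(float).copy()
    for _ in range(n_iter):
        row_sums = balanced.sum(axis=1)
        # Dividing by the mean keeps the overall scale of the matrix stable.
        scale = row_sums / row_sums.mean()
        balanced /= np.outer(scale, scale)
        if np.abs(scale - 1.0).max() < tol:
            break
    return balanced


# Toy symmetric matrix in which bin 2 has no contacts at all.
raw = np.array([
    [10, 4, 0, 2],
    [4, 8, 0, 6],
    [0, 0, 0, 0],
    [2, 6, 0, 12],
], dtype=float)

filtered, keep = filter_empty_bins(raw)
balanced = ice_balance(filtered)
print(keep.tolist())         # [True, True, False, True]
print(balanced.sum(axis=1))  # row sums are (nearly) identical after balancing
```

Filtering comes first because an all-zero row has a row sum of zero, which would make the rescaling step divide by zero.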
3. “I want to do 3D structure reconstruction or compartment analysis”
Downstream analysis of contact matrices includes 3D chromosome
reconstruction (MDS-based methods) and A/B compartment calling. The
reconstructions and visualizations modules provide the core
functions, plus quality-assessment metrics.
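As a rough picture of what an MDS-based reconstruction does, the sketch below converts contact counts to distances and embeds them in 3D using classical (Torgerson) MDS in plain NumPy. It is a stand-in chosen for illustration, not gunz-cm's method; the reconstruction tutorial listed on this page is described as using scikit-learn's MDS. The inverse power-law conversion and both helper names are assumptions.

```python
import numpy as np


def contacts_to_distances(contacts: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Assumed conversion: distance = count ** -alpha (off-diagonal, counts > 0)."""
    dist = np.zeros_like(contacts, dtype=float)
    off_diag = ~np.eye(len(contacts), dtype=bool)
    dist[off_diag] = contacts[off_diag] ** -alpha
    return dist


def classical_mds(dist: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Embed a symmetric distance matrix via double centering + eigendecomposition."""
    n = len(dist)
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


# Synthetic example: a short helix, with contacts defined as 1 / distance.
t = np.linspace(0, 4 * np.pi, 20)
true_xyz = np.column_stack([np.cos(t), np.sin(t), 0.2 * t])
true_dist = np.linalg.norm(true_xyz[:, None] - true_xyz[None, :], axis=-1)
contacts = np.zeros_like(true_dist)
mask = ~np.eye(len(t), dtype=bool)
contacts[mask] = 1.0 / true_dist[mask]

coords = classical_mds(contacts_to_distances(contacts))
recovered = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
print(coords.shape)                       # (20, 3)
print(np.allclose(recovered, true_dist))  # True
```

The recovered coordinates match the original only up to rotation, reflection, and translation, which is why the example compares pairwise distances rather than raw coordinates.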
Tutorials (in notebooks/):
tutorial_3d_reconstruction.ipynb: basic 3D reconstruction from a contact matrix (uses sklearn.manifold.MDS since scipy.mdscale was removed)
tutorial_3d_quality_check.ipynb: Procrustes alignment, RMSE, and shape similarity between two reconstructed structures
tutorial_compartments.ipynb: A/B compartment analysis and PC1 computation
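The quality check described for tutorial_3d_quality_check.ipynb can be pictured as follows. This minimal NumPy sketch is not the tutorial's code: it centers and unit-scales both structures, finds the best orthogonal alignment by SVD (orthogonal Procrustes), and reports the RMSE after alignment. `procrustes_rmse` is a hypothetical name.

```python
import numpy as np


def procrustes_rmse(reference: np.ndarray, target: np.ndarray) -> float:
    """RMSE between two (n, 3) structures after optimal superposition.

    Both structures are centered and scaled to unit Frobenius norm, so the
    result ignores translation, uniform scaling, and rotation (reflections
    are allowed, as in ordinary orthogonal Procrustes).
    """
    def standardize(xyz: np.ndarray) -> np.ndarray:
        centered = xyz - xyz.mean(axis=0)
        return centered / np.linalg.norm(centered)

    ref, tgt = standardize(reference), standardize(target)
    # Orthogonal matrix R minimizing ||tgt @ R - ref||_F.
    u, _, vt = np.linalg.svd(tgt.T @ ref)
    aligned = tgt @ (u @ vt)
    return float(np.sqrt(np.mean(np.sum((aligned - ref) ** 2, axis=1))))


rng = np.random.default_rng(0)
structure = rng.normal(size=(50, 3))

# Quarter turn about z, then scale and shift: the same shape in a new pose.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = 2.5 * structure @ rot_z + np.array([1.0, -2.0, 3.0])

print(procrustes_rmse(structure, moved) < 1e-10)                    # True
print(procrustes_rmse(structure, rng.normal(size=(50, 3))) > 0.01)  # True
```

Alignment is needed before computing RMSE because two reconstructions of the same chromosome generally come out in arbitrary orientations.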
What you need to get started
Before running the tutorials, make sure you have:
Python 3.11+ (gunz-cm does not support earlier versions)
gunz-cm installed in a virtual environment (see :doc:`installation` for setup instructions)
Jupyter to run the tutorial notebooks:
pip install jupyter
(Optional) Sample data: most tutorials generate synthetic data inline, so you can run them without any real Hi-C files. To use your own data, see :doc:`quickstart` for the file format requirements.
Running a tutorial
The tutorials live in notebooks/ at the root of the gunz-cm
repository. They are not rendered on the website (a known ecosystem
limitation with myst_parser + myst_nb — see
docs/MANUAL_UPLOAD.md for details). To run one:
# Clone the repository (if you haven't already)
git clone https://github.com/sXperfect/gunz-cm.git
cd gunz-cm
# Make sure your virtual environment is active
mamba activate gunz_cm
# Launch Jupyter
jupyter lab notebooks/
Then open any tutorial_*.ipynb file and run the cells. Each tutorial
is self-contained: it generates its own synthetic data and runs
end-to-end in 1-3 minutes.
You can also browse the tutorials on GitHub without running them locally: https://github.com/sXperfect/gunz-cm/tree/main/notebooks
Where to go next
:doc:`installation`: full installation guide with mamba/venv options
:doc:`quickstart`: minimal 5-minute working example
:doc:`concepts`: mental model behind the library (unified ContactMatrix facade, lazy loading, preprocessing pipeline)
:doc:`modules`: full API reference (autodoc-generated from docstrings)
Need help?
GitHub issues: https://github.com/sXperfect/gunz-cm/issues
Live docs: https://gunz-cm.pages.dev/
Source: https://github.com/sXperfect/gunz-cm