User Guide
version: 1.0.0 status: active
The User Guide is the narrative companion to the auto-generated API reference. It explains the why and when for each top-level module, not just the how. Read these pages first to decide which gunz-cm module is right for your task, then consult the API reference for parameter details.
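Pipeline overview
gunz-cm follows a five-stage pipeline: load → preprocess → convert → visualize → reconstruct/metrics. The table below maps each stage to the modules that implement it, and adds two rows that sit outside the five stages: optional ML training and workflow orchestration.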
| Stage | Modules | Typical workflow |
|---|---|---|
| Load | :doc: | |
| Preprocess | :doc: | Filter empty bins, apply KR/ICE balancing |
| Convert | :doc: | Write to COO, GZCM, or MEMMAP |
| Visualize | :doc: | Render contact maps and 3D structures |
| Reconstruct / metrics | :doc: | Infer 3D structures, score quality |
| ML training (optional) | :doc: | Train a super-resolution model |
| Workflow orchestration | :doc: | Chain steps into a reproducible recipe |
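To make the stage ordering concrete, the following is a minimal, library-independent sketch of two of the stages (preprocess, then convert) applied to a toy dense matrix. It does not call gunz-cm: the function names `filter_empty_bins` and `to_coo` are hypothetical, and the COO layout shown (upper-triangle `(row, col, count)` triplets) is an assumption about what a COO export typically looks like, not a description of the gunz-cm format.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def filter_empty_bins(matrix: Matrix) -> Tuple[Matrix, List[int]]:
    """Drop bins whose row sums to zero (hypothetical helper, not gunz-cm API).

    Returns the reduced matrix and the original indices of the bins kept.
    Assumes a symmetric matrix, so an empty row implies an empty column.
    """
    keep = [i for i, row in enumerate(matrix) if sum(row) > 0]
    reduced = [[matrix[i][j] for j in keep] for i in keep]
    return reduced, keep


def to_coo(matrix: Matrix) -> List[Tuple[int, int, float]]:
    """Convert a symmetric dense matrix to upper-triangle COO triplets."""
    size = len(matrix)
    return [
        (i, j, matrix[i][j])
        for i in range(size)
        for j in range(i, size)
        if matrix[i][j] != 0
    ]


# Toy 4-bin symmetric contact matrix; bin 2 has no contacts at all.
dense = [
    [4.0, 2.0, 0.0, 1.0],
    [2.0, 6.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 3.0, 0.0, 5.0],
]

filtered, kept_bins = filter_empty_bins(dense)  # preprocess stage
triplets = to_coo(filtered)                     # convert stage
```

The triplet indices refer to positions in the filtered matrix; `kept_bins` maps them back to the original bin numbers.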
Module index
Entry-point trio
Every new user starts here: the root namespace first, then the trio of cli, loaders, and converters.
- :doc:root - The top-level gunz_cm namespace: load_cm_data, ContactMatrix, Balancing, Format, exceptions.
- :doc:cli - Command-line interface: gunz-cm loaders, gunz-cm converters, gunz-cm recon.
- :doc:loaders - File format loaders: .hic, .cool, .mcool, CSV, GZCM, PICKLE/NPY.
- :doc:converters - Format conversion: HIC→COOL, COO text, GZCM v1/v2/v3, MEMMAP.
Preprocessing layer
- :doc:preprocs - Filter, downsample, transform contact matrices and 3D points.
- :doc:normalization - KR, ICE, VC matrix balancing algorithms.
- :doc:structs - Core data types (Region, Constant, ClosedInterval). Deprecated; use gunz_utils.
- :doc:utils - Internal helpers. Use with care; these are not part of the stable public API.
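For readers unfamiliar with matrix balancing, the sketch below illustrates the general idea behind ICE (iterative correction): repeatedly rescale rows and columns by their relative coverage until every non-empty bin has the same total. It is a plain-Python illustration under simplifying assumptions (dense symmetric input, no masking of low-coverage bins) and is not the gunz-cm implementation; the name `ice_balance` and its parameters are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def ice_balance(
    matrix: Matrix, max_iter: int = 200, tol: float = 1e-10
) -> Tuple[Matrix, List[float]]:
    """Balance a symmetric matrix so all non-empty rows have equal sums.

    Returns the balanced matrix and the per-bin bias vector, such that
    raw[i][j] is approximately balanced[i][j] * bias[i] * bias[j].
    """
    n = len(matrix)
    balanced = [row[:] for row in matrix]  # work on a copy
    bias = [1.0] * n
    for _ in range(max_iter):
        row_sums = [sum(row) for row in balanced]
        covered = [s for s in row_sums if s > 0]
        if not covered:
            break  # nothing to balance
        mean_sum = sum(covered) / len(covered)
        # Relative coverage per bin; empty bins get a neutral factor of 1.
        factors = [s / mean_sum if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
        if max(abs(f - 1.0) for f in factors) < tol:
            break  # all coverages are (numerically) equal
    return balanced, bias


raw = [
    [4.0, 2.0, 1.0],
    [2.0, 6.0, 3.0],
    [1.0, 3.0, 5.0],
]
balanced, bias = ice_balance(raw)
row_sums = [sum(row) for row in balanced]
```

After balancing, the entries of `row_sums` agree to within numerical tolerance, and multiplying a balanced entry by the two corresponding bias values recovers the raw count.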
Storage layer
- :doc:compressions - 6 codecs for GZCM v3 tile compression (BSC, CMC, CMC_ZSTD, ZSTD, BSCM_CMC variants).
- :doc:io - Low-level GZCM readers/writers: GzcmReader, GzcmWriter, GzcmChunkedReader/Writer.
- :doc:samplers - SpatialBatchSampler for PyTorch DataLoader on genomic data.
- :doc:datasets - PyTorch Dataset implementations (HiCSparseDataset, GzcmDataset).
Analysis layer
- :doc:metrics - Reconstruction metrics (Procrustes RMSE, R^2) and resolution-enhancement metrics (MSE, SSIM, PSNR).
- :doc:pipeline - Workflow orchestration: Pipeline class, create_pipeline factory.
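As a reference point for the resolution-enhancement metrics named above, here is a small sketch of MSE and PSNR using their standard textbook definitions. It is not the gunz-cm metrics API: the function names and the `data_range` parameter are assumptions made for illustration, and the default range (max minus min of the reference) is one common convention among several.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def mse(reference: Matrix, estimate: Matrix) -> float:
    """Mean squared error over all entries of two equally shaped matrices."""
    count = sum(len(row) for row in reference)
    total = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    return total / count


def psnr(
    reference: Matrix, estimate: Matrix, data_range: Optional[float] = None
) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf  # identical inputs
    if data_range is None:
        # Assumed convention: dynamic range of the reference matrix.
        flat = [value for row in reference for value in row]
        data_range = max(flat) - min(flat)
    return 10 * math.log10(data_range ** 2 / error)


reference = [[0.0, 4.0], [4.0, 0.0]]
estimate = [[0.0, 3.0], [3.0, 1.0]]

error = mse(reference, estimate)   # squared errors 0, 1, 1, 1 -> 0.75
score = psnr(reference, estimate)  # uses data_range = 4.0
```

Higher PSNR means the estimate is closer to the reference; identical matrices yield infinity under this definition.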
ML + UI layer
- :doc:reconstructions - 3D structure inference: classical MDS, PO-MDS. GPU support via torch.
- :doc:resolution_enhancements - Super-resolution datasets and transforms. Requires the ren or ren-gpu extra.
- :doc:visualizations - Display helpers (2D) and 3D structure plotters (Plotly WebGL).
How to use this guide
If you are new to gunz-cm, start with :doc:root to understand the top-level API, then read the Entry-point trio in order. The first tutorial in :doc:../tutorials/index (Load HIC) is the recommended first hands-on exercise.
If you are migrating from cooler or scvi-tools, focus on :doc:loaders and :doc:converters — these cover the same surface area with different conventions.
If you are building a custom pipeline (e.g., a multi-sample analysis), read :doc:pipeline after the entry-point trio.
If you are debugging an error, check the Known issues section at the bottom of the relevant module page first. Most known bugs are documented there with workarounds.
Where to go next
- :doc:../tutorials/index - Step-by-step tutorials (load → convert → visualize → random downsample)
- :doc:../concepts - Core concepts: lazy loading, balanced vs raw counts, region syntax
- :doc:../changelog - Version history and breaking changes
- :doc:../gunz_cm - Full auto-generated API reference