gunz_cm package

Submodules

gunz_cm.consts module

Defines shared constants, enumerations, and data structures for the library.

This module centralizes common values used throughout the library, including DataFrame column names, data types, supported file formats, and standard genomic build information. Importing these values from one place keeps names consistent across modules and simplifies maintenance.

class gunz_cm.consts.Backend(value)

Bases: BaseStrEnum

Enumeration for interaction matrix loader backends.

COOLER = 'cooler'
HICSTRAW = 'hicstraw'
HICTK = 'hictk'
STRAW = 'straw'

class gunz_cm.consts.Balancing(value)

Bases: BaseStrEnum

Enumeration for matrix balancing (normalization) methods.

KR = 'KR'
NONE = 'NONE'
VC = 'VC'
VC_SQRT = 'VC_SQRT'

class gunz_cm.consts.BpFrag(value)

Bases: BaseStrEnum

Enumeration for binning units (Base Pairs vs. Fragments).

BP = 'BP'
FRAG = 'FRAG'

class gunz_cm.consts.Counts(value)

Bases: BaseStrEnum

Enumeration for different types of interaction counts.

EXPECTED = 'expected'
OBSERVED = 'observed'
OE = 'oe'

gunz_cm.consts.DS

alias of DataStructure

class gunz_cm.consts.DataStructure(value)

Bases: BaseStrEnum

Enumeration for in-memory data representations.

COO = 'coo'
DF = 'df'
RC = 'rc'
RCV = 'rcv'

class gunz_cm.consts.Format(value)

Bases: BaseStrEnum

Enumeration for supported file formats.

Uses BaseStrEnum for case-insensitivity and aliases.
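
The BaseStrEnum implementation itself is not reproduced in this reference. As a rough sketch, assuming it resolves values case-insensitively through the standard Enum._missing_ hook, a stand-in could look like the following. SketchStrEnum and SketchFormat are hypothetical names, only three of the documented Format members are copied, and alias handling is not covered.

```python
from enum import Enum


class SketchStrEnum(str, Enum):
    """Hypothetical stand-in for BaseStrEnum (assumed behavior, not the library's code)."""

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook when a direct value lookup fails;
        # retry the lookup with a case-insensitive comparison.
        if isinstance(value, str):
            for member in cls:
                if member.value.lower() == value.lower():
                    return member
        return None  # Enum then raises ValueError for unknown values


class SketchFormat(SketchStrEnum):
    # Values copied from the documented Format members.
    COOLER = "cooler"
    HIC = "hic"
    MEMMAP = "npdat"


fmt = SketchFormat("COOLER")     # resolved through _missing_
same = SketchFormat("cooler")    # resolved by direct value lookup
is_str_equal = fmt == "cooler"   # the str mixin allows plain string comparison
```

With this approach, user input such as a file extension or a command-line flag can be passed to the enum constructor without normalizing its case first, and unknown values still raise ValueError.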

COO = 'coo'
COOLER = 'cooler'
CSV = 'csv'
GINTERACTIONS = 'ginteractions'
HIC = 'hic'
MCOO = 'mcoo'
MCSV = 'mcsv'
MEMMAP = 'npdat'
NPY = 'npy'
PICKLE = 'pickle'
TSV = 'tsv'

class gunz_cm.consts.GenomeBuild(value)

Bases: BaseStrEnum

Enumeration for standard genome builds.

HG19 = 'hg19'
HG38 = 'hg38'
MM10 = 'mm10'
MM9 = 'mm9'

gunz_cm.exceptions module

Custom exception classes for the gunz_cm package.

exception gunz_cm.exceptions.ConversionFailedError(region: str, message: str = 'Conversion failed')

Bases: ConverterError

Exception raised when a conversion process fails.

Parameters:
  • region (str) – The region string for which the conversion failed.

  • message (str, optional) – A custom error message.

exception gunz_cm.exceptions.ConverterError

Bases: GunzCMError

Base class for exceptions in the converters module.

exception gunz_cm.exceptions.DataResolutionError

Bases: LoaderError

Exception raised when there’s an issue with data resolution.

exception gunz_cm.exceptions.DatasetError

Bases: GunzCMError

Base class for exceptions in the datasets module.

exception gunz_cm.exceptions.FormatError

Bases: LoaderError

Exception raised for format-related errors.

exception gunz_cm.exceptions.GunzCMError

Bases: Exception

Base class for all custom exceptions in the gunz_cm package.

Examples

>>> try:
...     raise GunzCMError("A matrix error occurred")
... except GunzCMError as e:
...     print(e)
A matrix error occurred

exception gunz_cm.exceptions.IOError

Bases: GunzCMError

Base class for exceptions related to input/output operations.

exception gunz_cm.exceptions.InvalidRegionFormatError(region: str, message: str = 'Invalid region format')

Bases: LoaderError

Exception raised for errors in the input region format.

Parameters:
  • region (str) – The invalid region string that caused the error.

  • message (str, optional) – A custom error message.
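
Because InvalidRegionFormatError derives from LoaderError, a handler written for the base class also catches it. The sketch below illustrates this with local stand-in classes that mirror the documented names and signature, so it runs without the package installed. The parse_region helper, the stored region attribute, and the message formatting are all assumptions made for illustration.

```python
class GunzCMError(Exception):
    """Stand-in for the documented package-wide base class."""


class LoaderError(GunzCMError):
    """Stand-in for the documented loaders base class."""


class InvalidRegionFormatError(LoaderError):
    """Stand-in mirroring the documented constructor signature."""

    def __init__(self, region: str, message: str = "Invalid region format"):
        # Assumption: the real class may store and format these differently.
        self.region = region
        self.message = message
        super().__init__(f"{message}: {region!r}")


def parse_region(region: str) -> tuple[str, int, int]:
    """Hypothetical helper: parse 'chrom:start-end' or raise."""
    chrom, sep, span = region.partition(":")
    start, dash, end = span.partition("-")
    if not (chrom and sep and dash and start.isdigit() and end.isdigit()):
        raise InvalidRegionFormatError(region)
    return chrom, int(start), int(end)


try:
    parse_region("chr1:100_200")  # underscore instead of a dash
except LoaderError as err:  # the base-class handler also catches the subclass
    caught = err
```

Catching LoaderError keeps calling code short when any loader failure should be handled the same way, while the more specific class remains available when the offending region string needs to be reported.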

exception gunz_cm.exceptions.LoaderError

Bases: GunzCMError

Base class for exceptions in the loaders module.

exception gunz_cm.exceptions.MetricError

Bases: GunzCMError

Base class for exceptions in the metrics module.

exception gunz_cm.exceptions.PreprocError

Bases: GunzCMError

Base class for exceptions in the preprocs module.

exception gunz_cm.exceptions.ReconstructionError

Bases: GunzCMError

Base class for exceptions in the reconstructions module.

exception gunz_cm.exceptions.UnsupportedLoaderFeatureError(feature: str, loader_name: str)

Bases: LoaderError

Exception raised when a loader does not support a requested feature.

Parameters:
  • feature (str) – The name of the unsupported feature.

  • loader_name (str) – The name of the loader that does not support the feature.
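
One plausible use of this exception is to fall back to another loader when the preferred one lacks a feature. The following is a minimal sketch under that assumption, using local stand-in classes with the documented names and signature. The load_with_fallback, picky_loader, and flexible_loader functions are hypothetical, as are the stored attributes and the message text.

```python
class GunzCMError(Exception):
    """Stand-in for the documented package-wide base class."""


class LoaderError(GunzCMError):
    """Stand-in for the documented loaders base class."""


class UnsupportedLoaderFeatureError(LoaderError):
    """Stand-in mirroring the documented constructor signature."""

    def __init__(self, feature: str, loader_name: str):
        # Assumption: attribute names and message text are illustrative.
        self.feature = feature
        self.loader_name = loader_name
        super().__init__(f"Loader {loader_name!r} does not support {feature!r}")


def load_with_fallback(loaders, **kwargs):
    """Try each (name, func) pair in order, skipping loaders that lack a feature."""
    skipped = []
    for _name, func in loaders:
        try:
            return func(**kwargs), skipped
        except UnsupportedLoaderFeatureError as err:
            skipped.append((err.loader_name, err.feature))
    raise LoaderError(f"No loader could handle the request; skipped: {skipped}")


def picky_loader(balancing="NONE"):
    # Hypothetical loader that cannot apply any balancing.
    if balancing != "NONE":
        raise UnsupportedLoaderFeatureError("balancing", "picky")
    return "raw matrix"


def flexible_loader(balancing="NONE"):
    # Hypothetical loader that accepts any balancing method.
    return f"matrix balanced with {balancing}"


result, skipped = load_with_fallback(
    [("picky", picky_loader), ("flexible", flexible_loader)],
    balancing="KR",
)
```

Recording which loaders were skipped, and for which feature, makes the final LoaderError more informative when no candidate succeeds.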

gunz_cm.matrix module

Defines the ContactMatrix data structure.

class gunz_cm.matrix.ContactMatrix(chromosome1: str, resolution: int, loader_func: Callable, loader_kwargs: Dict[str, Any] = <factory>, chromosome2: str | None = None, metadata: Dict[str, Any] = <factory>)

Bases: object

A data container for a contact matrix and its associated metadata.

This class acts as a simple, data-oriented container to group a contact matrix (as a pandas DataFrame or a SciPy sparse matrix) with important metadata like its genomic coordinates and resolution. It supports lazy loading of data via a loader function.

Attributes:
  • chromosome1 (str) – The name of the first chromosome.

  • resolution (int) – The resolution of the contact matrix in base pairs.

  • loader_func (callable) – A function or callable that returns the raw data when called.

  • loader_kwargs (dict) – Keyword arguments to pass to the loader function.

  • chromosome2 (str, optional) – The name of the second chromosome, if different from the first (for inter-chromosomal matrices). Defaults to chromosome1.

  • metadata (dict) – A dictionary to hold any other relevant metadata.

Examples

>>> from gunz_cm.matrix import ContactMatrix
>>> import numpy as np
>>> def dummy_loader(n): return np.eye(n)
>>> cm = ContactMatrix("chr1", 10000, loader_func=dummy_loader, loader_kwargs={"n": 5})
>>> print(cm.data.shape)
(5, 5)

as_coo() → coo_matrix

Returns the contact matrix as a SciPy COO sparse matrix.

Returns:

The contact matrix data in COO format.

Return type:

scipy.sparse.coo_matrix

Examples

>>> cm = load_cm_data("sample.cool", "chr1", 10000)
>>> coo = cm.as_coo()
>>> print(f"Non-zero elements: {coo.nnz}")

as_csc() → csc_matrix

Returns the contact matrix as a SciPy CSC sparse matrix.

Returns:

The contact matrix data in CSC format.

Return type:

scipy.sparse.csc_matrix

as_csr() → csr_matrix

Returns the contact matrix as a SciPy CSR sparse matrix.

Returns:

The contact matrix data in CSR format.

Return type:

scipy.sparse.csr_matrix

as_dataframe() → DataFrame

Returns the contact matrix as a pandas DataFrame.

Returns:

The contact matrix data as a DataFrame with bin IDs and counts.

Return type:

pandas.DataFrame

Examples

>>> df = cm.as_dataframe()
>>> print(df.columns)
Index(['bin1_id', 'bin2_id', 'count'], dtype='object')

chromosome1: str
chromosome2: str | None = None

property data: Any

The raw contact matrix data, loaded lazily.

Returns:

The raw data returned by the loader function (usually a DataFrame or Sparse Matrix).

Return type:

any
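
How the lazy loading is implemented is not shown in this reference. A minimal sketch, assuming the loader is called on first access to data and the result is then cached, might look like this. LazyMatrixSketch is a hypothetical stand-in that copies the documented field names, and the private _data and _loaded fields are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class LazyMatrixSketch:
    """Hypothetical stand-in with the documented ContactMatrix fields."""

    chromosome1: str
    resolution: int
    loader_func: Callable
    loader_kwargs: Dict[str, Any] = field(default_factory=dict)
    chromosome2: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    # Assumed private cache slots; not part of the documented interface.
    _data: Any = field(default=None, init=False, repr=False)
    _loaded: bool = field(default=False, init=False, repr=False)

    @property
    def data(self) -> Any:
        # Call the loader on first access only, then reuse the stored result.
        if not self._loaded:
            self._data = self.loader_func(**self.loader_kwargs)
            self._loaded = True
        return self._data


calls = []


def dummy_loader(n):
    calls.append(n)  # record each invocation of the loader
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


cm = LazyMatrixSketch("chr1", 10000, loader_func=dummy_loader, loader_kwargs={"n": 3})
calls_before = len(calls)  # nothing has been loaded at construction time
first = cm.data            # triggers the loader
second = cm.data           # served from the cache
```

Deferring the call this way means that constructing many matrix objects stays cheap, and the cost of reading a file is paid only for the matrices whose data is actually used.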

loader_func: Callable
loader_kwargs: Dict[str, Any]
metadata: Dict[str, Any]
resolution: int