gunz_cm.pipeline.sprite package

Submodules

gunz_cm.pipeline.sprite.downweighting module


class gunz_cm.pipeline.sprite.downweighting.Downweighting(value)

Bases: Enum

An enumeration of downweighting schemes.

NONE – No downweighting. Each contact has a value of 1.

N_MINUS_ONE – A contact from a cluster of n reads has a value of 1 / (n - 1).

TWO_OVER_N – A contact from a cluster of n reads has a value of 2 / n.

UNKNOWN – A default downweighting scheme for error checking.

Examples
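
The weighting rules listed for each member can be sketched in a few lines. This is a minimal, self-contained illustration and not the package's own code: the `Downweighting` class below is a stand-in that mirrors the documented members, and `contact_weight` is a hypothetical helper introduced only for this example.

```python
from enum import Enum


class Downweighting(Enum):
    """Stand-in mirroring the documented members (not imported from gunz_cm)."""
    NONE = None
    N_MINUS_ONE = "n_minus_one"
    TWO_OVER_N = "two_over_n"
    UNKNOWN = "unknown"


def contact_weight(scheme: Downweighting, n: int) -> float:
    """Hypothetical helper: value of one pairwise contact from a cluster of n reads."""
    if n < 2:
        raise ValueError("a cluster needs at least 2 reads to form a contact")
    if scheme is Downweighting.NONE:
        return 1.0              # every contact counts as 1
    if scheme is Downweighting.N_MINUS_ONE:
        return 1.0 / (n - 1)    # 1 / (n - 1)
    if scheme is Downweighting.TWO_OVER_N:
        return 2.0 / n          # 2 / n
    # UNKNOWN is documented as an error-checking value, so reject it here.
    raise ValueError(f"unsupported scheme: {scheme}")
```

A cluster of n reads yields n(n - 1)/2 pairwise contacts, so under TWO_OVER_N the whole cluster contributes a total weight of n - 1, and under N_MINUS_ONE a total of n / 2.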

NONE = None
N_MINUS_ONE = 'n_minus_one'
TWO_OVER_N = 'two_over_n'
UNKNOWN = 'unknown'
estimate_num_contacts(bins)

Function estimate_num_contacts.


static from_str(label: str | None) → Downweighting

Converts a string label to the corresponding Downweighting enum member.

This method is case-insensitive and returns Downweighting.UNKNOWN if no matching label is found.

Parameters

label : Optional[str]
    The string label to convert.

Returns

Downweighting
    The corresponding Downweighting enum member.

Examples
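
The lookup behaviour described for `from_str` (case-insensitive match, `UNKNOWN` as the fallback) might look roughly like the sketch below. The class is a stand-in that mirrors the documented members rather than the package's implementation, and mapping a `None` label to `Downweighting.NONE` is an assumption based on that member's documented value.

```python
from enum import Enum
from typing import Optional


class Downweighting(Enum):
    """Stand-in mirroring the documented members (not imported from gunz_cm)."""
    NONE = None
    N_MINUS_ONE = "n_minus_one"
    TWO_OVER_N = "two_over_n"
    UNKNOWN = "unknown"

    @staticmethod
    def from_str(label: Optional[str]) -> "Downweighting":
        if label is None:
            # Assumption: a missing label selects the member whose value is None.
            return Downweighting.NONE
        normalized = label.lower()  # case-insensitive comparison
        for member in Downweighting:
            if member.value == normalized:
                return member
        # No matching label: fall back to UNKNOWN, as documented.
        return Downweighting.UNKNOWN
```

Under these assumptions, `Downweighting.from_str("Two_Over_N")` resolves to `Downweighting.TWO_OVER_N`, while an unrecognized label such as `"bogus"` resolves to `Downweighting.UNKNOWN`.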

gunz_cm.pipeline.sprite.sprite_cm module


gunz_cm.pipeline.sprite.sprite_cm.create_sprite_cm(clusters_fpath: str, chrom: str, build: str = 'hg19', resolution: int = 1000000, min_cluster_size: int = 2, max_cluster_size: int = 1000, min_nway_interactions: int | None = None, max_nway_interactions: int | None = None, downweighting: str | None = None, cache_file: bool = False) → coo_matrix

Creates a SPRITE contact matrix from a file containing genomic clusters.

  • Currently only supports intra-chromosomal contacts.

  • The function reads a file containing genomic clusters and constructs a sparse contact matrix.

  • The contact matrix is built based on the specified resolution and downweighting method.

  • Downweighting methods include:
    • NONE: No downweighting. Each contact has a value of 1.

    • N_MINUS_ONE: A contact from a cluster of n reads has a value of 1 / (n - 1).

    • TWO_OVER_N: A contact from a cluster of n reads has a value of 2 / n.

    • UNKNOWN: A default downweighting scheme for error checking.

  • Supported genome builds are:
    • hg19: Human genome build 19 or GRCh37.

    • hg38: Human genome build 38 or GRCh38.

    • mm9: Mouse genome build 9.

    • mm10: Mouse genome build 10.

Parameters

clusters_fpath : str
    The file path to the clusters file.

chrom : str
    The chromosome to process.

build : str, optional
    The genome build (default is “hg19”).

resolution : int, optional
    The resolution of the contact matrix in base pairs (default is 1,000,000).

min_cluster_size : int, optional
    The minimum size of a cluster to be considered (default is 2).

max_cluster_size : int, optional
    The maximum size of a cluster to be considered (default is 1,000).

min_nway_interactions : Optional[int], optional
    The minimum number of n-way interactions to consider (default is None).

max_nway_interactions : Optional[int], optional
    The maximum number of n-way interactions to consider (default is None).

downweighting : Optional[str], optional
    The downweighting method to use (default is None).

cache_file : bool, optional
    Whether to cache the cluster file content in memory (default is False).

Returns

sp.coo_matrix
    A sparse contact matrix in COO format.

Examples
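
To make the binning and downweighting steps concrete, here is a pure-Python sketch of how clusters could be turned into weighted bin-pair contacts. It is an illustration under stated assumptions, not the function's actual implementation: `cluster_contacts` is a hypothetical helper, clusters are assumed to be already parsed into lists of base-pair positions on one chromosome, and whether the real code counts same-bin pairs or stores both triangles of the matrix is not stated in this documentation.

```python
from collections import defaultdict
from itertools import combinations


def cluster_contacts(clusters, resolution=1_000_000, min_cluster_size=2,
                     max_cluster_size=1000, downweighting=None):
    """Hypothetical helper: accumulate binned contacts as {(row, col): value}.

    `clusters` is an iterable of lists of read positions (bp) on one chromosome.
    """
    contacts = defaultdict(float)
    for positions in clusters:
        n = len(positions)
        # Skip clusters outside the size window (and any too small to pair).
        if n < 2 or not (min_cluster_size <= n <= max_cluster_size):
            continue
        if downweighting == "n_minus_one":
            weight = 1.0 / (n - 1)
        elif downweighting == "two_over_n":
            weight = 2.0 / n
        else:
            weight = 1.0  # no downweighting
        # Map each read position to its bin index at the given resolution.
        bins = [pos // resolution for pos in positions]
        for a, b in combinations(bins, 2):
            # Assumption: store only the upper triangle (row <= col).
            contacts[(min(a, b), max(a, b))] += weight
    return dict(contacts)
```

The resulting dictionary maps directly onto COO triplets: its keys give the row and column indices and its values the data, which can be passed to `scipy.sparse.coo_matrix((data, (rows, cols)), shape=(n_bins, n_bins))`.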

Module contents