gunz_cm.pipeline.sprite package
Submodules
gunz_cm.pipeline.sprite.downweighting module
Module.
- class gunz_cm.pipeline.sprite.downweighting.Downweighting(value)
Bases: Enum
An enumeration of downweighting schemes.
- NONE: No downweighting. Each contact has a value of 1.
- N_MINUS_ONE: A contact from a cluster of n reads has a value of 1 / (n - 1).
- TWO_OVER_N: A contact from a cluster of n reads has a value of 2 / n.
- UNKNOWN: A fallback scheme used for error checking; from_str returns it when no matching label is found.
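The formulas above can be illustrated with a minimal sketch. The helper `contact_weight` is hypothetical and not part of gunz_cm; it only mirrors the documented per-contact values and assumes a cluster needs at least two reads to produce a contact.

```python
from typing import Optional


def contact_weight(n_reads: int, scheme: Optional[str] = None) -> float:
    """Value of one pairwise contact from a cluster of n_reads reads.

    Hypothetical helper: `scheme` uses the documented enum values
    (None, 'n_minus_one', 'two_over_n').
    """
    if n_reads < 2:
        # Assumption: a single read cannot form a contact, and 1 / (n - 1)
        # would divide by zero for n = 1.
        raise ValueError("a cluster needs at least 2 reads to form a contact")
    if scheme is None:
        return 1.0                    # NONE: every contact counts as 1
    if scheme == "n_minus_one":
        return 1.0 / (n_reads - 1)    # N_MINUS_ONE
    if scheme == "two_over_n":
        return 2.0 / n_reads          # TWO_OVER_N
    raise ValueError(f"unknown downweighting scheme: {scheme!r}")
```

If every unordered pair of reads in a cluster counts as one contact (an assumption about the implementation), a cluster of n reads yields n(n - 1)/2 contacts, so its total contribution is n/2 under N_MINUS_ONE and n - 1 under TWO_OVER_N, compared with n(n - 1)/2 without downweighting.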
- NONE = None
- N_MINUS_ONE = 'n_minus_one'
- TWO_OVER_N = 'two_over_n'
- UNKNOWN = 'unknown'
- static from_str(label: str | None) → Downweighting
Converts a string label to the corresponding Downweighting enum member.
This method is case-insensitive and returns Downweighting.UNKNOWN if no matching label is found.
Parameters
- label : Optional[str]
The string label to convert.
Returns
- Downweighting
The corresponding Downweighting enum member.
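A case-insensitive lookup with an UNKNOWN fallback might look like the sketch below. `DownweightingSketch` is a hypothetical stand-in, not the library's class; the member values are copied from the listing above, while the handling of a `None` label (mapped to NONE here) and the stripping of surrounding whitespace are assumptions.

```python
from enum import Enum
from typing import Optional


class DownweightingSketch(Enum):
    """Hypothetical stand-in for the documented Downweighting enum."""

    NONE = None
    N_MINUS_ONE = "n_minus_one"
    TWO_OVER_N = "two_over_n"
    UNKNOWN = "unknown"

    @staticmethod
    def from_str(label: Optional[str]) -> "DownweightingSketch":
        # Assumption: a missing label means "no downweighting".
        if label is None:
            return DownweightingSketch.NONE
        normalized = label.strip().lower()  # case-insensitive comparison
        for member in DownweightingSketch:
            if member.value == normalized:
                return member
        # No matching label: fall back to UNKNOWN, as documented.
        return DownweightingSketch.UNKNOWN
```

In this sketch `DownweightingSketch.from_str("Two_Over_N")` resolves to `TWO_OVER_N`, and any unrecognized string resolves to `UNKNOWN`, which a caller can then treat as an error.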
Yeremia G. Adhisantoso (adhisant@tnt.uni-hannover.de)
Qwen2.5 72B - 4.25bpw
gunz_cm.pipeline.sprite.sprite_cm module
Module.
- gunz_cm.pipeline.sprite.sprite_cm.create_sprite_cm(clusters_fpath: str, chrom: str, build: str = 'hg19', resolution: int = 1000000, min_cluster_size: int = 2, max_cluster_size: int = 1000, min_nway_interactions: int | None = None, max_nway_interactions: int | None = None, downweighting: str | None = None, cache_file: bool = False) → coo_matrix
Creates a SPRITE contact matrix from a file containing genomic clusters.
Currently, only intra-chromosomal contacts are supported.
The function reads a file containing genomic clusters and constructs a sparse contact matrix.
The contact matrix is built based on the specified resolution and downweighting method.
- Downweighting methods include:
  - NONE: No downweighting. Each contact has a value of 1.
  - N_MINUS_ONE: A contact from a cluster of n reads has a value of 1 / (n - 1).
  - TWO_OVER_N: A contact from a cluster of n reads has a value of 2 / n.
  - UNKNOWN: A default downweighting scheme for error checking.
- Supported genome builds are:
  - hg19: Human genome build 19 (GRCh37).
  - hg38: Human genome build 38 (GRCh38).
  - mm9: Mouse genome build 9.
  - mm10: Mouse genome build 10.
Parameters
- clusters_fpath : str
The file path to the clusters file.
- chrom : str
The chromosome to process.
- build : str, optional
The genome build (default is "hg19").
- resolution : int, optional
The resolution of the contact matrix in base pairs (default is 1,000,000).
- min_cluster_size : int, optional
The minimum size of a cluster to be considered (default is 2).
- max_cluster_size : int, optional
The maximum size of a cluster to be considered (default is 1,000).
- min_nway_interactions : Optional[int], optional
The minimum number of n-way interactions to consider (default is None).
- max_nway_interactions : Optional[int], optional
The maximum number of n-way interactions to consider (default is None).
- downweighting : Optional[str], optional
The downweighting method to use (default is None).
- cache_file : bool, optional
Whether to cache the cluster file content in memory (default is False).
Returns
- sp.coo_matrix
A sparse contact matrix in COO format.
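To make the binning and downweighting steps concrete, here is a minimal, standard-library sketch of how clusters could be turned into COO-style triplets. It is not the library's implementation. The function `cluster_contacts_to_coo` and its input layout (each cluster as a list of base-pair positions on one chromosome) are hypothetical. Three further points are assumptions: the size limits are inclusive, every unordered pair of reads counts as one contact, and only the upper triangle is stored.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple


def cluster_contacts_to_coo(
    clusters: Sequence[Sequence[int]],
    resolution: int = 1_000_000,
    min_cluster_size: int = 2,
    max_cluster_size: int = 1000,
    downweighting: Optional[str] = None,
) -> Tuple[List[int], List[int], List[float]]:
    """Accumulate pairwise contacts into (row, col, data) triplets."""
    acc: Dict[Tuple[int, int], float] = defaultdict(float)
    for positions in clusters:
        n = len(positions)
        # Assumption: cluster-size bounds are inclusive; n < 2 has no pairs.
        if n < 2 or n < min_cluster_size or n > max_cluster_size:
            continue
        if downweighting is None:
            weight = 1.0
        elif downweighting == "n_minus_one":
            weight = 1.0 / (n - 1)
        elif downweighting == "two_over_n":
            weight = 2.0 / n
        else:
            raise ValueError(f"unknown downweighting: {downweighting!r}")
        # Map each read position to its bin index at the given resolution.
        bins = [pos // resolution for pos in positions]
        for a, b in combinations(bins, 2):
            i, j = (a, b) if a <= b else (b, a)  # keep the upper triangle
            acc[(i, j)] += weight
    keys = sorted(acc)
    rows = [i for i, _ in keys]
    cols = [j for _, j in keys]
    data = [acc[k] for k in keys]
    return rows, cols, data
```

The three returned lists have the layout accepted by `scipy.sparse.coo_matrix((data, (rows, cols)), shape=(n_bins, n_bins))`, where `n_bins` would be derived from the chromosome length of the chosen build and the resolution.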
Yeremia G. Adhisantoso (adhisant@tnt.uni-hannover.de)
Qwen2.5 72B - 4.25bpw