gunz_cm.metrics.ren package
Subpackages
- gunz_cm.metrics.ren.third_parties package
Submodules
gunz_cm.metrics.ren.hic_spector module
Module for computing the HiCSpector reproducibility score.
This module provides functions to calculate the HiCSpector score, a metric based on the spectral properties of the normalized Laplacian matrix of Hi-C contact maps.
- gunz_cm.metrics.ren.hic_spector.comp_hic_spector_coo(coo_cm1: coo_matrix, coo_cm2: coo_matrix, with_loop: bool = True, num_eig_vecs: int = 20, ipr_cutoff: float = 5, op: str = 'union', single_graph_laplacian: bool = False, intersect_valid_eigvecs: bool = False, verbose: bool = False) -> float
Calculate the HiCSpector reproducibility score between two sparse matrices.
- Parameters:
coo_cm1 (sp.coo_matrix) – First contact matrix in COO format.
coo_cm2 (sp.coo_matrix) – Second contact matrix in COO format.
with_loop (bool, optional) – If True, include self-loops (diagonal entries) in the calculation. By default True.
num_eig_vecs (int, optional) – Number of eigenvectors to consider. By default 20.
ipr_cutoff (float, optional) – Cutoff value for the inverse participation ratio (IPR) filter. If None, no filter is applied. By default 5.
op (str, optional) – The operation for filtering common rows/columns ('union' or 'intersection'). By default 'union'.
single_graph_laplacian (bool, optional) – Flag to treat the matrix as a single graph. By default False.
intersect_valid_eigvecs (bool, optional) – If True, intersect valid eigenvector IDs from both matrices based on the IPR cutoff. By default False.
verbose (bool, optional) – Enable verbose logging output. By default False.
- Returns:
The HiCSpector reproducibility score.
- Return type:
float
This function computes the HiCSpector metric using spectral decomposition. It assumes the input matrices are stored in upper-triangular COO format.
Examples
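The package's own call is not reproduced here. As a rough illustration of the spectral comparison described above, the following dense NumPy sketch builds a normalized Laplacian for each matrix, takes the eigenvectors with the smallest eigenvalues, and turns their sign-insensitive distances into a score. The helper names (`normalized_laplacian`, `spectral_score`) are hypothetical, the normalization by sqrt(2) is an assumption about how the score is scaled, and the IPR filter and row/column filtering are omitted.

```python
import numpy as np


def normalized_laplacian(adj: np.ndarray) -> np.ndarray:
    """Return I - D^{-1/2} A D^{-1/2} for a dense symmetric matrix."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    scaled = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.eye(adj.shape[0]) - scaled


def spectral_score(adj1: np.ndarray, adj2: np.ndarray, num_eig_vecs: int = 3) -> float:
    """Hypothetical sketch: 1 - mean eigenvector distance / sqrt(2)."""
    # eigh returns eigenvalues in ascending order; eigenvectors are columns.
    _, vecs1 = np.linalg.eigh(normalized_laplacian(adj1))
    _, vecs2 = np.linalg.eigh(normalized_laplacian(adj2))
    v1 = vecs1[:, :num_eig_vecs].T
    v2 = vecs2[:, :num_eig_vecs].T
    # Eigenvectors are defined up to sign, so compare against both signs.
    dist = np.minimum(
        np.linalg.norm(v1 - v2, axis=1),
        np.linalg.norm(v1 + v2, axis=1),
    )
    return float(1.0 - dist.mean() / np.sqrt(2.0))


a = np.array(
    [[0, 4, 1, 0], [4, 0, 3, 1], [1, 3, 0, 5], [0, 1, 5, 0]], dtype=float
)
b = a.copy()
b[0, 3] = b[3, 0] = 2.0
b[1, 2] = b[2, 1] = 1.0

score_same = spectral_score(a, a)  # identical inputs give zero distance
score_diff = spectral_score(a, b)
```

Because the sign-insensitive distance between two unit vectors never exceeds sqrt(2), this sketch keeps the score between 0 and 1.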
- gunz_cm.metrics.ren.hic_spector.comp_norm_laplacian_mat_coo(cm_coo: coo_matrix, single_graph_laplacian: bool = False, with_loop: bool = True, is_triu_sym: bool = True) -> coo_matrix
Calculate the normalized Laplacian matrix from a COO matrix.
This function assumes the input matrix is symmetric and that only the upper triangular part (including the diagonal) is stored.
- Parameters:
cm_coo (sp.coo_matrix) – The input adjacency matrix in COO format.
single_graph_laplacian (bool, optional) – Whether to treat the input as a single graph, which affects how the adjacency matrix is derived. By default False.
with_loop (bool, optional) – Whether to include the diagonal in the adjacency matrix derivation. By default True.
is_triu_sym (bool, optional) – Whether the input is a symmetric matrix stored in upper-triangular form. By default True.
- Returns:
The normalized Laplacian matrix in COO format.
- Return type:
sp.coo_matrix
Examples
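A minimal sketch of the idea, assuming the textbook symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} and an input that stores only the upper triangle. The function name `norm_laplacian_from_triu` is hypothetical, and the `single_graph_laplacian` and `with_loop` options of the documented function are not modelled.

```python
import numpy as np
import scipy.sparse as sp


def norm_laplacian_from_triu(cm_coo: sp.coo_matrix) -> sp.coo_matrix:
    """Hypothetical sketch: normalized Laplacian from upper-triangular storage."""
    triu = sp.triu(cm_coo, k=0).tocsr()
    # Mirror the upper triangle; subtract the diagonal so it is not counted twice.
    adj = triu + triu.T - sp.diags(triu.diagonal())
    deg = np.asarray(adj.sum(axis=1)).ravel()
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    d = sp.diags(inv_sqrt)
    lap = sp.identity(adj.shape[0]) - d @ adj @ d
    return lap.tocoo()


# Path graph 0 - 1 - 2, stored as its upper triangle only.
cm = sp.coo_matrix(([1.0, 1.0], ([0, 1], [1, 2])), shape=(3, 3))
lap = norm_laplacian_from_triu(cm)
dense = lap.toarray()
```

For this path graph the degrees are 1, 2, 1, so the entry linking nodes 0 and 1 is -1/sqrt(2) and the result is symmetric.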
- gunz_cm.metrics.ren.hic_spector.vec_calc_eig_vec_dist(vecs1: ndarray, vecs2: ndarray) -> ndarray
Calculate the minimum Euclidean distance between corresponding rows of two matrices of eigenvectors.
This function accounts for the directional ambiguity of eigenvectors by computing the distance between vecs1 and vecs2, as well as vecs1 and -vecs2, and returning the minimum of the two for each pair of rows.
- Parameters:
vecs1 (np.ndarray) – The first set of eigenvectors (2D array; each row is an eigenvector).
vecs2 (np.ndarray) – The second set of eigenvectors (2D array; each row is an eigenvector).
- Returns:
The minimum Euclidean distance for each pair of eigenvectors.
- Return type:
np.ndarray
Examples
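The sign handling described above can be sketched in a few lines of NumPy. The function name `min_sign_distance` is hypothetical and stands in for the documented function, whose implementation is not shown here.

```python
import numpy as np


def min_sign_distance(vecs1: np.ndarray, vecs2: np.ndarray) -> np.ndarray:
    """Row-wise min of ||v1 - v2|| and ||v1 + v2|| (eigenvector sign is arbitrary)."""
    d_same = np.linalg.norm(vecs1 - vecs2, axis=1)
    d_flip = np.linalg.norm(vecs1 + vecs2, axis=1)
    return np.minimum(d_same, d_flip)


v = np.array([[1.0, 0.0], [0.0, 1.0]])
w = np.array([[-1.0, 0.0], [1.0, 0.0]])
dist = min_sign_distance(v, w)
# Row 0 differs only by sign, so its distance is 0.
# Row 1 is orthogonal to its partner, so its distance is sqrt(2) either way.
```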
- gunz_cm.metrics.ren.hic_spector.vec_comp_ipr4(eig_vecs: ndarray) -> ndarray
Compute the Inverse Participation Ratio (IPR) to the 4th power in a vectorized manner.
- Parameters:
eig_vecs (np.ndarray) – A 2D array of eigenvectors (each row is an eigenvector).
- Returns:
The IPR4 values for each eigenvector.
- Return type:
np.ndarray
Examples
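As an illustration only, the sketch below uses the conventional definition of the inverse participation ratio: the sum of fourth powers of the entries of a unit-normalized vector. Whether the documented function returns this quantity or a variant of it (for example its reciprocal) is not stated above, so treat the hypothetical `ipr4` helper as an assumption.

```python
import numpy as np


def ipr4(eig_vecs: np.ndarray) -> np.ndarray:
    """Sum of fourth powers of each unit-normalized row (conventional IPR)."""
    v = np.asarray(eig_vecs, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return np.sum(unit ** 4, axis=1)


vecs = np.array(
    [
        [1.0, 1.0, 1.0, 1.0],  # spread evenly over 4 entries
        [1.0, 0.0, 0.0, 0.0],  # concentrated on a single entry
    ]
)
values = ipr4(vecs)
```

Under this definition a vector spread evenly over N entries has IPR 1/N, and a vector concentrated on one entry has IPR 1, so the reciprocal can be read as an effective number of participating entries.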
gunz_cm.metrics.ren.hicrep module
Module for computing the HiCRep reproducibility score for Hi-C contact matrices.
This module implements the HiCRep algorithm, a framework for assessing the reproducibility of Hi-C data. It includes functions for preprocessing, variance stabilization, and calculating the stratum-adjusted correlation coefficient (SCC).
References
[1] T. Yang, F. Zhang, G. G. Yardımcı, F. Song, R. C. Hardison, W. S. Noble, F. Yue, and Q. Li, “HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient,” Genome Research, vol. 27, no. 11, pp. 1939–1949, 2017.
Examples
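To make the stratum-adjusted correlation coefficient (SCC) concrete, here is a simplified dense sketch rather than the module's implementation. It assumes that each stratum k (the k-th off-diagonal) contributes its Pearson correlation, weighted by the stratum size times the rank variance (1 + 1/n) / 12. The names `rank_variance` and `scc_sketch` are hypothetical, and preprocessing steps such as smoothing, downsampling, and common-region filtering are left out.

```python
import numpy as np


def rank_variance(n: int) -> float:
    """Variance of ranks scaled to (0, 1], with Bessel's correction."""
    return (1.0 + 1.0 / n) / 12.0


def scc_sketch(m1: np.ndarray, m2: np.ndarray, max_k: int) -> float:
    """Hypothetical sketch: weighted mean of per-diagonal Pearson correlations."""
    rhos, weights = [], []
    for k in range(1, max_k + 1):
        x = np.diagonal(m1, offset=k)
        y = np.diagonal(m2, offset=k)
        # Skip strata where the correlation is undefined.
        if x.size < 2 or x.std() == 0 or y.std() == 0:
            continue
        rhos.append(np.corrcoef(x, y)[0, 1])
        weights.append(x.size * rank_variance(x.size))
    # Raises ZeroDivisionError if every stratum was skipped.
    return float(np.average(rhos, weights=weights))


v = np.arange(1.0, 7.0)
m = np.outer(v, v)  # symmetric 6 x 6 toy matrix with varying diagonals

scc_same = scc_sketch(m, m, max_k=3)
scc_flipped = scc_sketch(m, -m, max_k=3)
```

Identical inputs give an SCC of 1, and negating one input gives -1, which is a quick sanity check on the weighting.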
- gunz_cm.metrics.ren.hicrep.comp_hicrep_coo(cm1_coo: coo_matrix, cm2_coo: coo_matrix, max_k: int = None, remove_main_diag: bool = True, downsample: bool = False, half_win_size: int | None = None, ena_common_region: bool = True, ena_reshaping: bool = True) -> ndarray
Compute the HiCRep reproducibility score for two sparse contact matrices in COO format.
- gunz_cm.metrics.ren.hicrep.diag_transform_coo(cm_coo: coo_matrix) -> csr_matrix
Convert a sparse COO matrix into a sparse CSR matrix of its diagonals.
Each row in the output CSR matrix represents a diagonal from the upper triangle of the input matrix.
- Parameters:
cm_coo (sp.coo_matrix) – Input sparse matrix in COO format. Must be a square matrix.
- Returns:
A sparse CSR matrix where each row i contains the elements of the (i+1)-th diagonal of the input matrix.
- Return type:
sp.csr_matrix
Examples
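A small sketch of the diagonal layout, under two assumptions that the text above does not settle: output row i holds the diagonal at offset i + 1 (so the main diagonal is excluded), and each element is placed in the column given by its original row index. The function name `diagonals_as_rows` is hypothetical.

```python
import numpy as np
import scipy.sparse as sp


def diagonals_as_rows(cm_coo: sp.coo_matrix) -> sp.csr_matrix:
    """Hypothetical sketch: one output row per upper off-diagonal."""
    n = cm_coo.shape[0]
    offset = cm_coo.col - cm_coo.row
    keep = offset >= 1  # strictly above the main diagonal
    rows = offset[keep] - 1  # assumed: output row i holds offset i + 1
    cols = cm_coo.row[keep]  # position of the element along its diagonal
    return sp.csr_matrix(
        (cm_coo.data[keep], (rows, cols)), shape=(n - 1, n - 1)
    )


cm = sp.coo_matrix(np.array([[1, 2, 3], [0, 4, 5], [0, 0, 6]]))
diags = diagonals_as_rows(cm).toarray()
# Offset 1 holds the values 2 and 5; offset 2 holds the value 3.
```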
- gunz_cm.metrics.ren.hicrep.mean_filter_coo_mat(mat: coo_matrix, half_win_size: int) -> coo_matrix
Apply a mean filter to a sparse COO matrix.
This function convolves the input with a square kernel of constant entries to perform smoothing.
- Parameters:
mat (sp.coo_matrix) – The input matrix to be filtered.
half_win_size (int) – The half-size of the filter window (h). The full kernel will be (2*h + 1) x (2*h + 1).
- Returns:
The filtered (smoothed) matrix.
- Return type:
sp.coo_matrix
- Raises:
ValueError – If half_win_size is not positive or if the input matrix is not square.
TypeError – If the input matrix is not a SciPy COO matrix.
The filter is a square matrix of constant 1s. Edge effects are handled by adjusting the normalization factor based on the number of neighbors for each element.
Examples
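The edge-aware normalization can be illustrated with a dense NumPy sketch. This is not the sparse implementation documented above; the hypothetical `mean_filter_dense` helper sums each window over a zero-padded copy and divides by the number of in-bounds cells in that window.

```python
import numpy as np


def mean_filter_dense(mat: np.ndarray, half_win_size: int) -> np.ndarray:
    """Hypothetical dense sketch of a mean filter with edge-aware normalization."""
    if half_win_size <= 0:
        raise ValueError("half_win_size must be positive")
    h = half_win_size
    n_rows, n_cols = mat.shape
    padded = np.pad(mat.astype(float), h)  # zero padding
    inside = np.pad(np.ones((n_rows, n_cols)), h)  # 1 where a cell is in bounds
    total = np.zeros((n_rows, n_cols))
    count = np.zeros((n_rows, n_cols))
    # Accumulate every shift of the (2h + 1) x (2h + 1) window of ones.
    for di in range(2 * h + 1):
        for dj in range(2 * h + 1):
            total += padded[di:di + n_rows, dj:dj + n_cols]
            count += inside[di:di + n_rows, dj:dj + n_cols]
    return total / count


spike = np.zeros((3, 3))
spike[1, 1] = 9.0
smoothed = mean_filter_dense(spike, half_win_size=1)
flat = mean_filter_dense(np.ones((4, 4)), half_win_size=1)
```

With h = 1, the center cell averages over 9 cells and a corner cell over only 4, so the single value 9 becomes 1.0 at the center and 2.25 at each corner, while a constant matrix is left unchanged.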
- gunz_cm.metrics.ren.hicrep.preprocess_matrices_coo(cm_coo1: coo_matrix, cm_coo2: coo_matrix, max_k: int | None = None, remove_main_diag: bool = True, downsample: bool = False, half_win_size: int | None = None) -> Tuple[coo_matrix, coo_matrix]
Preprocess a pair of sparse contact matrices in COO format for HiCRep scoring.
- gunz_cm.metrics.ren.hicrep.resample_coo_mat(mat: coo_matrix, target_sum: int) -> coo_matrix
Resample a sparse matrix to a new total sum.
This function uses sampling with replacement to adjust the total number of counts in a sparse matrix to a target sum.
- Parameters:
mat (sp.coo_matrix) – The input matrix to be resampled.
target_sum (int) – The desired total sum of the output matrix.
- Returns:
The resampled matrix with the new total sum.
- Return type:
sp.coo_matrix
Examples
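One way to read "sampling with replacement" is a multinomial draw in which each stored entry is picked with probability proportional to its count. The sketch below follows that reading; the `resample_counts` name and the `seed` parameter are hypothetical, and the documented function may sample differently.

```python
import numpy as np
import scipy.sparse as sp


def resample_counts(mat: sp.coo_matrix, target_sum: int, seed: int = 0) -> sp.coo_matrix:
    """Hypothetical sketch: draw target_sum contacts with replacement."""
    rng = np.random.default_rng(seed)
    probs = mat.data / mat.data.sum()
    new_data = rng.multinomial(target_sum, probs)
    keep = new_data > 0  # drop entries that received no counts
    return sp.coo_matrix(
        (new_data[keep], (mat.row[keep], mat.col[keep])), shape=mat.shape
    )


counts = sp.coo_matrix(([10, 20, 30], ([0, 0, 1], [1, 2, 2])), shape=(3, 3))
resampled = resample_counts(counts, target_sum=12)
total = int(resampled.sum())
```

The draw always distributes exactly `target_sum` counts, so the output total matches the target regardless of the seed.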
- gunz_cm.metrics.ren.hicrep.variance_stabilizing_transform_variance(sample_size: int | ndarray) -> int | ndarray
Calculate the variance for variance-stabilizing transformation.
- Parameters:
sample_size (int | np.ndarray) – The size of the input data.
- Returns:
The variance of the ranked input data, with Bessel’s correction.
- Return type:
int | np.ndarray
The variance-stabilizing transform turns input data into ranks. The variance of these ranks is a function of only the sample size n: var = (1 + 1/n) / 12 (with Bessel’s correction). See [1] for more details.
>>> variance_stabilizing_transform_variance(5)
0.1
Examples
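The closed form can be checked numerically. This sketch assumes the ranks are scaled to 1/n, 2/n, ..., 1 with no ties, and compares their sample variance (Bessel-corrected) with the formula. The helper name `rank_variance` is hypothetical.

```python
import statistics


def rank_variance(n: int) -> float:
    """Closed-form variance of ranks 1/n, 2/n, ..., 1 with Bessel's correction."""
    return (1 + 1 / n) / 12


n = 5
ranks = [i / n for i in range(1, n + 1)]
empirical = statistics.variance(ranks)  # sample variance, divides by n - 1
closed_form = rank_variance(n)
```

For n = 5 the squared deviations from the mean 0.6 sum to 0.4, and 0.4 / 4 agrees with (1 + 1/5) / 12 up to floating-point rounding.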
gunz_cm.metrics.ren.stat_test module
Module for statistical test helpers.
This module provides convenient wrappers for common statistical tests, pre-configured for specific use cases.
Examples
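The signatures below match `scipy.stats.wilcoxon` with a different default for `alternative`, which suggests thin pre-configured wrappers. A minimal sketch of that pattern, assuming `functools.partial` is an acceptable stand-in for however the package actually defines them (the names `one_sided_gt` and `one_sided_lt` and the sample data are illustrative):

```python
from functools import partial

from scipy.stats import wilcoxon

# Hypothetical equivalents of the wrappers documented in this module.
one_sided_gt = partial(wilcoxon, alternative="greater")
one_sided_lt = partial(wilcoxon, alternative="less")

x = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.3, 12.2, 11.8, 12.9]
shifts = [0.5, 0.3, 0.8, 0.6, 0.4, 0.9, 0.7, 0.2, 0.1, 1.0]
y = [value - shift for value, shift in zip(x, shifts)]  # every y is below its x

res_gt = one_sided_gt(x, y)  # tests whether x tends to exceed y
res_lt = one_sided_lt(x, y)  # tests whether x tends to fall below y
```

Since every paired difference is positive, the "greater" test yields a small p-value and the "less" test a large one.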
- gunz_cm.metrics.ren.stat_test.one_sided_gt_wilcoxon(x, y=None, zero_method='wilcox', correction=False, *, alternative='greater', method='auto', axis=0, nan_policy='propagate', keepdims=False)
One-sided Wilcoxon signed-rank test for the alternative hypothesis that the distribution of x is stochastically greater than that of y.
- Parameters:
x (array_like) – The first set of measurements.
y (array_like) – The second set of measurements.
- Returns:
The result object from scipy.stats.wilcoxon.
- Return type:
WilcoxonResult
- gunz_cm.metrics.ren.stat_test.one_sided_lt_wilcoxon(x, y=None, zero_method='wilcox', correction=False, *, alternative='less', method='auto', axis=0, nan_policy='propagate', keepdims=False)
One-sided Wilcoxon signed-rank test for the alternative hypothesis that the distribution of x is stochastically less than that of y.
- Parameters:
x (array_like) – The first set of measurements.
y (array_like) – The second set of measurements.
- Returns:
The result object from scipy.stats.wilcoxon.
- Return type:
WilcoxonResult