gunz_cm.metrics.ren package

Submodules

gunz_cm.metrics.ren.hic_spector module

Module for computing the HiCSpector reproducibility score.

This module provides functions to calculate the HiCSpector score, a metric based on the spectral properties of the normalized Laplacian matrix of Hi-C contact maps.

gunz_cm.metrics.ren.hic_spector.comp_hic_spector_coo(coo_cm1: coo_matrix, coo_cm2: coo_matrix, with_loop: bool = True, num_eig_vecs: int = 20, ipr_cutoff: float = 5, op: str = 'union', single_graph_laplacian: bool = False, intersect_valid_eigvecs: bool = False, verbose: bool = False) → float

Calculate the HiCSpector reproducibility score between two sparse matrices.

Parameters:
  • coo_cm1 (sp.coo_matrix) – First contact matrix in COO format.

  • coo_cm2 (sp.coo_matrix) – Second contact matrix in COO format.

  • with_loop (bool, optional) – If True, include loops in the calculation. By default True.

  • num_eig_vecs (int, optional) – Number of eigenvectors to consider. By default 20.

  • ipr_cutoff (float, optional) – Cutoff value for the IPR filter. If None, no filter is applied. By default 5.

  • op (str, optional) – The operation for filtering common rows/columns ('union' or 'intersection'). By default 'union'.

  • single_graph_laplacian (bool, optional) – Flag to treat the matrix as a single graph. By default False.

  • intersect_valid_eigvecs (bool, optional) – If True, intersect valid eigenvector IDs from both matrices based on the IPR cutoff. By default False.

  • verbose (bool, optional) – Enable verbose logging output. By default False.

Returns:

The HiCSpector reproducibility score.

Return type:

float

This function computes the HiCSpector metric using spectral decomposition. It assumes that both input matrices are symmetric and stored in COO format with only the upper triangle populated.

gunz_cm.metrics.ren.hic_spector.comp_norm_laplacian_mat_coo(cm_coo: coo_matrix, single_graph_laplacian: bool = False, with_loop: bool = True, is_triu_sym: bool = True) → coo_matrix

Calculate the normalized Laplacian matrix from a COO matrix.

This function assumes the input matrix is symmetric and that only the upper triangular part (including the diagonal) is stored.

Parameters:
  • cm_coo (sp.coo_matrix) – The input adjacency matrix in COO format.

  • single_graph_laplacian (bool, optional) – Whether to treat the input as a single graph, which affects how the adjacency matrix is derived. By default False.

  • with_loop (bool, optional) – Whether to include the diagonal in the adjacency matrix derivation. By default True.

  • is_triu_sym (bool, optional) – Whether the input is a symmetric matrix stored in upper-triangular form. By default True.

Returns:

The normalized Laplacian matrix in COO format.

Return type:

sp.coo_matrix

Examples
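The normalized Laplacian is conventionally defined as L = I - D^(-1/2) A D^(-1/2), where A is the symmetric adjacency matrix and D its diagonal degree matrix. The sketch below applies that textbook definition to an upper-triangular COO input. The helper name `norm_laplacian_from_triu` is hypothetical, and the sketch does not attempt to reproduce the `single_graph_laplacian` or `with_loop` options, whose exact behaviour is not described here.

```python
import numpy as np
import scipy.sparse as sp


def norm_laplacian_from_triu(triu_coo: sp.coo_matrix) -> sp.coo_matrix:
    """Hypothetical helper: L = I - D^(-1/2) A D^(-1/2) from upper-triangular input."""
    triu = sp.csr_matrix(triu_coo, dtype=float)
    # Rebuild the full symmetric adjacency; subtract the diagonal so it is counted once.
    adj = triu + triu.T - sp.diags(triu.diagonal())
    deg = np.asarray(adj.sum(axis=1)).ravel()
    # Isolated nodes (degree 0) get a zero scaling factor instead of a division by zero.
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    scale = sp.diags(inv_sqrt)
    lap = sp.identity(adj.shape[0]) - scale @ adj @ scale
    return lap.tocoo()


# Path graph 0 - 1 - 2, stored as its upper triangle only.
path = sp.coo_matrix(
    (np.array([1.0, 1.0]), (np.array([0, 1]), np.array([1, 2]))), shape=(3, 3)
)
laplacian = norm_laplacian_from_triu(path)
```

For a graph without isolated nodes and without self-loops, the result has ones on the diagonal and eigenvalues in the interval [0, 2].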

gunz_cm.metrics.ren.hic_spector.vec_calc_eig_vec_dist(vecs1: ndarray, vecs2: ndarray) → ndarray

Calculate the minimum Euclidean distance between corresponding rows of two matrices of eigenvectors.

This function accounts for the directional ambiguity of eigenvectors by computing the distance between vecs1 and vecs2, as well as vecs1 and -vecs2, and returning the minimum of the two for each pair of rows.

Parameters:
  • vecs1 (np.ndarray) – The first set of eigenvectors (2D array, rows are eigenvectors).

  • vecs2 (np.ndarray) – The second set of eigenvectors (2D array, rows are eigenvectors).

Returns:

The minimum Euclidean distance for each pair of eigenvectors.

Return type:

np.ndarray

Examples
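A minimal sketch of the sign-invariant distance described above, assuming both inputs have the same shape with one eigenvector per row. The function name `min_sign_invariant_dist` is hypothetical and is not part of the package.

```python
import numpy as np


def min_sign_invariant_dist(vecs1: np.ndarray, vecs2: np.ndarray) -> np.ndarray:
    """Hypothetical re-implementation: row-wise min(||a - b||, ||a + b||)."""
    dist_same = np.linalg.norm(vecs1 - vecs2, axis=1)  # distance to vecs2
    dist_flip = np.linalg.norm(vecs1 + vecs2, axis=1)  # distance to -vecs2
    return np.minimum(dist_same, dist_flip)


v1 = np.array([[1.0, 0.0], [0.0, 1.0]])
v2 = np.array([[-1.0, 0.0], [1.0, 0.0]])
# Row 0 differs only by sign, so its distance is 0; row 1 is orthogonal, giving sqrt(2).
dists = min_sign_invariant_dist(v1, v2)
```

Taking the minimum over both signs matters because an eigensolver may return either v or -v for the same eigenvalue.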

gunz_cm.metrics.ren.hic_spector.vec_comp_ipr4(eig_vecs: ndarray) → ndarray

Compute the fourth-order Inverse Participation Ratio (IPR4) of each eigenvector in a vectorized manner.

Parameters:
  • eig_vecs (np.ndarray) – A 2D array of eigenvectors (each row is an eigenvector).

Returns:

The IPR4 values for each eigenvector.

Return type:

np.ndarray

Examples
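One common definition of the fourth-order inverse participation ratio of a vector v is sum(v_i^4) / (sum(v_i^2))^2, which reduces to sum(v_i^4) for unit-norm eigenvectors. The sketch below uses that definition as an assumption. This page does not state whether `vec_comp_ipr4` returns this quantity or its reciprocal, so treat the helper `ipr4_rows` as a hypothetical illustration only.

```python
import numpy as np


def ipr4_rows(eig_vecs: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: sum(v**4) / sum(v**2)**2 for every row."""
    v = np.asarray(eig_vecs, dtype=float)
    return np.sum(v**4, axis=1) / np.sum(v**2, axis=1) ** 2


vecs = np.array(
    [
        [0.5, 0.5, 0.5, 0.5],  # spread evenly over 4 entries
        [1.0, 0.0, 0.0, 0.0],  # concentrated on a single entry
    ]
)
ipr = ipr4_rows(vecs)
```

Under this definition an evenly spread vector of length n scores 1/n and a vector concentrated on one entry scores 1, which is why the quantity is used to detect localized eigenvectors.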

gunz_cm.metrics.ren.hicrep module

Module for computing the HiCRep reproducibility score for Hi-C contact matrices.

This module implements the HiCRep algorithm, a framework for assessing the reproducibility of Hi-C data. It includes functions for preprocessing, variance stabilization, and calculating the stratum-adjusted correlation coefficient (SCC).

gunz_cm.metrics.ren.hicrep.comp_hicrep_coo(cm1_coo: coo_matrix, cm2_coo: coo_matrix, max_k: int = None, remove_main_diag: bool = True, downsample: bool = False, half_win_size: int | None = None, ena_common_region: bool = True, ena_reshaping: bool = True) → ndarray

Compute the HiCRep reproducibility score between two sparse contact matrices in COO format.

gunz_cm.metrics.ren.hicrep.diag_transform_coo(cm_coo: coo_matrix) → csr_matrix

Convert a sparse COO matrix into a sparse CSR matrix of its diagonals.

Each row in the output CSR matrix represents a diagonal from the upper triangle of the input matrix.

Parameters:
  • cm_coo (sp.coo_matrix) – Input sparse matrix in COO format. Must be a square matrix.

Returns:

A sparse CSR matrix where each row i contains the elements of the (i+1)-th diagonal of the input matrix.

Return type:

sp.csr_matrix

Examples
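The diagonal layout can be illustrated with a short sketch. It assumes that output row k holds the diagonal at offset k from the main diagonal (so row 0 is the main diagonal) and that each element keeps the row index of its original position as its column index. Both conventions, and the helper name `diagonals_as_rows`, are assumptions made for illustration and may differ from the package's behaviour.

```python
import numpy as np
import scipy.sparse as sp


def diagonals_as_rows(cm_coo: sp.coo_matrix) -> sp.csr_matrix:
    """Hypothetical sketch: out[k, i] = cm[i, i + k] for k >= 0."""
    n = cm_coo.shape[0]
    upper = cm_coo.col >= cm_coo.row            # keep the upper triangle only
    offsets = (cm_coo.col - cm_coo.row)[upper]  # diagonal index k
    starts = cm_coo.row[upper]                  # position along that diagonal
    return sp.csr_matrix((cm_coo.data[upper], (offsets, starts)), shape=(n, n))


dense = np.array([[1, 2, 3], [0, 4, 5], [0, 0, 6]])
diags = diagonals_as_rows(sp.coo_matrix(dense))
# diags.toarray() has rows [1, 4, 6], [2, 5, 0] and [3, 0, 0].
```

Storing diagonals as CSR rows makes each genomic-distance stratum a contiguous slice, which is convenient for per-stratum statistics.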

gunz_cm.metrics.ren.hicrep.mean_filter_coo_mat(mat: coo_matrix, half_win_size: int) → coo_matrix

Apply a mean filter to a sparse COO matrix.

This function convolves the input with a square kernel of constant entries to perform smoothing.

Parameters:
  • mat (sp.coo_matrix) – The input matrix to be filtered.

  • half_win_size (int) – The half-size of the filter window (h). The full kernel will be (2*h + 1) x (2*h + 1).

Returns:

The filtered (smoothed) matrix.

Return type:

sp.coo_matrix

Raises:
  • ValueError – If half_win_size is not positive or if the input matrix is not square.

  • TypeError – If the input matrix is not a SciPy COO matrix.

The filter is a square matrix of constant 1s. Edge effects are handled by adjusting the normalization factor based on the number of neighbors for each element.

Examples
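The edge-aware normalization described above can be sketched on a dense array: convolve the data with a kernel of ones, convolve an all-ones mask with the same kernel to count the in-bounds neighbors, and divide. This is a dense illustration using `scipy.signal.convolve2d`, not the package's sparse implementation, and `mean_filter_dense` is a hypothetical name.

```python
import numpy as np
from scipy.signal import convolve2d


def mean_filter_dense(mat: np.ndarray, half_win_size: int) -> np.ndarray:
    """Hypothetical dense sketch of a (2h + 1) x (2h + 1) mean filter."""
    if half_win_size <= 0:
        raise ValueError("half_win_size must be positive")
    mat = np.asarray(mat, dtype=float)
    size = 2 * half_win_size + 1
    kernel = np.ones((size, size))
    # Sum of the values inside each window (cells outside the matrix count as 0).
    sums = convolve2d(mat, kernel, mode="same")
    # Number of in-bounds cells inside each window: 9 in the interior, fewer at edges.
    counts = convolve2d(np.ones_like(mat), kernel, mode="same")
    return sums / counts


spike = np.zeros((3, 3))
spike[1, 1] = 9.0
smoothed = mean_filter_dense(spike, half_win_size=1)
```

Dividing by the per-cell neighbor count rather than a fixed (2h + 1)^2 keeps a constant matrix constant after filtering, including at the borders.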

gunz_cm.metrics.ren.hicrep.preprocess_matrices_coo(cm_coo1: coo_matrix, cm_coo2: coo_matrix, max_k: int | None = None, remove_main_diag: bool = True, downsample: bool = False, half_win_size: int | None = None) → Tuple[coo_matrix, coo_matrix]

Preprocess a pair of sparse COO contact matrices before the HiCRep comparison and return the processed pair.

gunz_cm.metrics.ren.hicrep.resample_coo_mat(mat: coo_matrix, target_sum: int) → coo_matrix

Resample a sparse matrix to a new total sum.

This function uses sampling with replacement to adjust the total number of counts in a sparse matrix to a target sum.

Parameters:
  • mat (sp.coo_matrix) – The input matrix to be resampled.

  • target_sum (int) – The desired total sum of the output matrix.

Returns:

The resampled matrix with the new total sum.

Return type:

sp.coo_matrix

Examples
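Sampling with replacement to a fixed total can be sketched as a single multinomial draw in which each stored entry is selected with probability proportional to its count. Whether the package uses exactly this scheme is an assumption; the helper `resample_multinomial` and its `seed` parameter are hypothetical, and the sketch assumes the input has no duplicate coordinates.

```python
import numpy as np
import scipy.sparse as sp


def resample_multinomial(mat: sp.coo_matrix, target_sum: int, seed=None) -> sp.coo_matrix:
    """Hypothetical sketch: draw target_sum contacts with replacement."""
    rng = np.random.default_rng(seed)
    probs = mat.data / mat.data.sum()            # selection probability per entry
    counts = rng.multinomial(target_sum, probs)  # counts always add up to target_sum
    keep = counts > 0                            # drop entries that were never drawn
    return sp.coo_matrix(
        (counts[keep], (mat.row[keep], mat.col[keep])), shape=mat.shape
    )


source = sp.coo_matrix(
    (np.array([10, 30, 60]), (np.array([0, 0, 1]), np.array([0, 1, 1]))), shape=(2, 2)
)
resampled = resample_multinomial(source, target_sum=50, seed=0)
```

The output keeps the shape and sparsity support of the input, while individual entries vary from draw to draw.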

gunz_cm.metrics.ren.hicrep.variance_stabilizing_transform_variance(sample_size: int | ndarray) → int | ndarray

Calculate the variance of the ranks produced by the variance-stabilizing transformation.

Parameters:
  • sample_size (int | np.ndarray) – The size of the input data.

Returns:

The variance of the ranked input data, with Bessel’s correction.

Return type:

int | np.ndarray

The variance-stabilizing transform turns input data into ranks. The variance of these ranks is a function of only the sample size n: var = (1 + 1/n) / 12 (with Bessel’s correction). See [1] for more details.

>>> variance_stabilizing_transform_variance(5)
0.1

Examples
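The closed form var = (1 + 1/n) / 12 can be checked numerically. The sketch assumes the ranks are scaled to 1/n, 2/n, ..., 1, which is the scaling consistent with the stated formula; `rank_variance` is a hypothetical stand-in written as the algebraically equivalent (n + 1) / (12 n), not the package function.

```python
import numpy as np


def rank_variance(n: int) -> float:
    """Hypothetical stand-in: (n + 1) / (12 * n), equal to (1 + 1/n) / 12."""
    return (n + 1) / (12 * n)


n = 5
ranks = np.arange(1, n + 1) / n          # scaled ranks 0.2, 0.4, ..., 1.0
empirical = np.var(ranks, ddof=1)        # ddof=1 applies Bessel's correction
matches = bool(np.isclose(empirical, rank_variance(n)))
```

Because the result depends only on n, the variance can be computed once per stratum from its size, without touching the data again. Comparing with `np.isclose` rather than `==` avoids spurious mismatches from floating-point rounding.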

gunz_cm.metrics.ren.stat_test module

Module for statistical test helpers.

This module provides convenient wrappers for common statistical tests, pre-configured for specific use cases.

gunz_cm.metrics.ren.stat_test.one_sided_gt_wilcoxon(x, y=None, zero_method='wilcox', correction=False, *, alternative='greater', method='auto', axis=0, nan_policy='propagate', keepdims=False)

One-sided Wilcoxon signed-rank test for the alternative hypothesis that the distribution of x is stochastically greater than that of y.

Parameters:
  • x (array_like) – The first set of measurements.

  • y (array_like) – The second set of measurements.

Returns:

The result object from scipy.stats.wilcoxon.

Return type:

WilcoxonResult

gunz_cm.metrics.ren.stat_test.one_sided_lt_wilcoxon(x, y=None, zero_method='wilcox', correction=False, *, alternative='less', method='auto', axis=0, nan_policy='propagate', keepdims=False)

One-sided Wilcoxon signed-rank test for the alternative hypothesis that the distribution of x is stochastically less than that of y.

Parameters:
  • x (array_like) – The first set of measurements.

  • y (array_like) – The second set of measurements.

Returns:

The result object from scipy.stats.wilcoxon.

Return type:

WilcoxonResult
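The signatures of both helpers match `scipy.stats.wilcoxon` with only the `alternative` default changed, which suggests they behave like `functools.partial` wrappers. The sketch below builds equivalent wrappers under that assumption; the names `one_sided_gt` and `one_sided_lt` and the sample data are illustrative only.

```python
from functools import partial

import numpy as np
from scipy.stats import wilcoxon

# Assumed equivalents of the two helpers: same test, different default alternative.
one_sided_gt = partial(wilcoxon, alternative="greater")
one_sided_lt = partial(wilcoxon, alternative="less")

# Paired measurements where x exceeds y in every pair.
x = np.array([12, 15, 19, 24, 30, 37])
y = np.array([11, 13, 16, 20, 25, 31])

res_gt = one_sided_gt(x, y)  # small p-value: evidence that x tends to exceed y
res_lt = one_sided_lt(x, y)  # large p-value: no evidence that x tends to be below y
```

Fixing the alternative up front avoids passing the wrong direction by accident when many paired comparisons are run in a loop.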
