gunz_cm.compressions
Submodules
Compression codecs for GZCM v3 contact matrix tiles.
Benchmarks (GM12878 chr1 @ 50kb, tile_size=512, window=1Mb):
(*) cmc_zstd offers the best balance of compression ratio and convert speed.
bsc_cmc: CMC transforms (binarization, diagonal transform) + BSC entropy coding. Same compression as CMC with faster access. Best overall codec for storage-constrained use.
Examples
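>>> import numpy as np
>>> from gunz_cm.compressions import CmcZstdEncoder, CmcZstdDecoder
>>> tile_data = np.triu(np.ones((512, 512), dtype=np.uint32))  # illustrative upper-triangular tile
>>> encoder = CmcZstdEncoder(tile_size=512)
>>> encoded = encoder.encode_tile(tile_data)
>>> decoder = CmcZstdDecoder(tile_size=512)
>>> decoded = decoder.decode_tile(encoded)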
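Since every codec in this module exposes the same encode_tile / decode_tile pair, a small round-trip check can be written once and reused. The sketch below is not part of gunz_cm: roundtrip_ok and PickleCodec are hypothetical names, and PickleCodec is only a stand-in so the snippet runs without the package installed.

```python
import pickle
from typing import Any, Callable


def roundtrip_ok(
    encoder: Any,
    decoder: Any,
    tile: Any,
    equal: Callable[[Any, Any], bool] = lambda a, b: a == b,
) -> bool:
    """Encode a tile, decode it again, and report whether it survived.

    `encoder` needs an encode_tile(mat) method returning bytes and
    `decoder` a decode_tile(data) method, as in the example above.
    """
    encoded = encoder.encode_tile(tile)
    if not isinstance(encoded, (bytes, bytearray)):
        raise TypeError("encode_tile is expected to return a bytes-like object")
    return bool(equal(tile, decoder.decode_tile(encoded)))


class PickleCodec:
    """Hypothetical stand-in codec (NOT a gunz_cm class), used only for the demo."""

    def encode_tile(self, mat: Any) -> bytes:
        return pickle.dumps(mat)

    def decode_tile(self, data: bytes) -> Any:
        return pickle.loads(data)


codec = PickleCodec()
print(roundtrip_ok(codec, codec, [[1, 2], [0, 3]]))
```

For NumPy tiles, pass equal=np.array_equal, because == on arrays compares element by element rather than returning a single bool.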
- class gunz_cm.compressions.BscCmcDecoder(tile_size: int = 512, resolution: int = 50000, dtype: numpy.dtype = numpy.uint32, diag_mode: int = 0)
Bases: object
BSC + CMC Transforms decoder for contact matrix tiles.
Decodes BSC-compressed data that was encoded with CMC transforms. Reverses BSC entropy coding then CMC’s domain-specific transforms.
- Parameters:
Examples
- class gunz_cm.compressions.BscCmcEncoder(tile_size: int = 512, resolution: int = 50000, level: int = 3, diag_mode: int = 0)
Bases: object
BSC + CMC Transforms encoder for contact matrix tiles.
Applies CMC’s domain-specific transforms (diagonal transform, binarization) before BSC entropy coding. Combines BSC’s speed with CMC’s structured transforms.
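To make the idea of a diagonal transform concrete, here is a minimal sketch under stated assumptions: it reorders an upper-triangular tile so that values at the same distance from the main diagonal sit next to each other, which tends to group similar counts because contact frequency falls off with genomic distance. The function names are hypothetical, and the actual layout used by gunz_cm (including what diag_mode selects) is not documented here.

```python
from typing import List

Matrix = List[List[int]]


def to_diagonals(mat: Matrix) -> List[List[int]]:
    """Read a square upper-triangular matrix diagonal by diagonal.

    Diagonal k holds the entries mat[i][i + k]; k = 0 is the main diagonal.
    """
    n = len(mat)
    return [[mat[i][i + k] for i in range(n - k)] for k in range(n)]


def from_diagonals(diags: List[List[int]]) -> Matrix:
    """Inverse of to_diagonals: rebuild the matrix, zero below the diagonal."""
    n = len(diags)
    mat = [[0] * n for _ in range(n)]
    for k, diag in enumerate(diags):
        for i, value in enumerate(diag):
            mat[i][i + k] = value
    return mat


tile = [
    [9, 4, 1],
    [0, 8, 3],
    [0, 0, 7],
]
diags = to_diagonals(tile)
print(diags)  # [[9, 8, 7], [4, 3], [1]]
print(from_diagonals(diags) == tile)  # True
```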
- Parameters:
Examples
- encode_tile(mat: ndarray) -> bytes
Encode a single contact matrix tile.
- Parameters:
mat (np.ndarray) – 2D contact matrix tile (upper triangular).
- Returns:
Compressed bitstream (shape info + encoded data).
- Return type:
bytes
Examples
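The return value is described as shape info followed by encoded data. One way such a layout could look is sketched below; the two little-endian uint32 header fields and the names pack_tile / unpack_tile are assumptions for illustration, not the documented GZCM v3 byte format.

```python
import struct
from typing import Tuple

# Hypothetical header: rows and cols as little-endian unsigned 32-bit ints.
HEADER = struct.Struct("<II")


def pack_tile(shape: Tuple[int, int], payload: bytes) -> bytes:
    """Prefix an already-compressed payload with the tile's (rows, cols)."""
    rows, cols = shape
    return HEADER.pack(rows, cols) + payload


def unpack_tile(blob: bytes) -> Tuple[Tuple[int, int], bytes]:
    """Split a blob back into its (rows, cols) header and the payload."""
    rows, cols = HEADER.unpack_from(blob)
    return (rows, cols), blob[HEADER.size:]


blob = pack_tile((512, 300), b"compressed-bytes")
shape, payload = unpack_tile(blob)
print(shape, payload)  # (512, 300) b'compressed-bytes'
```

Carrying the shape in the bitstream lets a decoder rebuild a tile whose dimensions differ from tile_size without outside metadata.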
- class gunz_cm.compressions.BscDecoder(tile_size: int = 512, resolution: int = 50000, dtype: numpy.dtype = numpy.uint32)
Bases: object
BSC decoder for contact matrix tiles.
Runs the bsc command-line tool in a subprocess for true BSC (Block Sorting Compression) decompression.
- Parameters:
Examples
- class gunz_cm.compressions.BscEncoder(tile_size: int = 512, resolution: int = 50000, level: int = 3)
Bases: object
BSC encoder for contact matrix tiles.
Runs the bsc command-line tool in a subprocess for true BSC (Block Sorting Compression) compression.
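Driving a file-based command-line compressor from Python usually means writing the input to a temporary file, running the tool, and reading the output file back. The sketch below shows that pattern in general terms; run_file_codec is a hypothetical helper, and how gunz_cm actually invokes bsc is not stated here. With libbsc's tool the command prefix would presumably be ["bsc", "e"] to compress and ["bsc", "d"] to decompress, which is an assumption about your installation. The demo uses a copy command built from the Python interpreter so it runs without bsc.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def run_file_codec(command: Sequence[str], data: bytes) -> bytes:
    """Run `command <input-file> <output-file>` and return the output bytes.

    Raises subprocess.CalledProcessError if the tool exits with a non-zero code.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.bin"
        dst = Path(tmp) / "out.bin"
        src.write_bytes(data)
        subprocess.run(
            [*command, str(src), str(dst)],
            check=True,
            capture_output=True,
        )
        return dst.read_bytes()


# Stand-in "tool" for the demo: copies the input file to the output file.
COPY_TOOL = [
    sys.executable,
    "-c",
    "import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])",
]

print(run_file_codec(COPY_TOOL, b"tile bytes"))  # b'tile bytes'
```

Spawning a process per tile adds overhead, so a subprocess-based codec is likely to be slower per call than an in-process library binding.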
- Parameters:
Examples
- encode_tile(mat: ndarray) -> bytes
Encode a single contact matrix tile.
- Parameters:
mat (np.ndarray) – 2D contact matrix tile.
- Returns:
BSC-compressed bitstream.
- Return type:
bytes
Examples
- class gunz_cm.compressions.CmcDecoder(tile_size: int = 256, resolution: int = 50000, diag_transform: bool = True)
Bases: object
CMC decoder for contact matrix tiles.
- Parameters:
Examples
- class gunz_cm.compressions.CmcEncoder(tile_size: int = 256, resolution: int = 50000, diag_transform: bool = True)
Bases: object
CMC encoder for contact matrix tiles.
- Parameters:
Examples
- encode_tile(mat: ndarray) -> bytes
Encode a single contact matrix tile.
- Parameters:
mat (np.ndarray) – 2D contact matrix tile (upper triangular).
- Returns:
CMC-encoded bitstream.
- Return type:
bytes
Examples
- class gunz_cm.compressions.CmcZstdDecoder(tile_size: int = 256, resolution: int = 50000, dtype: numpy.dtype = numpy.uint32)
Bases: object
CMC Transforms + Zstd decoder for contact matrix tiles.
Uses Zstd decompression then reverses CMC’s domain-specific transforms.
- Parameters:
Examples
- class gunz_cm.compressions.CmcZstdEncoder(tile_size: int = 256, resolution: int = 50000, level: int = 3)
Bases: object
CMC Transforms + Zstd encoder for contact matrix tiles.
Uses CMC’s domain-specific transforms (diagonal transform, binarization) with Zstd entropy coding for better compression and faster decode.
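Binarization is not defined on this page. One common reading, assumed here, is bit-plane decomposition: each integer count is split into one binary layer per bit, so the sparse high-order layers can be coded cheaply. The sketch below illustrates that reading only; the function names are hypothetical and gunz_cm may binarize differently.

```python
from typing import List


def to_bit_planes(values: List[int], n_bits: int) -> List[List[int]]:
    """Split non-negative integers into n_bits binary planes, LSB first."""
    if any(v < 0 or v >= (1 << n_bits) for v in values):
        raise ValueError("every value must fit in n_bits unsigned bits")
    return [[(v >> b) & 1 for v in values] for b in range(n_bits)]


def from_bit_planes(planes: List[List[int]]) -> List[int]:
    """Inverse of to_bit_planes: recombine the planes into integers."""
    if not planes:
        return []
    values = [0] * len(planes[0])
    for b, plane in enumerate(planes):
        for i, bit in enumerate(plane):
            values[i] |= bit << b
    return values


counts = [0, 1, 5, 2]
planes = to_bit_planes(counts, n_bits=3)
print(planes)  # [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(from_bit_planes(planes) == counts)  # True
```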
- Parameters:
Examples
- encode_tile(mat: ndarray) -> bytes
Encode a single contact matrix tile.
- Parameters:
mat (np.ndarray) – 2D contact matrix tile (upper triangular).
- Returns:
Compressed bitstream (shape info + encoded data).
- Return type:
bytes
Examples
- class gunz_cm.compressions.ZstdDecoder(tile_size: int = 256, resolution: int = 50000, dtype: numpy.dtype = numpy.uint32, use_zstd: bool = True)
Bases: object
Zstd decoder for contact matrix tiles.
- Parameters:
Examples
- class gunz_cm.compressions.ZstdEncoder(tile_size: int = 256, resolution: int = 50000, level: int = 3, use_zstd: bool = True)
Bases: object
Zstd encoder for contact matrix tiles.
- Parameters:
Examples
- encode_tile(mat: ndarray) -> bytes
Encode a single contact matrix tile.
- Parameters:
mat (np.ndarray) – 2D contact matrix tile.
- Returns:
Compressed bitstream.
- Return type:
bytes
Examples