gunz_cm.samplers package
Submodules
gunz_cm.samplers.spatial module
Samplers for spatial data locality optimization.
- class gunz_cm.samplers.spatial.SpatialBatchSampler(dataset_index: DataFrame, batch_size: int, block_size: int = 128, shuffle: bool = True, drop_last: bool = False)
Bases: Sampler[List[int]]
A BatchSampler that yields mini-batches ordered by spatial proximity to maximize cache hits in compressed files.
To maintain randomness for SGD, it performs ‘Block Shuffling’:
1. Sorts the dataset spatially.
2. Groups indices into ‘mega-blocks’ (e.g., 50-100 samples).
3. Shuffles the order of mega-blocks.
4. Yields sequential mini-batches from within each mega-block.
Examples
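The four block-shuffling steps can be illustrated with a minimal, standalone sketch. This is not the gunz_cm implementation: the function name block_shuffled_batches, the one-dimensional positions key, the seed parameter, and the per-block handling of drop_last are all assumptions made for illustration. The real class takes a DataFrame, and how it derives the spatial sort key is not stated here.

```python
import random
from typing import Iterator, List, Sequence


def block_shuffled_batches(
    positions: Sequence[int],  # hypothetical 1-D spatial key, one per sample
    batch_size: int,
    block_size: int = 128,
    shuffle: bool = True,
    drop_last: bool = False,
    seed: int = 0,  # hypothetical; added so the sketch is reproducible
) -> Iterator[List[int]]:
    """Yield mini-batches of dataset indices using block shuffling."""
    # 1. Sort dataset indices by their spatial position.
    order = sorted(range(len(positions)), key=lambda i: positions[i])

    # 2. Group the sorted indices into contiguous mega-blocks.
    blocks = [order[i:i + block_size] for i in range(0, len(order), block_size)]

    # 3. Shuffle the order of the mega-blocks, not the samples inside them.
    if shuffle:
        random.Random(seed).shuffle(blocks)

    # 4. Yield sequential mini-batches from within each mega-block.
    for block in blocks:
        for start in range(0, len(block), batch_size):
            batch = block[start:start + batch_size]
            # Assumption: short trailing batches are dropped per block.
            if drop_last and len(batch) < batch_size:
                continue
            yield batch


positions = [50, 10, 40, 20, 30, 60, 80, 70]
batches = list(
    block_shuffled_batches(positions, batch_size=2, block_size=4, shuffle=False)
)
print(batches)
```

With shuffle=False the batches simply follow the sorted order. With shuffle=True only the order of the mega-blocks changes, so every batch still contains spatially adjacent samples.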
Module contents
- class gunz_cm.samplers.SpatialBatchSampler(dataset_index: DataFrame, batch_size: int, block_size: int = 128, shuffle: bool = True, drop_last: bool = False)
Bases: Sampler[List[int]]
A BatchSampler that yields mini-batches ordered by spatial proximity to maximize cache hits in compressed files.
To maintain randomness for SGD, it performs ‘Block Shuffling’:
1. Sorts the dataset spatially.
2. Groups indices into ‘mega-blocks’ (e.g., 50-100 samples).
3. Shuffles the order of mega-blocks.
4. Yields sequential mini-batches from within each mega-block.
Examples
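To see why spatial ordering can improve cache hits, the rough sketch below counts how many distinct storage chunks each mini-batch touches under a fully random order versus a block-shuffled order. It rests on several assumptions that do not come from gunz_cm: samples are stored in spatial order, the file is compressed in fixed-size chunks of chunk_size samples, and reading any sample requires decompressing its whole chunk. The helper names are hypothetical.

```python
import random
from typing import List


def chunks_touched(batch: List[int], chunk_size: int) -> int:
    """Number of distinct storage chunks a batch reads from."""
    return len({idx // chunk_size for idx in batch})


def mean_chunks_per_batch(order: List[int], batch_size: int, chunk_size: int) -> float:
    """Average number of chunks decompressed per mini-batch."""
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    return sum(chunks_touched(b, chunk_size) for b in batches) / len(batches)


# Assumed sizes; index i is taken to be the i-th sample in spatial order.
n, batch_size, block_size, chunk_size = 1024, 16, 128, 64
rng = random.Random(0)

# Fully random order: each batch draws samples from all over the file.
random_order = list(range(n))
rng.shuffle(random_order)

# Block-shuffled order: shuffle mega-blocks, keep samples inside them sequential.
blocks = [list(range(i, i + block_size)) for i in range(0, n, block_size)]
rng.shuffle(blocks)
block_order = [idx for block in blocks for idx in block]

random_cost = mean_chunks_per_batch(random_order, batch_size, chunk_size)
block_cost = mean_chunks_per_batch(block_order, batch_size, chunk_size)
print(f"random order:   {random_cost:.2f} chunks per batch")
print(f"block-shuffled: {block_cost:.2f} chunks per batch")
```

Under these assumptions every block-shuffled batch falls inside a single chunk, while a randomly ordered batch is spread over many chunks. The actual benefit depends on the file's real chunk layout and cache size.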