Cherimoya
A lightweight genomic sequence-to-function model.
Cherimoya predicts genomic modalities — transcription factor binding, chromatin accessibility, and transcription initiation — from DNA sequence alone. It builds on concepts from BPNet and ChromBPNet while introducing architectural, algorithmic, and systems-level improvements for better stability, efficiency, and performance.
Under Active Development
Cherimoya is still evolving and may change in ways that are not backward compatible. Please note the version you are using.
Why Cherimoya?
Popular sequence-to-function (S2F) models like BPNet and ChromBPNet have revolutionized our ability to interpret regulatory sequences, but they often require millions of parameters and extensive tuning. Cherimoya provides a modern alternative:
Efficient Architecture: Uses significantly fewer parameters while maintaining or exceeding state-of-the-art predictive performance.
Speed: Runs much faster on modern GPUs (e.g., H200) thanks to custom Triton kernels that fuse dilated convolutions and layer normalization.
Automated Tuning: Replaces manual loss balancing heuristics with learned weighting parameters that adapt to your data’s signal-to-noise characteristics.
Modern Optimization: Leverages the Muon optimizer and dual-optimizer strategies to reduce training epochs and improve convergence.
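To make the "learned weighting parameters" idea concrete, here is a minimal sketch of one common way to learn loss weights: uncertainty-style weighting, where each loss term L_i gets a learnable log-variance s_i and the total is the sum of exp(-s_i) * L_i + s_i. This is an assumption chosen for illustration, not necessarily the scheme Cherimoya implements. The function names and loss values below are hypothetical, and the losses are held fixed here, whereas in real training they would change along with the model.

```python
import math

def weighted_loss(losses, log_vars):
    """Combine loss terms as sum_i exp(-s_i) * L_i + s_i (assumed formulation)."""
    return sum(math.exp(-s) * l + s for l, s in zip(losses, log_vars))

def log_var_grads(losses, log_vars):
    """Analytic gradient of the combined loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * l for l, s in zip(losses, log_vars)]

def fit_log_vars(losses, steps=2000, lr=0.05):
    """Plain gradient descent on the s_i, with the loss terms held fixed."""
    log_vars = [0.0] * len(losses)
    for _ in range(steps):
        grads = log_var_grads(losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars

# Hypothetical magnitudes: a large profile-shape loss and a small count loss.
profile_loss, count_loss = 250.0, 0.8
log_vars = fit_log_vars([profile_loss, count_loss])
weights = [math.exp(-s) for s in log_vars]

print("learned weights:", weights)
print("combined loss:", weighted_loss([profile_loss, count_loss], log_vars))
```

With the losses fixed, each s_i settles at log(L_i), so every weighted term ends up on a comparable scale without a hand-picked balancing constant. That is the behavior a manual loss-balancing heuristic would otherwise have to approximate.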
Getting Started