Cherimoya

Requires Python 3.10+.

A compact deep learning model for predicting genomic profile data from DNA sequence.

Cherimoya predicts genomic modalities — transcription factor binding, chromatin accessibility, and transcription initiation — directly from DNA sequence. It pairs a lightweight ConvNeXt-style backbone with custom Triton GPU kernels for both training and inference, and ships with an end-to-end CLI that takes BAM files through peak calling, training, attribution, and motif discovery in a single command.

Under Active Development

Cherimoya is still evolving and may introduce breaking changes between versions. Pin the version you train with if you need to reload checkpoints later.

Where to start

If a term is unfamiliar, the Glossary defines the terms used throughout these docs. If something is going wrong, start with Troubleshooting and FAQ.

Design highlights

  • Cheri Blocks. A dilated depthwise convolution fused with a per-example layer normalization and a channel-mixing MLP, implemented as a custom Triton kernel. The default 9-layer model has ~340K parameters and a 1115 bp receptive field.

  • Three forward paths, one set of weights. A CPU fallback, a Triton forward-and-backward kernel for training, and a forward-only megakernel for inference, all numerically equivalent to within ~1e-5 maximum absolute difference.

  • Dual-optimizer training. Muon for 2D projection weights, AdamW for everything else, with hyperparameters tuned via large-scale sweeps.

  • Learned loss balancing. Kendall-Gal uncertainty weighting with two learnable scalars replaces a fixed profile/counts loss weight.

  • EMA at evaluation. An exponential moving average of the parameters is maintained during training and used at evaluation, smoothing both the validation curve and the final predictions.

  • Stability-first defaults. Small fixed residual scale at initialization, no biases inside Cheri Blocks, no weight decay on Muon-routed weights, and a 5-epoch warmup before cosine decay.
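
As a rough illustration of the warmup-then-cosine schedule named in the last bullet, here is a minimal sketch in plain Python. The linear warmup shape, the `learning_rate` function name, the 50-epoch run, the base rate of 1e-3, and the zero floor are all assumptions made for the example; only the 5-epoch warmup and the cosine decay come from the text above, and Cherimoya's own scheduler may differ in detail.

```python
import math


def learning_rate(epoch: int, total_epochs: int, base_lr: float,
                  warmup_epochs: int = 5, min_lr: float = 0.0) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to min_lr."""
    if total_epochs <= warmup_epochs:
        raise ValueError("total_epochs must exceed warmup_epochs")
    if epoch < warmup_epochs:
        # Ramp from base_lr / warmup_epochs up to base_lr (assumed linear).
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min((epoch - warmup_epochs) / (total_epochs - warmup_epochs), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Example values only: a 50-epoch run with a peak rate of 1e-3.
schedule = [learning_rate(e, total_epochs=50, base_lr=1e-3) for e in range(50)]
```

In this sketch the rate reaches its peak at the end of the fifth epoch and then falls smoothly, so the largest steps are taken only after the early, noisy updates have passed.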

See Architecture for the full story and Benchmarks for measured numbers.
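
To make the learned loss balancing highlight more concrete, the sketch below shows one common form of Kendall-Gal uncertainty weighting with two learnable scalars. The parameterization by log-variances (`s_profile`, `s_count`), the omission of constant factors, the function names, and the loss values are assumptions chosen for illustration; the form Cherimoya actually uses may differ.

```python
import math


def balanced_loss(profile_loss: float, count_loss: float,
                  s_profile: float, s_count: float) -> float:
    """Combine two losses using learned log-variance scalars (assumed form)."""
    # exp(-s) down-weights a task; the additive s term keeps s from growing unbounded.
    return (math.exp(-s_profile) * profile_loss + s_profile
            + math.exp(-s_count) * count_loss + s_count)


def grad_wrt_s(loss: float, s: float) -> float:
    """Derivative of exp(-s) * loss + s with respect to s."""
    return 1.0 - math.exp(-s) * loss


# Hypothetical, fixed loss values on very different scales.
profile_loss, count_loss = 120.0, 0.8
s_profile, s_count = 0.0, 0.0
for _ in range(2000):
    # Plain gradient descent on the two scalars only.
    s_profile -= 0.05 * grad_wrt_s(profile_loss, s_profile)
    s_count -= 0.05 * grad_wrt_s(count_loss, s_count)
```

With the losses held fixed, each scalar settles at the logarithm of its loss, so both weighted terms end up with the same magnitude regardless of their raw scale. In real training the losses change as the model improves, and the scalars track them.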