Skip to content

Mach-1: A long-context RNA foundation model for predicting transcriptome architecture. Training pipeline and model weights.

License

Notifications You must be signed in to change notification settings

goodarzilab/mach-1

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Mach-1

Mach-1 is a long-context RNA foundation model for predicting transcriptome architecture. This repository contains the training pipeline and model weights.

Repository Structure

lorna-sh
├── processing-seqs/          # Data preprocessing scripts
│   ├── lornash_tokenizer.json    # Tokenizer configuration
│   ├── prepare_data.R            # Data preparation script
│   └── tokenize_data.py          # Sequence tokenization script
├── training-model/          # Model training scripts
│   ├── configuration_hyena.py     # Model configuration
│   ├── generate_seqs.py          # Sequence generation script
│   ├── get_embeddings.py         # Embedding extraction
│   ├── get_likelihoods.py        # Likelihood computation
│   ├── lornash_dependencies.sh    # Dependencies installation
│   ├── modeling_hyena.py         # Core model architecture
│   └── train_model.py            # Training script
└── weights/                # Model weights
    └── model.safetensors       # Trained model weights

About

Mach-1: A long-context RNA foundation model for predicting transcriptome architecture. Training pipeline and model weights.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published