FastPLMs is an open-source effort to increase the efficiency of pretrained protein language models by switching out native attention implementations for Flash or Flex attention.
All models can be loaded through Hugging Face 🤗 Transformers via AutoModel, so this repository does not need to be cloned for most use cases.
The currently supported models can be found here.
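As a minimal loading sketch (the model ID below is an illustrative placeholder; substitute any checkpoint from the supported models list, and note that trust_remote_code=True is typically required for models that ship custom modeling code):

```python
import torch
from transformers import AutoModel

# 'Synthyra/ESMplusplus_small' is used here only as an example ID;
# replace it with any supported FastPLMs checkpoint.
model = AutoModel.from_pretrained('Synthyra/ESMplusplus_small', trust_remote_code=True)
model = model.eval().to('cuda' if torch.cuda.is_available() else 'cpu')
```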
Have suggestions, comments, or requests? Found a bug? Open a GitHub issue and we'll respond soon.
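Each supported model exposes an embed_dataset method for embedding a list of sequences in one call; by default it returns the embeddings in memory: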
embeddings = model.embed_dataset(
sequences=sequences, # list of protein strings
batch_size=16, # embedding batch size
max_len=2048, # truncate to max_len
full_embeddings=True, # return residue-wise embeddings
full_precision=False, # if False, store embeddings in half precision to save memory
pooling_type='mean', # pooling strategy when full_embeddings=False (protein-wise embeddings)
num_workers=0, # number of dataloader workers
sql=False, # return dictionary of sequences and embeddings
)
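As a hedged usage sketch (assuming the returned dictionary is keyed by the input sequence strings, as the sql=False comment above suggests), the result can be accessed like this:

```python
# Look up the embedding for the first input sequence.
# Keys are assumed to be the sequence strings themselves.
first_embedding = embeddings[sequences[0]]
# With full_embeddings=True this is expected to be residue-wise,
# roughly (seq_len, hidden_dim), with seq_len capped at max_len.
print(first_embedding.shape)
```

For large datasets that do not fit in memory, embeddings can instead be written to a local SQLite database by passing sql=True: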
_ = model.embed_dataset(
sequences=sequences, # list of protein strings
batch_size=16, # embedding batch size
max_len=2048, # truncate to max_len
full_embeddings=True, # return residue-wise embeddings
full_precision=False, # if False, store embeddings in half precision to save memory
pooling_type='mean', # pooling strategy when full_embeddings=False (protein-wise embeddings)
num_workers=0, # number of dataloader workers
sql=True, # store sequences and embeddings in a local SQLite database
sql_db_path='embeddings.db', # path to .db file of choice
)
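The stored embeddings can later be read back with Python's built-in sqlite3 module. The table and column names below are assumptions for illustration, not the confirmed schema; inspect the generated embeddings.db (e.g., with .schema in the sqlite3 CLI) before relying on them:

```python
import sqlite3
import numpy as np

conn = sqlite3.connect('embeddings.db')
cursor = conn.cursor()

# Table/column names ('embeddings', 'sequence', 'embedding') are assumed,
# as is the float32 storage dtype; verify against the actual database schema.
cursor.execute('SELECT sequence, embedding FROM embeddings')
for sequence, blob in cursor.fetchall():
    emb = np.frombuffer(blob, dtype=np.float32)
    print(sequence[:20], emb.shape)

conn.close()
```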
The ANKH implementation is in progress: it is functional but still uses native attention rather than Flash or Flex attention.