WIP Jameshennessy/unittests #3
Open
jamesthesnake wants to merge 62 commits into OpenBioML:main from pinellolab:jameshennessy/unittests
…ions for the DHS Index data
…ions for the DHS Index data
Fixes formatting in the 'Tasks and Potential Roadmap' section
Add a new scatter plot of the KL divergence data. The new metric now works (previously, the final sampled sequences were mixed with noised versions).
Resolve contributor list in an independent file
Linked the gdoc sheet to keep this list updated based on what we discussed at the last meeting.
Minor formatting
Add discord link for people interested in joining
Sahu, B., Hartonen, T., Pihlajamaa, P. et al. Sequence determinants of human gene regulatory elements. Nat Genet 54, 283–294 (2022). https://doi.org/10.1038/s41588-021-01009-4
…fusion into dna-diffusion
https://huggingface.co/blog/annotated-diffusion)
- MILESTONE -> Conditioning is tested and working (check the diagonal plot after training to see the results) (thanks, Zach).
- Training on 1000 sequences is enough to see results after ~1000 epochs (KL divergence decreasing against train and test sequences and increasing against shuffled sequences).
- Added EMA to smooth the training.
- Added a function to handle data loading, motif generation, and training creation.
- New metrics added: KL divergence tested against (test, train, train shuffled) sequences, plus a per-component heatmap plot.
- Added a simple loss visualization using livelossplot.
- Refactored several functions to plot motif metrics (others are inside the new dataloader class).

Contributions: Lucas Ferreira da Silva, Luca Pinello, Zach Nussbaum
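For readers unfamiliar with the KL divergence check mentioned in the commit notes above, here is a minimal sketch of one way such a metric could be computed. It is not the code from this PR. It assumes the comparison is made between motif-frequency distributions, and the names `motif_distribution` and `kl_divergence`, the pseudocount, and the toy motif lists are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def motif_distribution(motif_hits, vocabulary, pseudocount=1.0):
    """Turn a list of motif hits into a smoothed probability distribution.

    Hypothetical helper: the pseudocount keeps every probability non-zero,
    so the KL divergence below stays finite. Hits that are not in
    `vocabulary` are ignored.
    """
    counts = Counter(motif_hits)
    total = sum(counts[m] for m in vocabulary) + pseudocount * len(vocabulary)
    return [(counts[m] + pseudocount) / total for m in vocabulary]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy data, made up for this sketch (not taken from the DHS Index dataset).
vocabulary = ["GATA1", "CTCF", "SP1"]
generated_hits = ["GATA1", "GATA1", "CTCF", "SP1"]
train_hits = ["GATA1", "GATA1", "CTCF", "SP1", "GATA1"]
shuffled_hits = ["SP1", "SP1", "SP1", "CTCF"]

generated = motif_distribution(generated_hits, vocabulary)
train = motif_distribution(train_hits, vocabulary)
shuffled = motif_distribution(shuffled_hits, vocabulary)

kl_vs_train = kl_divergence(generated, train)
kl_vs_shuffled = kl_divergence(generated, shuffled)
print(f"KL(generated || train)    = {kl_vs_train:.4f}")
print(f"KL(generated || shuffled) = {kl_vs_shuffled:.4f}")
```

Under these assumptions, a model that is learning the training distribution should show a lower value against the train and test sequences than against the shuffled ones, which is the trend the notes describe.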
Adding the modified NB
No description provided.