ipSAE ranking, Dockerfile upgrade, nipah Glycoprotein G design support#86
lhallee wants to merge 20 commits into
Hi @lhallee, thank you very much for that! I was wondering, where exactly do you find the full PAE matrices? I tried to trace them from your edits on some test runs of mine, but I see no full PAE matrices in these npz files, only aggregate values, e.g. interaction_pae.
Hey @zehanort, If I understand the boltzgen code correctly, there is a Notably, the ipSAE is the "min" ipSAE version, which was found to have the highest discriminative power in the meta-analysis linked above.
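For readers unfamiliar with the "min" variant, here is a minimal pure-Python sketch of how a directional ipSAE and its min over both directions could be computed from a full PAE matrix. It is a simplified illustration, not the code in this PR or in the ipSAE repo: the function names, the 10 Å default cutoff, and the exact d0 clamping are assumptions.

```python
def d0(n_res: int) -> float:
    """TM-score style scaling distance (assumed clamping: n >= 27, d0 >= 1.0)."""
    n = max(n_res, 27)
    return max(1.0, 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8)


def directional_ipsae(pae, rows, cols, cutoff=10.0):
    """Best per-residue score from chain `rows` to chain `cols`.

    `pae` is a square nested list (full PAE matrix, hypothetical input);
    only pairs with PAE below `cutoff` contribute.
    """
    best = 0.0
    for i in rows:
        vals = [pae[i][j] for j in cols if pae[i][j] < cutoff]
        if not vals:
            continue  # residue i has no confident contacts with the other chain
        scale = d0(len(vals))
        score = sum(1.0 / (1.0 + (v / scale) ** 2) for v in vals) / len(vals)
        best = max(best, score)
    return best


def min_ipsae(pae, chain_a, chain_b, cutoff=10.0):
    """The "min" variant: the weaker of the two directional scores."""
    return min(
        directional_ipsae(pae, chain_a, chain_b, cutoff),
        directional_ipsae(pae, chain_b, chain_a, cutoff),
    )
```

Taking the min means a design only scores well when both directions of the PAE matrix agree that the interface is confident.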
Hi all, the easiest place to make these edits should be in confidence_utils.py. If you add it there (e.g. similar to how the iiptm score is computed there), I would add it to the repo.
@lhallee Thanks for the great work! I made a fork of the original ipSAE repo to make interfacing with the scores easier from other Python code. Feel free to use this instead of your subprocess-based method. The results in If designing a binder against a multi-chain target, I also think it makes sense to group all target chains into one during the ipSAE calculations. But for now it's just a hypothesis that I haven't had the chance to test yet.
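To make the grouping idea concrete, here is a small sketch of how residue indices could be split into "binder" and "all target chains merged" before any pairwise score is computed. The function name and the per-residue chain ID list are hypothetical, and whether grouping actually improves ranking is untested, as noted above.

```python
def group_target_chains(chain_ids, binder_chain):
    """Split residue indices into binder vs. one merged target group.

    `chain_ids` is a per-residue list of chain labels (hypothetical input);
    every chain other than `binder_chain` is treated as a single target.
    """
    binder = [i for i, c in enumerate(chain_ids) if c == binder_chain]
    target = [i for i, c in enumerate(chain_ids) if c != binder_chain]
    return binder, target


# Example: binder is chain A, target consists of chains B and C.
binder_idx, target_idx = group_target_chains(list("AABBBCC"), "A")
```

The two index lists can then be passed to whatever binder-vs-target scoring routine is in use, so the target is scored as one unit rather than chain by chain.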
Hello @HannesStark,
The Nipah Binder Competition is ongoing and uses ipSAE as the main metric for ranking results. I wanted to add some simple functionality so that boltzgen returns the PAE in an easy-to-read format, plus a script, calc_ipsae.py, that automatically scores and ranks all of the model's final outputs.
I also added an enhanced Dockerfile, which fixes a Python version bug and makes sure the user is set up with an up-to-date PyTorch, etc.
Added to the examples is a simple workflow for designing Nipah Glycoprotein G binders, with the solved PDB structure leveraged.
To design competition binders with docker
Then, rank by ipSAE
Which places a summary here:
<output_dir>/designs_ranked_by_ipsae_summary.csv
and a FASTA here:
<output_dir>/designs_ranked_by_ipsae.fasta
Ran the workflow above with --num_designs 128 and got some decent results. Looks like high-ipSAE outputs from boltzgen are fairly sparse but obviously possible!
Feel free to reorganize and use the contributed code however you'd like.
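For anyone who wants to reproduce the ranking step outside the script, here is a rough sketch of sorting designs by ipSAE and writing the two files named above. It is not the contents of calc_ipsae.py: the function name, the input dict keys, the CSV columns, and the FASTA header layout are all assumptions for illustration.

```python
import csv
from pathlib import Path


def write_ranked_outputs(designs, output_dir):
    """Sort designs by ipSAE (highest first) and write a CSV summary and a FASTA.

    `designs` is a list of dicts with hypothetical keys: name, sequence, ipsae.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    ranked = sorted(designs, key=lambda d: d["ipsae"], reverse=True)

    csv_path = out / "designs_ranked_by_ipsae_summary.csv"
    with csv_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["rank", "name", "ipsae"])
        writer.writeheader()
        for rank, d in enumerate(ranked, start=1):
            writer.writerow(
                {"rank": rank, "name": d["name"], "ipsae": f"{d['ipsae']:.4f}"}
            )

    fasta_path = out / "designs_ranked_by_ipsae.fasta"
    with fasta_path.open("w") as fh:
        for rank, d in enumerate(ranked, start=1):
            fh.write(f">{d['name']} rank={rank} ipsae={d['ipsae']:.4f}\n")
            fh.write(f"{d['sequence']}\n")

    return csv_path, fasta_path
```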
Best,
Logan