-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add phosphorylated residues SEP and PTR (pSer and pTyr) #261
Conversation
Lots of tests are failing with errors like this:
With the original version, we manually removed the duplicates in #132 (see also #131). We could try to add logic to the script to automatically remove duplicates, but just doing it by hand again is probably simplest. |
I looked at the failure log and it looks like there were only four residues recognized as duplicates are causing the failures: LYN, CYM, GLH, and ASH. I've deleted these four residue templates like was done in #132 (but in #132 a different set of residues were deleted than here, interestingly). Hopefully this fixes things! |
There are a few others that need to be removed too: CYX, HID, HIE. Also rename HIP to HIS. (HID, HIE, and HIP are all variants of histidine which are identical after removing hydrogens.) Then for each residue you've removed, there are separate templates corresponding to N- and C- terminal residues. Their names are the same but with N or C at the beginning. The same changes need to be made to them. If you're not sure about something, take a look at the diff of #132. |
Ah I see! Didn't want to mess around and delete established residues so I wasn't sure. I've gone ahead and deleted CYX, HID, HIE, and renamed HIP->HIS. I then did the same but for N- and C-terminal versions of any of these residues (or the ones I deleted in the previous commit). I went through the diff at #132 and I think all the residues deleted there are now correspondingly removed from soft.xml here. I can't think of anything else that would cause issues? |
This all looks good now. I assume you've tested it on a file containing those residues to make sure it works? I would suggest adding one to |
Oh I haven't yet! I do have a structure of a kinase (attached below) and I'll test it on that...the only test I can think of at the moment is to load it in, protonate it, and then simulate it...anything else I should try out? |
That's fine. Just to make sure it's able to build missing residues of the new types correctly. |
Hm, looks like protonation is an issue... I just ran:
and it wrote out a PDB file without complaint, but SEP isn't protonated (every other residue IS protonated though). In retrospect this isn't surprising since I don't think I ever specified anything about protonation of SEP residues? Should the template of SEP I include just have the protons included by default? I didn't include hydrogens in SEP.pdb because I thought none of the template PDB files should be protonated? |
Adding hydrogens is done by OpenMM's Modeller class. It has its own internal specification of what hydrogens to add to what residues, and of course it doesn't include these two. You can call http://docs.openmm.org/latest/userguide/application/03_model_building_editing.html#adding-hydrogens And here's the file with the built in definitions. The format should hopefully be clear. https://github.com/openmm/openmm/blob/master/wrappers/python/openmm/app/data/hydrogens.xml This does raise an issue: this feature can't be fully implemented just by changes to PDBFixer. If you just need it for your own scripts, of course, it's fine to put a call to |
Ah! I see! Makes sense --- let me try using it and see if I can get it working using either the
Hrmm the intention of this whole addition was to have it be useful for some other folks besides me as well. Also, it is rather nice to be able to auto-build in the heavy atoms of phosphorylated residues into protein chains rather than using other closed-source tools like Maestro where atom names/orders may be incompatible, so I see no reason not to at least be able to include these new residues. How about a middle ground solution: Rather than modifying OpenMM perhaps we can just include an appropriate |
If I understand correctly, @sukritsingh's goal here was to permit users to use Would More generally, we might want to make |
True, we could make make it automatically load its own definitions. |
Update: I've gotten it to to load the hydrogen definitions (I think) ok, but I'm encountering an issue about bonds being different? Mutation from SER to SEP works when I run the following:
I then load in a
However, when I run the final
Correct me if I'm wrong, but does this mean that the SEP listed in the |
Ah! I think it's related to this comment above:
TLDR: Any advice on what to change so PDBFixer writes out the full In the example file I'm using for all this (attached below), if you you mutate SER->SEP using:
It does NOT write out the complete
What it should write out is (Edit: Found in
I think because it's non-standard it's only writing out the conect records for the standard part of the SEP residue. On the bright side, as soon as you use the correct Any insight on what changes to make to allow the correct CONECT records to be written? |
PDBFile decides which bonds to write based on the name of the residue: for atom1, atom2 in topology.bonds():
if atom1.residue.name not in PDBFile._standardResidues or atom2.residue.name not in PDBFile._standardResidues:
conectBonds.append((atom1, atom2)) If those bonds aren't being written, that suggests they aren't present in the topology. |
Ah ok! Just to clarify the code snippet, it seems like anything outside the standard residue set (ie things involving HETATOMS) would be written out as Would the topology information be contained with the residue record for SEP in
Could there be confusion somehow because half the SEP topology is a standard residue, and then there's a few more atoms attached, or that the topology isn't being updated properly? Behaviorally, it's almost like it created |
There's no such thing as a "half standard residue". A residue is either a heterogen or it isn't. Here's what the spec says about CONECT records:
Since SEP is not a standard amino acid, the entire residue is a heterogen and all bonds in it need to be listed. PDBFile decides what to write based on the residue name. Since 'SEP' is not contained in |
Ahh ok thanks for clarifying! Sorry for all the questions - just trying to parse what's going on.... It looks like, just as you mentioned, the entire residue (pre-protonation) is being written out as heterogens (with
It looks like all the atoms do get added, but the bonds aren't rebuilt, which would be happening in |
Add forcefield arg to addMissingHydrogens (Merge openmm#262 into fork)
It's from the template. That's the problem: your template doesn't have CONECT records. The existing templates don't need them because they're standard residues. SEP and PTR aren't, so they do need them. |
Ok great! Let me try adding the correct CONECT records into the template Thanks so much for all your help! |
You can print out all the bonds involving the residue with something like this: print([(a1, a2) for a1, a2 in fixer.topology.bonds() if 'SEP' in [a1.residue.name, a2.residue.name]]) Print that at each point in the script and see whether the residue contains the bonds you expect it to at that point. |
Finally returning to this (so sorry for the delay - I got buried in K99 grant writing right after that last post and it was my first time writing an NIH grant!) So I went through and ran through the above commands line by line and then ran: to write out the bonds after each step. The problem bonds are missing occurred right after I run:
I've pasted out the bonds below for completeness, but to summarize, it's only got the bonds of the original SER residue that was in that position. It does not seem to have any record of the added bonds when I run Should the bonds be recognized right after applying the mutation? I had assumed so but in looking at Any advice as to what to next try? Bonds output from the
This is the bonds output from running
|
Small update: Dug around a bit more and noticed that if I run: Then the
Interestingly, I also noticed that the The section for residue
with no mention of the new atoms. Given that |
This PR provides the following changes in reference to #259 :
pdbfixer/pdbfixer.py
is updated to add these residues as well.devtools/createSoftForcefield.py
for the current state ofatomType
objectspdbfixer/soft.xml
file using the newly updated scriptdevtools/createSoftForcefield.py