Poses ordering in a sdf mk_exported from a dlg output. #311
Comments
Hi Saverio, in the latest version (v0.6.1) poses are sorted by score by default. In earlier versions, you can pass option |
Hi, Saverio |
Hi, Here is what I've done for one .dlg to report the problem: $ cd sources/Meeko-develop/ Where am I going wrong? Thanks. Saverio Lemme |
With --all_dlg_poses the poses won't be ordered. By default, without --all_dlg_poses, mk_export.py exports only the cluster leads, which are already sorted by autodock-gpu. So if you don't pass this option, the poses will be sorted. |
Hi, $ less DB14859_docking_res_ad4_sf.sdf | grep free_energy | awk '{ print $4 }' | wc -l Is it possible to have an sdf file that contains all the poses in the dlg file, sorted by free energy? Thanks. Saverio |
Hi @xavgit Please see a minimal Python script that uses RDKit to do this:

from rdkit import Chem
import json

input_sdf = "DB14859_docking_res_ad4_sf.sdf"
output_sdf = "DB14859_docking_res_ad4_sf_sorted.2.sdf"

def extract_free_energy(mol):
    meeko_prop = mol.GetProp('meeko')
    # Parse the property as JSON to extract "free_energy"
    meeko_data = json.loads(meeko_prop)
    # Poses without a "free_energy" entry sort last
    return float(meeko_data.get('free_energy', float('inf')))

# SDMolSupplier yields None for records it cannot parse, so skip those
input_mols = [mol for mol in Chem.SDMolSupplier(input_sdf) if mol is not None]
sorted_mols = sorted(input_mols, key=extract_free_energy)

writer = Chem.SDWriter(output_sdf)
for mol in sorted_mols:
    writer.write(mol)
writer.close() |
Hi, In the meantime I made a very basic script to solve the ordering problem. Thanks. Saverio |
Hi, thank you for your kind words. There are many ways to do the sorting (the bash text-manipulation tools can be used too). Closing this issue as resolved, but please feel free to re-open it if you have any questions, thoughts, or suggestions. |
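As one text-only way to do the sorting mentioned above, the sketch below reorders whole SDF records without RDKit. It assumes each record ends with a `$$$$` line and that the Meeko data sits on a single JSON line containing `"free_energy"`, as in the grep output shown in this issue. The function names are invented for illustration and are not part of Meeko.

```python
import json


def split_sdf_records(text):
    """Split SDF text into records, each ending with its '$$$$' line."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "$$$$":
            records.append("".join(current))
            current = []
    # Keep any trailing text that lacks a final '$$$$' delimiter
    if "".join(current).strip():
        records.append("".join(current))
    return records


def free_energy_of(record):
    """Return free_energy from the first JSON line that mentions it, else +inf."""
    for line in record.splitlines():
        if '"free_energy"' in line:
            return float(json.loads(line)["free_energy"])
    return float("inf")


def sort_sdf_text(text):
    """Return the SDF text with records ordered from lowest free_energy up."""
    return "".join(sorted(split_sdf_records(text), key=free_energy_of))
```

To use it, read the exported file into a string, pass it to `sort_sdf_text`, and write the result to a new file. Because the records are moved as plain text, the atom blocks are left exactly as exported. By contrast, RDKit's `SDMolSupplier` removes hydrogens by default unless `removeHs=False` is passed, which may matter for docked poses.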
Hi,
Inspecting an sdf exported with mk_export.py from a dlg output, I've noticed that the poses
are not ordered starting from the one with the lowest free energy.
In my case:
$ less ../docking_results_sdf/DB03966_docking_res_ad4_sf.sdf | grep free_energy
{"is_sidechain": [false], "free_energy": -6.22, "intermolecular_energy": -10.4, "internal_energy": -5.62, "cluster_size": 1, "cluster_id": 10, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.67, "intermolecular_energy": -10.85, "internal_energy": -5.2, "cluster_size": 2, "cluster_id": 8, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.76, "intermolecular_energy": -11.94, "internal_energy": -5.06, "cluster_size": 1, "cluster_id": 2, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.26, "intermolecular_energy": -11.44, "internal_energy": -5.63, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -6.98, "intermolecular_energy": -11.16, "internal_energy": -5.2, "cluster_size": 1, "cluster_id": 5, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.55, "intermolecular_energy": -11.73, "internal_energy": -5.03, "cluster_size": 1, "cluster_id": 3, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -8.03, "intermolecular_energy": -12.2, "internal_energy": -4.2, "cluster_size": 1, "cluster_id": 1, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.27, "intermolecular_energy": -11.44, "internal_energy": -5.74, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.94, "intermolecular_energy": -11.12, "internal_energy": -4.87, "cluster_size": 1, "cluster_id": 6, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.71, "intermolecular_energy": -10.89, "internal_energy": -5.41, "cluster_size": 1, "cluster_id": 7, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.35, "intermolecular_energy": -10.53, "internal_energy": -5.46, "cluster_size": 2, "cluster_id": 8, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -6.03, "intermolecular_energy": -10.21, "internal_energy": -5.51, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -5.81, "intermolecular_energy": -9.99, "internal_energy": -5.59, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 3}
{"is_sidechain": [false], "free_energy": -6.02, "intermolecular_energy": -10.19, "internal_energy": -5.52, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -6.42, "intermolecular_energy": -10.6, "internal_energy": -5.65, "cluster_size": 1, "cluster_id": 9, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.12, "intermolecular_energy": -11.3, "internal_energy": -5.57, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 3}
Is it possible to add an option to mk_export.py so that the poses in the sdf are
ordered starting from the lowest free energy?
For example, setting aside the sort -k4,4r, to get something like the following:
$ less ../docking_results_sdf/DB03966_docking_res_ad4_sf.sdf | grep free_energy | sort -k4,4r
{"is_sidechain": [false], "free_energy": -8.03, "intermolecular_energy": -12.2, "internal_energy": -4.2, "cluster_size": 1, "cluster_id": 1, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.76, "intermolecular_energy": -11.94, "internal_energy": -5.06, "cluster_size": 1, "cluster_id": 2, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.55, "intermolecular_energy": -11.73, "internal_energy": -5.03, "cluster_size": 1, "cluster_id": 3, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.27, "intermolecular_energy": -11.44, "internal_energy": -5.74, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -7.26, "intermolecular_energy": -11.44, "internal_energy": -5.63, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -7.12, "intermolecular_energy": -11.3, "internal_energy": -5.57, "cluster_size": 3, "cluster_id": 4, "rank_in_cluster": 3}
{"is_sidechain": [false], "free_energy": -6.98, "intermolecular_energy": -11.16, "internal_energy": -5.2, "cluster_size": 1, "cluster_id": 5, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.94, "intermolecular_energy": -11.12, "internal_energy": -4.87, "cluster_size": 1, "cluster_id": 6, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.71, "intermolecular_energy": -10.89, "internal_energy": -5.41, "cluster_size": 1, "cluster_id": 7, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.67, "intermolecular_energy": -10.85, "internal_energy": -5.2, "cluster_size": 2, "cluster_id": 8, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.42, "intermolecular_energy": -10.6, "internal_energy": -5.65, "cluster_size": 1, "cluster_id": 9, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.35, "intermolecular_energy": -10.53, "internal_energy": -5.46, "cluster_size": 2, "cluster_id": 8, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -6.22, "intermolecular_energy": -10.4, "internal_energy": -5.62, "cluster_size": 1, "cluster_id": 10, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.03, "intermolecular_energy": -10.21, "internal_energy": -5.51, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 1}
{"is_sidechain": [false], "free_energy": -6.02, "intermolecular_energy": -10.19, "internal_energy": -5.52, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 2}
{"is_sidechain": [false], "free_energy": -5.81, "intermolecular_energy": -9.99, "internal_energy": -5.59, "cluster_size": 3, "cluster_id": 11, "rank_in_cluster": 3}
Being used to Vina, I expect the first pose to be the "best".
Thanks.
Saverio
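A note on the `sort -k4,4r` command used in the post: it compares the fourth field as text, not as a number. That gives the right order here because every value is negative with a single digit before the decimal point, but a value such as -10.5 would likely be placed after -9.x. A minimal Python sketch that sorts the same kind of JSON lines numerically is shown below. The records are abbreviated from the listing, and the -10.5 entry (with `cluster_id` 0) is hypothetical, added only to show the multi-digit case.

```python
import json


def sort_by_free_energy(json_lines):
    """Parse one JSON object per line and sort by numeric free_energy, lowest first."""
    records = [json.loads(line) for line in json_lines if line.strip()]
    return sorted(records, key=lambda rec: rec["free_energy"])


lines = [
    '{"free_energy": -6.22, "cluster_id": 10}',
    '{"free_energy": -8.03, "cluster_id": 1}',
    '{"free_energy": -7.76, "cluster_id": 2}',
    # Hypothetical entry, not from the issue: two digits before the decimal point
    '{"free_energy": -10.5, "cluster_id": 0}',
]

for rec in sort_by_free_energy(lines):
    print(rec["free_energy"], rec["cluster_id"])
```

The lines could come from the same `grep free_energy` output, read from standard input instead of the hard-coded list.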