ligand_binding_affinity values are incorrectly parsed! #94

@OleinikovasV

In the BindingDB documentation, the fields Ligand HET ID in PDB and PDB ID(s) for Ligand-Target Complex are defined as follows:

Ligand HET ID in PDB. If available, the hetero group ID for this ligand in the PDB.
PDB ID(s) for Ligand-Target Complex. If available. Criterion for protein match is 85% sequence identity of protein and exact match for ligand.

This maps the ligand HET ID to every PDB entry in which it appears, sometimes in several different protein complexes, including binding-site mutants. The listed PDB IDs therefore do not mean that the reported assay value refers to that exact PDB complex.
Unfortunately, the plinder pipeline code uses the PDB IDs provided in PDB ID(s) for Ligand-Target Complex to match the binding affinity values, without any further check on the sequence:
https://github.com/plinder-org/plinder/blob/main/src/plinder/data/pipeline/transform.py#L161

As a result, the system_has_binding_affinity and ligand_binding_affinity values are incorrect and should not be used!
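
To illustrate the kind of sequence check that is missing, here is a minimal sketch. It is not plinder's code: the function names, the `pdb_sequences` mapping, the PDB IDs, and the toy sequences are all made up for illustration, and `difflib` is used only as a rough stand-in for a proper sequence alignment. The idea is to keep a BindingDB-listed PDB ID only if one of its chains matches the assayed target sequence above a stricter threshold than the 85% BindingDB uses.

```python
from difflib import SequenceMatcher


def sequence_identity(a: str, b: str) -> float:
    """Rough identity proxy: matched residues / length of the longer sequence.

    Assumption: difflib matching blocks stand in for a real alignment here.
    """
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b))


def filter_pdb_ids(assay_sequence, candidate_pdb_ids, pdb_sequences, min_identity=1.0):
    """Keep only the PDB IDs that have a chain matching the assayed sequence.

    pdb_sequences is a hypothetical mapping: lowercase PDB ID -> list of chain sequences.
    """
    kept = []
    for pdb_id in candidate_pdb_ids:
        chains = pdb_sequences.get(pdb_id.lower(), [])
        if any(sequence_identity(assay_sequence, c) >= min_identity for c in chains):
            kept.append(pdb_id)
    return kept


# Toy data: 2xyz carries a single point mutation (Y -> W) relative to the assayed target.
assay_sequence = "MKTAYIAKQR"
pdb_sequences = {"1abc": ["MKTAYIAKQR"], "2xyz": ["MKTAWIAKQR"]}

# With an 85% threshold both entries pass, so the mutant would inherit the affinity value.
loose = filter_pdb_ids(assay_sequence, ["1ABC", "2XYZ"], pdb_sequences, min_identity=0.85)
# Requiring an exact match keeps only the wild-type entry.
strict = filter_pdb_ids(assay_sequence, ["1ABC", "2XYZ"], pdb_sequences)
print(loose, strict)
```

Whether an exact match or some high threshold is the right criterion is a judgment call for the maintainers; the point is only that the PDB ID list alone is not sufficient.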

Labels: bug (Something isn't working)
