Unable to standardize some PubChem molecules #39

@VladislavChernykh

Hello,

I was using the molvs Standardizer on PubChem molecules and found several molecules that cannot be standardized:

  1. SMILES: CC(S(=O)CC1=CC=C(C=C1)C(S(=O)CC2=CC=C(C=C2)C(S(=O)CC3=CC=C(C=C3)C(S(=O)C4=CC=C(C=C4)Br)S(=O)C5=CC=C(C=C5)Br)S(=O)CC6=CC=C(C=C6)C(S(=O)C7=CC=C(C=C7)Br)S(=O)C8=CC=C(C=C8)Br)S(=O)CC9=CC=C(C=C9)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br

Link: https://pubchem.ncbi.nlm.nih.gov/compound/59827358

  1. SMILES: CC1=CC=C(C=C1)C(S(=O)CC2=CC=C(C=C2)C(S(=O)CC3=CC=C(C=C3)C(S(=O)CC4=CC=C(C=C4)C(S(=O)C5=CC=C(C=C5)Br)S(=O)C6=CC=C(C=C6)Br)S(=O)CC7=CC=C(C=C7)C(S(=O)C8=CC=C(C=C8)Br)S(=O)C9=CC=C(C=C9)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br

Link: https://pubchem.ncbi.nlm.nih.gov/compound/59827349

Code to reproduce:

from rdkit import Chem
from molvs import Standardizer

smiles = "CC1=CC=C(C=C1)C(S(=O)CC2=CC=C(C=C2)C(S(=O)CC3=CC=C(C=C3)C(S(=O)CC4=CC=C(C=C4)C(S(=O)C5=CC=C(C=C5)Br)S(=O)C6=CC=C(C=C6)Br)S(=O)CC7=CC=C(C=C7)C(S(=O)C8=CC=C(C=C8)Br)S(=O)C9=CC=C(C=C9)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br)S(=O)CC1=CC=C(C=C1)C(S(=O)C1=CC=C(C=C1)Br)S(=O)C1=CC=C(C=C1)Br"
mol = Chem.MolFromSmiles(smiles)
res = Standardizer().standardize(mol)

The call appears to enter an infinite loop in the function _apply_transform() (normalize.py): after 10 minutes of running, the transformation had still not returned a result.
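For anyone who wants to confirm the hang without blocking a whole batch, a time limit can be put around the call. This is only a minimal sketch, not part of molvs: `call_with_timeout`, `StandardizationTimeout`, and `try_standardize` are hypothetical names introduced here. It relies on `SIGALRM`, so it assumes a Unix system and the main thread, and it assumes control returns to the Python interpreter regularly during the call (the handler cannot interrupt a single long-running C++ call).

```python
import signal


class StandardizationTimeout(Exception):
    """Raised when the wrapped call exceeds its time budget."""


def call_with_timeout(func, *args, seconds=5.0):
    """Run func(*args) and raise StandardizationTimeout after `seconds`.

    Unix-only sketch: uses SIGALRM and must be called from the main thread.
    """
    def _on_alarm(signum, frame):
        raise StandardizationTimeout(f"no result after {seconds} s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # start the countdown
    try:
        return func(*args)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)    # restore the old handler


def try_standardize(smiles, seconds=5.0):
    """Return the standardized SMILES, or None if the call times out."""
    # Imported lazily so call_with_timeout stays dependency-free.
    from rdkit import Chem
    from molvs import Standardizer

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("RDKit could not parse the SMILES")
    try:
        result = call_with_timeout(Standardizer().standardize, mol, seconds=seconds)
    except StandardizationTimeout:
        return None
    return Chem.MolToSmiles(result)
```

Calling `try_standardize(smiles, seconds=60)` with the SMILES above should then return `None` instead of hanging if the standardization really does not finish within a minute.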

Thanks,
Vladislav
