This repository contains the official implementation of "Multimodal Drug Recommendation with Quantum Chemical Molecular Representations".
QUARK is a multimodal medication recommendation framework that integrates quantum-chemical molecular representations with longitudinal patient EHR data to generate safe and clinically relevant drug combinations.
Unlike prior approaches that rely only on molecular structure or predefined interaction graphs, QUARK encodes electron-level physicochemical properties using ELF (Electron Localization Function) and ESP (Electrostatic Potential) maps derived from density functional theory (DFT). These quantum-informed drug embeddings are fused with patient representations through a cross-attention mechanism, enabling the model to jointly capture:
- molecular reactivity related to pharmacodynamic DDIs
- patient-specific clinical context
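The fusion step described above can be illustrated with a minimal, dependency-free sketch of single-head scaled dot-product cross-attention, in which a patient state acts as the query and drug embeddings act as keys and values. This is an illustration under stated assumptions, not the repository's implementation: the helper name `cross_attention`, the toy vectors, and the omission of learned projections and multiple heads are all simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product cross-attention (hypothetical helper).

    query  : patient state vector of length d
    keys   : one vector of length d per drug
    values : one vector per drug (all of the same length)
    Returns (fused_vector, attention_weights).
    """
    d = len(query)
    # Compatibility score between the patient state and each drug.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Weighted sum of the drug value vectors.
    fused = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return fused, weights


# Toy example: one patient state attending over three drug embeddings.
patient_state = [1.0, 0.0]
drug_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
fused, weights = cross_attention(patient_state, drug_embeddings, drug_embeddings)
print(weights)
```

In this toy case the first drug embedding is most aligned with the patient state, so it receives the largest attention weight.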
Highlights
- Quantum-informed Drug Representation: Utilizes DFT-derived ELF and ESP molecular maps to capture electron density, polarity, and intermolecular interaction patterns beyond atomic connectivity.
- Cross-modal Molecular Fusion: Applies multi-head cross-attention between ELF and ESP embeddings to model complementary physicochemical properties.
- Patient–Drug Context Matching: Computes drug relevance dynamically via compatibility between longitudinal patient states and quantum drug embeddings.
- Substructure-aware Pharmacological Modeling: Extends the SafeDrug bipartite formulation with patient-conditioned substructure importance estimation.
- Dual-level Evaluation Protocol: Evaluates DDI safety at the CID level and therapeutic validity at the ATC3 level.
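As an illustration of the CID-level safety check, the sketch below computes a DDI rate: the fraction of co-recommended drug pairs that are flagged as interacting in a binary adjacency matrix. It follows the metric commonly used in SafeDrug-style evaluation, but the function name `ddi_rate`, its signature, and the toy data are assumptions and may differ from the repository's `ddi_rate_score`.

```python
from itertools import combinations


def ddi_rate(recommendations, ddi_adj):
    """Fraction of co-recommended drug pairs that are known DDIs.

    recommendations : list of drug-index lists, one per visit
    ddi_adj         : square 0/1 matrix; ddi_adj[i][j] == 1 marks a DDI
    """
    total_pairs = 0
    ddi_pairs = 0
    for drugs in recommendations:
        # Each unordered pair within a visit is counted once.
        for i, j in combinations(sorted(set(drugs)), 2):
            total_pairs += 1
            if ddi_adj[i][j] == 1 or ddi_adj[j][i] == 1:
                ddi_pairs += 1
    return ddi_pairs / total_pairs if total_pairs else 0.0


# Toy data: four drugs, of which drugs 0 and 1 interact.
adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
visits = [[0, 1, 2], [2, 3]]
rate = ddi_rate(visits, adj)
print(rate)  # 1 interacting pair out of 4 co-recommended pairs -> 0.25
```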
We provide:
- the full implementation of the QUARK architecture
- preprocessing and training pipelines for EHR data and quantum molecular representations
- evaluation scripts following the dual-level safety–effectiveness protocol
The overall data preprocessing and DDI construction pipeline follows our previous implementation in MMM for reproducibility and fair comparison.
Experiments are conducted on MIMIC-III, containing:
- 5,413 patients
- 14,057 visits
- 250 medications
- 4,918 DDI pairs
We adopt the same cohort construction and medication filtering strategy as in MMM.
Quantum molecular images are generated once per drug and reused during both training and inference.
Workflow:
- SMILES → 3D geometry: Avogadro
- DFT calculation: ORCA (B3LYP / def2-SVP)
- ELF & ESP map extraction: Multiwfn
This preprocessing step is performed offline and does not affect inference-time latency.
The molecular preprocessing scripts are based on the MMM pipeline and extended to support dual-modality (ELF + ESP) inputs.
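To make the DFT step concrete, the following sketch writes a minimal ORCA input file for a B3LYP / def2-SVP calculation from a list of atoms. The helper name `write_orca_input`, the neutral singlet default (charge 0, multiplicity 1), and the water geometry are illustrative assumptions; the repository's actual ORCA settings and the subsequent Multiwfn map extraction are not shown here.

```python
from pathlib import Path


def write_orca_input(atoms, path, charge=0, multiplicity=1):
    """Write a minimal ORCA input file (hypothetical helper).

    atoms : list of (element, x, y, z) tuples, coordinates in Angstrom
    """
    lines = ["! B3LYP def2-SVP", f"* xyz {charge} {multiplicity}"]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("*")  # closes the coordinate block
    path = Path(path)
    path.write_text("\n".join(lines) + "\n")
    return path


# Illustrative geometry (water); real geometries come from the Avogadro step.
water = [
    ("O", 0.0, 0.0, 0.0),
    ("H", 0.757, 0.586, 0.0),
    ("H", -0.757, 0.586, 0.0),
]
inp = write_orca_input(water, "water.inp")
print(inp.read_text())
```

A file like this would typically be passed to the ORCA executable, for example `orca water.inp > water.out`.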
We use modality-specific pretrained image encoders:
- ELF encoder: EfficientNet-V2-L
- ESP encoder: ResNet-18
All experiments were conducted with:
- Python 3.9
- PyTorch 2.3.0
- CUDA 11.8
To train the model yourself, install RDKit and several supporting packages. To avoid version conflicts among these packages, follow the installation steps below in the exact order given.
- First, create and activate a new conda environment.
conda create -c conda-forge -n new_env python=3.9
conda activate new_env
- Install RDKit
conda install -c conda-forge rdkit
- If RDKit does not work after the above installation, try:
pip install rdkit-pypi
- Install numpy, pandas, and scipy with specific versions to avoid conflicts:
pip install numpy==1.22.4 pandas==1.3.0 scipy==1.13.1
- Install PyTorch 2.3.0 and torchvision 0.18.0, both built for CUDA 11.8:
pip install torch==2.3.0+cu118 torchvision==0.18.0+cu118 torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
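After installation, a quick sanity check can confirm that the pinned versions are the ones actually present in the environment. The sketch below uses only the standard library; the helper name `check_versions` and the choice to ignore local version tags such as `+cu118` are assumptions made for illustration.

```python
from importlib import metadata

# Version pins taken from the installation steps above.
EXPECTED = {
    "numpy": "1.22.4",
    "pandas": "1.3.0",
    "scipy": "1.13.1",
    "torch": "2.3.0",
    "torchvision": "0.18.0",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local version tags such as "+cu118" are ignored in the comparison.
        ok = installed is not None and installed.split("+")[0] == pin
        report[name] = (installed, ok)
    return report


for name, (installed, ok) in check_versions().items():
    print(f"{name:12s} installed={installed} ok={ok}")
```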
Data paths and hyperparameters (such as the learning rate and target_ddi) are configured in main.py. Run the preprocessing script processing.py first; it automatically generates the required files in the output folder. Then update the paths in the code so that they point to these generated files.
- Dataset Configuration
In main.py the paths for the following variables must be updated to correspond to the .pkl files generated within the output folder:
data_path = "[records_final.pkl]"
voc_path = "[voc_final.pkl]"
ddi_adj_path = "[ddi_A_final.pkl]"
ddi_mask_path = "[ddi_mask_H.pkl]"
molecule_path = "[cidtoSMILES.pkl]"
ddi_rate = ddi_rate_score("[ddi_A_final.pkl]")
The generated .pkl files are typically located in the data/output folder.
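Before launching training, it can help to verify that every configured file exists. The following sketch maps each variable to its expected file and reports what is missing; the helper name `resolve_paths` and the default `data/output` directory are assumptions, while the file names are those listed above.

```python
from pathlib import Path

# File names as listed above; the output directory is an assumption.
EXPECTED_FILES = {
    "data_path": "records_final.pkl",
    "voc_path": "voc_final.pkl",
    "ddi_adj_path": "ddi_A_final.pkl",
    "ddi_mask_path": "ddi_mask_H.pkl",
    "molecule_path": "cidtoSMILES.pkl",
}


def resolve_paths(output_dir="data/output"):
    """Map each config variable to its path and list the missing files."""
    root = Path(output_dir)
    paths = {var: root / fname for var, fname in EXPECTED_FILES.items()}
    missing = [var for var, p in paths.items() if not p.is_file()]
    return paths, missing


paths, missing = resolve_paths()
if missing:
    print("Missing preprocessing outputs:", ", ".join(missing))
```

If any variable is reported as missing, rerun processing.py or correct the output directory before starting main.py.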
- External data files required for preprocessing
The following files are obtained from external sources and must be prepared in advance:
| Filename | Description |
|---|---|
| ndc2RXCUI.txt | NDC-to-RxCUI mapping file, adapted from ndc2rxnorm_mapping.csv in the GAMENet repository. |
| drug-DDI.csv | Contains drug–drug interaction (DDI) information indexed by CID. Download from Google Drive. |
| RXCUI2atc4.csv | RxCUI-to-ATC4 mapping file, adapted from ndc2atc_level4.csv in the GAMENet repository. |
Hyperparameters are defined in main.py through the argparse module, so each one has a default value that can be overridden via command-line arguments:
hyperparameters = {
"Test": [True or False],
"model_name": ["model_identifier"],
"resume_path": ["path/to/checkpoint"],
"lr": [learning_rate],
"target_ddi": [target_ddi],
"kp": [coefficient_of_P_signal],
"dim": [dimension_size],
"cuda": [cuda_device_index]
}
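A parser for the options above might look like the following sketch. The argument names come from the list above, and `--Test` is modeled as a boolean flag because it is passed without a value in the run command; all default values here are placeholders chosen for illustration and are not the defaults used in main.py.

```python
import argparse


def build_parser():
    """Argument parser mirroring the options above; defaults are placeholders."""
    parser = argparse.ArgumentParser(description="QUARK training and evaluation (sketch)")
    parser.add_argument("--Test", action="store_true", help="evaluate instead of train")
    parser.add_argument("--model_name", type=str, default="QUARK", help="model identifier")
    parser.add_argument("--resume_path", type=str, default=None, help="checkpoint to load")
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate (assumed default)")
    parser.add_argument("--target_ddi", type=float, default=0.06, help="target DDI rate (assumed default)")
    parser.add_argument("--kp", type=float, default=0.05, help="coefficient of the P signal (assumed default)")
    parser.add_argument("--dim", type=int, default=64, help="embedding dimension (assumed default)")
    parser.add_argument("--cuda", type=int, default=0, help="CUDA device index")
    return parser


# Command-line values override the defaults.
args = build_parser().parse_args(
    ["--lr", "5e-4", "--Test", "--resume_path", "path/to/checkpoint"]
)
print(args.lr, args.Test, args.resume_path)
```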
- Run the Code
To train the model:
python main.py
To evaluate a saved checkpoint:
python main.py --Test --resume_path [best_epoch_path]
If you find this code useful for your work, please cite the following and consider starring this repository:
@inproceedings{kim2026quark,
title = {Multimodal Drug Recommendation with Quantum Chemical Molecular Representations},
author = {Yujin Kim and Seoeun Park and Chongmyung Kwon and Charmgil Hong},
booktitle = {Proceedings of the 31st International Conference on Database Systems for Advanced Applications (DASFAA 2026)},
year = {2026},
publisher = {Springer},
note = {To appear}
}
@inproceedings{yang2021safedrug,
title = {SafeDrug: Dual Molecular Graph Encoders for Safe Drug Recommendations},
author = {Yang, Chaoqi and Xiao, Cao and Ma, Fenglong and Glass, Lucas and Sun, Jimeng},
booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, {IJCAI} 2021},
year = {2021}
}
@inproceedings{kwon2025mmm,
title={{MMM}: Quantum-Chemical Molecular Representation Learning for Personalized Drug Recommendation},
author={Chongmyung Kwon and Yujin Kim and Seoeun Park and Yunji Lee and Charmgil Hong},
booktitle={PRedictive Intelligence in MEdicine},
year={2025},
organization={Springer}
}
