Releases · NVIDIA-Digital-Bio/nvMolKit

13 May 23:27

scal444

v0.5.0

849e340

v0.5.0 Latest

0.5.0 - 2026-05-13

Summary

nvMolKit 0.5.0 adds three new GPU-accelerated APIs: Torsion Fingerprint Deviation (TFD), pairwise conformer RMSD, and UFF force field optimization. It also introduces a BatchedForcefield Python API for MMFF and UFF with constraints, custom options, and multi-conformer minimization; a low-memory fused Butina clustering path that avoids the O(N²) distance matrix; a Python autotuning framework for the main APIs; and optional device-side output for ETKDG and forcefield optimization. Blackwell / L-class GPUs (including sm_103/B300) are now supported.

Contributors

Kevin Boyd (@scal444)
Eva Xue (@evasnow1992)
Alireza Moradzadeh (@moradza)
Andrei Volgin (@volgin)

Features

GPU-accelerated Torsion Fingerprint Deviation (TFD) for batch all-pairs conformer comparison (#71)
GPU-accelerated pairwise conformer RMSD matrix computation by @volgin
GPU-accelerated UFF force field, supporting all options that the new BatchedForcefield Python API provides for MMFF (#114)
New BatchedForcefield Python API exposing per-molecule control over forcefield minimization (MMFF or UFF), and through it custom MMFF optimization options (max iterations, energy/gradient tolerances, non-bonded cutoff) (#70)
Distance and position constraints on forcefield optimization (MMFF and UFF) (#26)
Multi-conformer minimization in the BatchedForcefield API
HardwareOptions support for MMFF minimization, matching the ETKDG hardware-targeting API
Device-side output for ETKDG and forcefield optimization, allowing GPU tensors to flow between nvMolKit calls without round-tripping through host memory (#140)
Python autotuning library for the main APIs (nvmolkit.autotune), including ETKDG, forcefield optimization, and substructure search, with configuration serialization (#141)
Low-memory fused Butina clustering that computes Tanimoto similarities on the fly with Triton-backed kernels, avoiding the O(N²) distance matrix and enabling clustering of larger fingerprint datasets on a single GPU (#110)
Support for Blackwell and L-class GPUs, including sm_103 SASS for B300
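
As a rough illustration of the low-memory Butina item above, the following CPU sketch counts neighbors by computing each Tanimoto similarity on the fly, so no N x N matrix is ever stored. It is not nvMolKit's implementation (the notes describe Triton-backed GPU kernels). The names `neighbor_counts_fused` and `dense_matrix_bytes` are hypothetical, fingerprints are modelled as Python integers used as bit vectors, and the 4-byte matrix entry and the `<=` cutoff comparison are assumptions.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit vectors stored as Python ints."""
    union = popcount(a | b)
    if union == 0:
        return 1.0  # convention chosen for this sketch: two empty fingerprints match
    return popcount(a & b) / union


def neighbor_counts_fused(fps, cutoff):
    """Count neighbors within a distance cutoff without storing the matrix.

    Each distance (1 - Tanimoto) is computed, compared, and discarded,
    so extra memory is O(N) instead of O(N^2).
    """
    counts = [0] * len(fps)
    for i in range(len(fps)):
        for j in range(i + 1, len(fps)):
            if 1.0 - tanimoto(fps[i], fps[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


def dense_matrix_bytes(n, bytes_per_value=4):
    """Memory a full N x N matrix would need (4-byte entries assumed)."""
    return n * n * bytes_per_value


fps = [0b1111, 0b1110, 0b0001]
counts = neighbor_counts_fused(fps, cutoff=0.3)
matrix_bytes = dense_matrix_bytes(1_000_000)
```

Under these assumptions, a dense matrix for 1,000,000 fingerprints would need 4 x 10^12 bytes, while the fused loop only keeps one counter per fingerprint.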

Bug Fixes

Fix latent stream-ordering bug in the MMFF/BFGS minimizer that could race with subsequent operations (#172)
Fix int32 overflow in substructure pair indexing for batches where numTargets * numQueries exceeds INT32_MAX, which previously caused out-of-bounds writes in hasSubstructMatch and countSubstructMatches (#169)
Fix shared-memory overflow in the substructure recursive preprocessor caused by an incorrect config setting (#98)
Fix empty result handling in substructure search with uniquify when all matches were already unique (#112)
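
The int32 overflow fixed in #169 can be reproduced with plain arithmetic. The sketch below assumes a row-major flattened pair index (`target * num_queries + query`); the actual index layout inside nvMolKit is not stated in these notes, and `wrap_int32` and `pair_index` are hypothetical helpers that only show what a 32-bit two's-complement wraparound would produce.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's-complement int32."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def pair_index(target: int, query: int, num_queries: int) -> int:
    """Flattened row-major index of a (target, query) pair.

    Python integers are arbitrary precision, so this is the exact value
    a 64-bit index would hold.
    """
    return target * num_queries + query


num_targets, num_queries = 50_000, 50_000
total_pairs = num_targets * num_queries              # 2_500_000_000 > INT32_MAX
last = pair_index(num_targets - 1, num_queries - 1, num_queries)
wrapped = wrap_int32(last)                           # negative: an invalid offset
```

With 50,000 targets and 50,000 queries, the last pair index is 2,499,999,999, which wraps to -1,794,967,297 in 32 bits. An offset like that used for a write lands outside the output buffer, which matches the out-of-bounds symptom described above.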

Miscellaneous

pip wheel distribution pipeline (pip install nvmolkit) with manylinux_2_28 wheels for CPython 3.11-3.14 (#15)
RDKit support range is now 2025.03.1 through 2026.03.1
Validate batchesPerGpu in HardwareOptions so every consumer gets a clean ValueError instead of a cryptic C++ error from the MMFF / ETKDG layer (#103)
Validate neighborlist_max_size in butina() before reaching the GPU (#104)
Validate MMFF atom types up front and report every failing molecule instead of hitting a PRECONDITION assertion mid-batch (#106)
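
The validation items above share one pattern: check every input before any GPU work starts and report all failures in a single error. A minimal sketch of that pattern, using hypothetical helpers (`validate_all`, `check_positive`) that are not part of nvMolKit:

```python
def validate_all(items, check):
    """Run check(item) on every input and raise one error naming all failures.

    check returns None when the item is fine, or a short reason string.
    """
    problems = []
    for index, item in enumerate(items):
        reason = check(item)
        if reason is not None:
            problems.append(f"input {index}: {reason}")
    if problems:
        raise ValueError(
            f"{len(problems)} of {len(items)} inputs failed validation: "
            + "; ".join(problems)
        )


def check_positive(value):
    """Example check: accept only positive numbers."""
    return None if value > 0 else f"expected a positive value, got {value}"


validate_all([1, 2, 3], check_positive)  # passes silently

try:
    validate_all([1, 0, -2], check_positive)
except ValueError as exc:
    message = str(exc)  # names inputs 1 and 2 in one message
```

Collecting every failure first means a caller can fix the whole batch in one pass instead of discovering bad inputs one at a time.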

Assets 34

nvmolkit-0.5.0+rdkit2025.3.6-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:5e55a7c128592c7443c1e3d81e563fb4ced43aec0871245fbc981d694aa84528

68.2 MB 2026-05-20T14:22:14Z
nvmolkit-0.5.0+rdkit2025.3.6-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:50bf236340cebe1a89a877c792599aa880445c825ff2d5f2f3923102cdd1f951

68.2 MB 2026-05-20T14:22:15Z
nvmolkit-0.5.0+rdkit2025.3.6-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:cf87fdbeebe9c52f3618cd6ba369a7cee39a1f38b3ec868f6cad0dd4dd24356a

68.2 MB 2026-05-20T14:22:14Z
nvmolkit-0.5.0+rdkit2025.3.6-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:64f24ff460455708c9e09d5fc023525c84cee845d4da398053df12e553ad4499

68.2 MB 2026-05-20T14:22:15Z
nvmolkit-0.5.0+rdkit2025.9.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:d15bab450dd74a9e96e84e4539dfe5154d267d90a697e900f8be5e71cfbf1e54

68.3 MB 2026-05-20T14:22:36Z
nvmolkit-0.5.0+rdkit2025.9.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:6901ac125da1374d1ab5e29cdf1797269d6179b17240fb86bf75984ad265e1e0

68.3 MB 2026-05-20T14:22:53Z
nvmolkit-0.5.0+rdkit2025.9.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:b65328d5cac1d2233d3a57bf5186f0aefa39fd68206ef8932d5a75364d9e4a95

68.3 MB 2026-05-20T14:22:40Z
nvmolkit-0.5.0+rdkit2025.9.1-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:275a498a94440b5146a849631683086b6c53a32d2543a06381566b5b83768194

68.3 MB 2026-05-20T14:22:53Z
nvmolkit-0.5.0+rdkit2025.9.2-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:5b8b2c427e0139326d0eea81f9872abc60c3de02884d43b84f980eb582a5c453

68.3 MB 2026-05-20T14:22:20Z
nvmolkit-0.5.0+rdkit2025.9.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl

sha256:d645b4cec69c2a0a173a669fac30c948b8c9beaa7fae90070fbf2dac602cd5bb

68.3 MB 2026-05-20T14:22:34Z
Source code (zip)

2026-05-13T23:23:24Z
Source code (tar.gz)

2026-05-13T23:23:24Z

20 Feb 19:55

evasnow1992

v0.4.0

bbccd30

Release v0.4.0

0.4.0 - 2026-02-23

Summary

nvMolKit 0.4.0 adds GPU-accelerated substructure searching, optional stream control across Python APIs, and enhancements to Butina clustering.

Contributors

Kevin Boyd (@scal444)
Eva Xue (@evasnow1992)

Features

GPU-accelerated substructure search with hasSubstructMatch, countSubstructMatches, and getSubstructMatches. Supports batch queries against batch targets with SMARTS-based query molecules.
Optional stream parameter added to fingerprint generation, similarity, and Butina clustering APIs, enabling explicit CUDA stream control
Butina clustering now supports optional centroid reporting via the return_centroids parameter (#82)
Butina clustering performance improved by replacing CPU loops with CUDA Graph conditional nodes (#72)
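
For readers unfamiliar with what centroid reporting adds, here is a small CPU reference sketch of Butina clustering over a distance matrix. It is only an illustration: the function `butina_reference` is hypothetical, the return shape (a label list plus a centroid list) is an assumption and may not match what nvMolKit returns, and tie-breaking details differ between implementations.

```python
def butina_reference(dist, cutoff, return_centroids=False):
    """Greedy Butina clustering on a square distance matrix (list of lists).

    Items are visited in order of decreasing neighbor count. Each unassigned
    item becomes a centroid and claims its still-unassigned neighbors.
    """
    n = len(dist)
    neighbors = [
        [j for j in range(n) if j != i and dist[i][j] <= cutoff]
        for i in range(n)
    ]
    # Most neighbors first; ties broken by lower index (a choice made here).
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))

    labels = [-1] * n
    centroids = []
    for i in order:
        if labels[i] != -1:
            continue  # already claimed by an earlier centroid
        cluster_id = len(centroids)
        centroids.append(i)
        labels[i] = cluster_id
        for j in neighbors[i]:
            if labels[j] == -1:
                labels[j] = cluster_id
    return (labels, centroids) if return_centroids else labels


dist = [
    [0.00, 0.10, 0.20, 0.90],
    [0.10, 0.00, 0.15, 0.90],
    [0.20, 0.15, 0.00, 0.90],
    [0.90, 0.90, 0.90, 0.00],
]
labels, centroids = butina_reference(dist, cutoff=0.25, return_centroids=True)
```

In this toy input, items 0 to 2 form one cluster with item 0 as its centroid, and item 3 is a singleton that is its own centroid.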

Bug Fixes

Fix data races when torch operations immediately followed nvMolKit calls on the default stream (Issue #84). Operations now correctly use the current stream or an explicit stream parameter (#36).
Fix setup.py compatibility on some Python versions and rework CUDA target detection (#68)


11 Dec 20:53

scal444

v0.3.0

2adff1f

Release v0.3.0

0.3.0 - 2025-12-12

Summary

nvMolKit 0.3.0 adds Butina clustering support, improves the performance of MMFF relaxation and conformer generation, and increases compatibility with libraries and compilers.

Contributors

Kevin Boyd (@scal444)
Eva Xue (@evasnow1992)
Xuangui Huang (@stslxg-nv)

Features

Butina clustering API enabled, using distance matrix input. On an H200 GPU, speedups of 400-1000x can be achieved on datasets up to 60k molecules
Improvements to BFGS minimizer. Up to 5x speedup compared to nvMolKit v0.2 on batches of small molecules (<20 atoms), with ~10-20% speedup in the general case. Applies to both MMFF relaxation and conformer generation.
Conda-forge releases now support RDKit versions 2024.9.6 to 2025.9.3

Bug Fixes

Fixed a bug where synchronizations on the wrong stream could lead to data races in tests (Issue #28)
Fixed several areas where a memcpy could go out of scope before completing (Issue #28, Issue #29)
Fixed a bug where ETKDG would exit early with small CPU counts due to an incorrect identification of resource mis-configuration (Issue #31)

Miscellaneous

(C++) Added support for CUB/CCCL > v2.8
(C++) Added support for externally specified CCCL
(C++) Added support for CUDA 13.0


10 Oct 12:12

scal444

v0.2.0

3757bcb

nvMolKit v0.2.0

0.2.0 - 2025-10-24

Summary

nvMolKit 0.2.0 comes with significant usability and feature-completeness improvements to existing functionality. It is also the first release available on conda-forge.

Contributors

Kevin Boyd (@scal444)
Eva Xue (@evasnow1992)
Ignacio Pickering (@IgnacioJPickering)

Features

Add memory-segmented cross-similarity code, enabling larger datasets on systems with limited GPU memory (#13)
Support conformer deduplication in ETKDG conformer generation (#14)
Allow molecules > 256 atoms in conformer generation and MMFF optimization (#16)
Enable all combinations of (ET)(K)(DG) in conformer generator (#17)
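
The memory-segmented cross-similarity item (#13) can be pictured as computing the similarity matrix one tile at a time, so only a slice is resident at once. The sketch below is a CPU illustration under stated assumptions: segmentation is done along query rows, fingerprints are modelled as frozensets of on-bit indices, and `cross_similarity_tiles` is a hypothetical name, not an nvMolKit function.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def cross_similarity_tiles(queries, targets, rows_per_tile):
    """Yield (row_offset, tile) pairs covering the query-by-target matrix.

    Only rows_per_tile rows exist in memory at a time; the caller reduces
    or stores each tile before the next one is produced.
    """
    if rows_per_tile < 1:
        raise ValueError("rows_per_tile must be at least 1")
    for start in range(0, len(queries), rows_per_tile):
        chunk = queries[start:start + rows_per_tile]
        yield start, [[tanimoto(q, t) for t in targets] for q in chunk]


queries = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({4})]
targets = [frozenset({0, 1}), frozenset({2, 3, 4})]

# Reduce each tile to a per-query best score instead of keeping the matrix.
best = []
for start, tile in cross_similarity_tiles(queries, targets, rows_per_tile=2):
    for offset, row in enumerate(tile):
        best.append((start + offset, max(row)))
```

The tile size trades peak memory against the number of passes; the per-tile reduction shown here is just one way a caller might consume the results.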

Bug Fixes

Fix compilation error on C++ build with target=native on Hopper architecture GPUs (#6)
Fix lack of device-set cleanup in multi-GPU code (#8)
Fix bug in fingerprint bool->bitfield packing/unpacking code (#11)
Fix integer overflow leading to incorrect allocations in similarity calculation code (#20)
Fix crash in most multithreaded APIs when an exception is thrown inside an OpenMP loop. Exceptions are now propagated to Python correctly (#18)
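
For context on the bool-to-bitfield fix (#11), packing stores one fingerprint bit per bit of an integer word rather than one per boolean. The sketch below shows a round trip in plain Python; the 32-bit word size and least-significant-bit-first ordering are assumptions for illustration, and `pack_bits` / `unpack_bits` are hypothetical names rather than nvMolKit functions.

```python
def pack_bits(bits, word_size=32):
    """Pack a sequence of booleans into a list of word_size-bit integers.

    Bit i goes to word i // word_size at position i % word_size (LSB first).
    """
    words = [0] * ((len(bits) + word_size - 1) // word_size)
    for i, bit in enumerate(bits):
        if bit:
            words[i // word_size] |= 1 << (i % word_size)
    return words


def unpack_bits(words, n_bits, word_size=32):
    """Inverse of pack_bits: recover the first n_bits booleans."""
    return [
        bool((words[i // word_size] >> (i % word_size)) & 1)
        for i in range(n_bits)
    ]


# 40 bits spans two 32-bit words, exercising the word boundary.
bits = [i % 3 == 0 for i in range(40)]
words = pack_bits(bits)
restored = unpack_bits(words, len(bits))
```

A round-trip check across a word boundary, as in the last three lines, is the kind of test that tends to expose off-by-one errors in the index arithmetic.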

Miscellaneous

Removed unsupported Bulk Similarity APIs (#12)


09 Sep 18:33

scal444

v0.1.0

063115a

v0.1.0

Initial release of nvMolKit
