Add Python build/pytest CI workflow on colossus runner#167
Conversation
Mirrors the cpp-build-test workflow's container + runner setup, then builds the Python package via pip + scikit-build and runs the pytest suite (excluding the `long` substructure integration tests, matching pip-build.yml's smoke job). Single matrix entry pinned to py3.13 / rdkit 2025.9.2, the latest CUDA 12.x pair tracked in conda-build.yml.
|
| Filename | Overview |
|---|---|
| .github/workflows/python-build-test.yml | New CI workflow that source-builds nvmolkit and runs pytest on a colossus V100 runner; structure mirrors the existing C++ workflow with minor inconsistencies in shell flags. |
Reviews (3): Last reviewed commit: "Drop CI matrix back to py3.12 for V100 r..." | Re-trigger Greptile
| python -m pip install \ | ||
| "scikit-build>=0.18" \ | ||
| "numpy>=1.23" \ | ||
| "torch>=2.1" \ | ||
| triton \ | ||
| pandas \ | ||
| psutil \ | ||
| optuna |
There was a problem hiding this comment.
triton is specified without a version pin, but Triton releases are tightly coupled to specific PyTorch versions. With torch>=2.1 as a very wide lower bound, pip could resolve a Triton version that is incompatible with whichever PyTorch version is actually installed, leading to import failures at test time. Pinning Triton alongside torch (or at least bounding it with something like triton>=2.1,<4) mirrors how the PyTorch project distributes matching Triton builds.
| python -m pip install \ | |
| "scikit-build>=0.18" \ | |
| "numpy>=1.23" \ | |
| "torch>=2.1" \ | |
| triton \ | |
| pandas \ | |
| psutil \ | |
| optuna | |
| python -m pip install \ | |
| "scikit-build>=0.18" \ | |
| "numpy>=1.23" \ | |
| "torch>=2.1" \ | |
| "triton>=2.1" \ | |
| pandas \ | |
| psutil \ | |
| optuna |
There was a problem hiding this comment.
@greptileai isn't that not true because whatever torch we bring in will pin triton?
There was a problem hiding this comment.
You're right. PyTorch declares triton as a pinned dependency in its package metadata (e.g., torch==2.6.0 requires triton==3.2.0), so pip's dependency resolver will install the exact triton version that the resolved torch requires. The explicit triton entry in your requirements just resolves to whatever version satisfies torch's constraint — there's no incompatibility risk. My concern was unfounded.
Tip: You can give me more context with [.greptile/rules.md](https://www.greptile.com/docs/code-review/custom-standards) so I get this right next time.
PyPI torch wheels from 2.5+ dropped sm_70 from their default arch list, which produces cudaErrorNoKernelImageForDevice on the colossus V100.
py3.13 torch wheels start at 2.5, which dropped sm_70 from default arches. Use py3.12 + rdkit 2025.3.6 instead.
The default PyPI torch wheel ships an older bundled libcudart that predates cudaGraphAddNode (added in CUDA 12.4), and the cu128/cu129 channels dropped sm_70.
|
Greptile encountered an error while reviewing this PR. Please reach out to support@greptile.com for assistance. |
Builds the Python package and runs pytest on a single matrix entry (py3.13 / rdkit 2025.9.2). Source build for now; in principle the conda artifact from conda-build.yml could be reused, but cross-workflow artifact passing is awkward.