fix(nightly): install numpy for ROS2 bridge embedded Python test#1894
Merged
😎 Merged successfully - details. |
Closes the ROS2 Bridge Examples failure that remained red after #1881 unblocked the cross-check matrix. The other failing nightly job (`CLI Tests (macos-latest)` python-dataflow SIGTERM) is NOT addressed here; see #1882 for the deeper investigation now needed.

The ROS2 fix
============

`typed::tests::test_python_array_code` at `libraries/extensions/ros2-bridge/python/src/typed/mod.rs:58` loads the embedded Python fixture at `libraries/extensions/ros2-bridge/python/test_utils.py:3`, which does `import numpy as np`. The `ros2-bridge` nightly job at `.github/workflows/nightly.yml` runs `pip install pyarrow` before `cargo test -p dora-ros2-bridge-python` but never installs numpy. Result: `ModuleNotFoundError: No module named 'numpy'`.

One-line fix:

    - run: pip install pyarrow
    + run: pip install pyarrow numpy

Why the SIGTERM issue was reverted from this PR
================================================

The initial commit on this branch added a `node.drain()` STOP-poll to `examples/python-dataflow/sender.py`. Code review surfaced that the change did not actually make the nightly test pass: re-running the exact CI invocation (`dora run examples/python-dataflow/dataflow.yml --uv --stop-after 10s`) still produced ExitCode(143) for all three nodes.

Subsequent local investigation on macos-aarch64 confirmed the SIGTERM bug is deeper than the sender pattern:

* `receiver.py` and `transformer.py` already use the canonical `for event in node:` loop with explicit `break` on STOP. Both still report ExitCode(143).
* A trivial sender (10 messages × 100ms = ~1s natural runtime) with `--stop-after 5s` still produces ExitCode(143) on all three nodes 10s after the soft-stop is sent.
* The dora daemon code at `running_dataflow.rs:361-367` does send `NodeEvent::Stop` through each node's subscribe channel before the 10-second SIGTERM grace window.

But adding debug prints to `receiver.py` shows nothing reaches stdout between "starting" and the eventual SIGTERM-triggered flush, suggesting either:

- `node = Node()` is blocking longer than expected on macOS, or
- The daemon's stdout-capture buffers prints until process exit, masking what the receiver was actually doing.

Either way, the fix is on the dora-daemon side and out of scope for a one-line example tweak. #1882 stays open and a follow-up post on that thread documents the investigation trail.

This PR is now scoped narrowly to the ROS2 numpy fix, which IS verifiable: the test fails today with `ModuleNotFoundError`, and the one-line `pip install pyarrow numpy` change resolves it.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
0842178 to
5e73936
Compare
Collaborator
Author
|
/trunk merge |
Summary
Scope reduced after review feedback. Original branch bundled a fix attempt for #1882 (CLI Tests macOS SIGTERM) alongside a ROS2 numpy install. Reviewer correctly pointed out that the sender.py poll fix didn't actually make the nightly test pass — re-running the exact CI invocation still produced ExitCode(143) on all three nodes. Subsequent local investigation found the bug is deeper than the sender's stop polling. Reverted that part; this PR is now scoped narrowly to the ROS2 numpy fix.
The fix (1 line)
`typed::tests::test_python_array_code` at `libraries/extensions/ros2-bridge/python/src/typed/mod.rs:58` loads `libraries/extensions/ros2-bridge/python/test_utils.py:3`, which does `import numpy as np`. The `ros2-bridge` nightly job installs only pyarrow, so the fix changes `pip install pyarrow` to `pip install pyarrow numpy`. That's it.
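As an aside, a failure like this can be surfaced before `cargo test` runs by checking that the fixture's imports resolve. The following is only a minimal sketch and not part of this PR: the helper name `missing_modules` and the idea of running it as a pre-test step are assumptions made for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located.

    Hypothetical helper, not part of the dora repository. It only checks
    that an import would find the module; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of what the embedded Python fixture needs at import time.
REQUIRED = ["pyarrow", "numpy"]
```

Calling `missing_modules(REQUIRED)` in the job environment before the Rust test would return `["numpy"]` for the configuration described above, which points at the missing install more directly than a `ModuleNotFoundError` raised from inside an embedded interpreter.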
What I learned about #1882 (out of scope, posting on the issue)
For anyone following along: the SIGTERM bug is NOT just the sender being fire-and-exit. With current main + a fresh `cargo build --release -p dora-cli`, every variation reproduces:

- `break` on STOP → all three exit 143
- `--stop-after 30s` (20s margin past sender's runtime) → all three still exit 143

The daemon code at `running_dataflow.rs:361-367` does send `NodeEvent::Stop` through each node's subscribe channel before the 10-second SIGTERM grace. But debug prints added to receiver.py after `node = Node()` never appear in the captured stdout, suggesting either `Node()` is blocking on macOS or the daemon's stdout-capture buffers until process exit. Either way the fix is daemon-side, not example-side. #1882 stays open for that.

Test plan
- `python3 -c "import yaml; yaml.safe_load(open('.github/workflows/nightly.yml'))"`: YAML OK
- `git diff --check --cached` clean
- ROS2 Bridge Examples

🤖 Generated with Claude Code
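On the stdout-capture hypothesis from the #1882 notes: one thing worth ruling out is Python's own buffering, since CPython block-buffers `stdout` when it is attached to a pipe rather than a terminal. The sketch below is a standalone illustration and not a diagnosis of the dora daemon. The function name `captured_stdout` is hypothetical, and `os._exit` stands in for any exit path that skips the interpreter's normal flush.

```python
import os
import subprocess
import sys


def captured_stdout(flush):
    """Run a child that prints once and exits abruptly; return what the pipe saw."""
    child = (
        "import os\n"
        f"print('starting', flush={flush})\n"
        "os._exit(0)  # exit without the interpreter's usual buffer flush\n"
    )
    # Drop PYTHONUNBUFFERED so the child uses default (block) buffering on a pipe.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", child],
        stdout=subprocess.PIPE,
        env=env,
        text=True,
        check=True,
    )
    return result.stdout
```

With `flush=False` the captured output is empty, while `flush=True` yields the `starting` line. If debug prints in `receiver.py` were added without `flush=True` (an assumption; the PR does not say), using it or writing to stderr would help separate "the print never ran" from "the print ran but stayed in a buffer".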