fix(nightly): install numpy for ROS2 bridge embedded Python test #1894

Merged
trunk-io[bot] merged 1 commit into main from fix/nightly-sender-stop-and-ros2-numpy
May 20, 2026

Conversation

Collaborator

@heyong4725 heyong4725 commented May 20, 2026

Summary

Scope reduced after review feedback. Original branch bundled a fix attempt for #1882 (CLI Tests macOS SIGTERM) alongside a ROS2 numpy install. Reviewer correctly pointed out that the sender.py poll fix didn't actually make the nightly test pass — re-running the exact CI invocation still produced ExitCode(143) on all three nodes. Subsequent local investigation found the bug is deeper than the sender's stop polling. Reverted that part; this PR is now scoped narrowly to the ROS2 numpy fix.

The fix (1 line)

typed::tests::test_python_array_code at libraries/extensions/ros2-bridge/python/src/typed/mod.rs:58 loads libraries/extensions/ros2-bridge/python/test_utils.py:3 which does import numpy as np. The ros2-bridge nightly job installs only pyarrow:

-        run: pip install pyarrow
+        run: pip install pyarrow numpy

That's it.
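A preflight check along these lines could surface a missing dependency before `cargo test` runs. This is only a sketch: `missing_modules` and `preflight_message` are hypothetical helpers, not part of the dora repository, and the module list simply mirrors the install line above.

```python
import importlib.util

# Modules the nightly job is expected to provide (taken from the pip install line).
REQUIRED = ["pyarrow", "numpy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def preflight_message(names):
    """Build a one-line report that a CI step could print before running the tests."""
    missing = missing_modules(names)
    if missing:
        return "missing Python modules: " + ", ".join(missing)
    return "all required modules are importable"
```

Calling `preflight_message(REQUIRED)` in the job before the test step would name the missing package directly instead of leaving it to a `ModuleNotFoundError` inside the embedded test.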

What I learned about #1882 (out of scope, posting on the issue)

For anyone following along — the SIGTERM bug is NOT just the sender being fire-and-exit. With current main + a fresh cargo build --release -p dora-cli, every variation reproduces:

  • Original sender (range(100) × sleep(0.1)) + receiver/transformer with proper break on STOP → all three exit 143
  • Sender trimmed to range(10) (~1s natural runtime, exits well before --stop-after) → all three still exit 143
  • --stop-after 30s (20s margin past sender's runtime) → all three still exit 143
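For readers unfamiliar with the number: 143 is 128 + 15, and 15 is SIGTERM. The sketch below decodes such a code, assuming the reported value follows the common shell convention of 128 plus the signal number. `describe_exit_code` is a hypothetical helper, not a dora API.

```python
import signal


def describe_exit_code(code):
    """Translate an exit code into a short description, assuming the 128+N signal convention."""
    if code > 128:
        try:
            return f"terminated by {signal.Signals(code - 128).name}"
        except ValueError:
            # Not a known signal number on this platform; fall through to the plain status.
            pass
    return f"exited with status {code}"


print(describe_exit_code(143))
```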

The daemon code at running_dataflow.rs:361-367 does send NodeEvent::Stop through each node's subscribe channel before the 10-second SIGTERM grace. But debug prints added to receiver.py after node = Node() never appear in the captured stdout, suggesting either Node() is blocking on macOS or the daemon's stdout-capture buffers until process exit. Either way the fix is daemon-side, not example-side. #1882 stays open for that.
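One way to separate the two hypotheses is to rule out Python-side buffering first. The sketch below shows that a child process writing to a pipe can lose an unflushed `print` when it ends without a normal interpreter shutdown, while `flush=True` gets the line out immediately. It says nothing about what the dora daemon's capture does; `run_child` and the two snippets are illustrative names only.

```python
import os
import subprocess
import sys


def run_child(code):
    """Run a Python snippet with stdout attached to a pipe and return the captured bytes."""
    env = dict(os.environ)
    # Make sure the child uses default (block) buffering for a piped stdout.
    env.pop("PYTHONUNBUFFERED", None)
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        env=env,
        check=False,
    )
    return result.stdout


# os._exit() ends the process without flushing Python's stdout buffer,
# standing in for a process that is killed before it can exit normally.
BUFFERED = "import os; print('after Node()'); os._exit(0)"
FLUSHED = "import os; print('after Node()', flush=True); os._exit(0)"

if __name__ == "__main__":
    print("buffered:", run_child(BUFFERED))
    print("flushed: ", run_child(FLUSHED))
```

If debug prints in `receiver.py` use `flush=True` (or write to stderr) and still never appear, Python-side buffering is unlikely to be the explanation.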

Test plan

  • python3 -c "import yaml; yaml.safe_load(open('.github/workflows/nightly.yml'))" — YAML OK
  • git diff --check --cached clean
  • Next nightly green on ROS2 Bridge Examples

🤖 Generated with Claude Code

Contributor

trunk-io Bot commented May 20, 2026

😎 Merged successfully - details.

Closes the ROS2 Bridge Examples failure that remained red after #1881
unblocked the cross-check matrix. The other failing nightly job
(`CLI Tests (macos-latest)` python-dataflow SIGTERM) is NOT addressed
here — see #1882 for the deeper investigation now needed.

The ROS2 fix
============

`typed::tests::test_python_array_code` at
`libraries/extensions/ros2-bridge/python/src/typed/mod.rs:58`
loads the embedded Python fixture at
`libraries/extensions/ros2-bridge/python/test_utils.py:3`, which
does `import numpy as np`. The `ros2-bridge` nightly job at
`.github/workflows/nightly.yml` installs `pip install pyarrow`
before `cargo test -p dora-ros2-bridge-python` but never installs
numpy. Result: `ModuleNotFoundError: No module named 'numpy'`.

One-line fix:

    -        run: pip install pyarrow
    +        run: pip install pyarrow numpy

Why the SIGTERM issue was reverted from this PR
================================================

The initial commit on this branch added a `node.drain()` STOP-poll
to `examples/python-dataflow/sender.py`. Code review surfaced that
the change did not actually make the nightly test pass —
re-running the exact CI invocation
(`dora run examples/python-dataflow/dataflow.yml --uv --stop-after
10s`) still produced ExitCode(143) for all three nodes.

Subsequent local investigation on macos-aarch64 confirmed the
SIGTERM bug is deeper than the sender pattern:

* `receiver.py` and `transformer.py` already use the canonical
  `for event in node:` loop with explicit `break` on STOP. Both
  still report ExitCode(143).
* A trivial sender (10 messages × 100ms = ~1s natural runtime)
  with `--stop-after 5s` still produces ExitCode(143) on all
  three nodes 10s after the soft-stop is sent.
* The dora daemon code at `running_dataflow.rs:361-367` does
  send `NodeEvent::Stop` through each node's subscribe channel
  before the 10-second SIGTERM grace window. But adding debug
  prints to `receiver.py` shows nothing reaches stdout between
  "starting" and the eventual SIGTERM-triggered flush, suggesting
  either:
    - `node = Node()` is blocking longer than expected on macOS, or
    - The daemon's stdout-capture buffers prints until process
      exit, masking what the receiver was actually doing.
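For reference, the loop shape described in the first bullet can be sketched as follows. To keep it runnable without dora installed, it iterates over a plain list of dicts. The `"type"` and `"id"` keys are assumptions about the event shape, and `consume` is a hypothetical stand-in for the body of `receiver.py`, not its actual code.

```python
def consume(events):
    """Handle INPUT events until a STOP event arrives, then leave the loop."""
    handled = []
    for event in events:
        if event["type"] == "STOP":
            # Explicit break so the node can return and exit on its own.
            break
        if event["type"] == "INPUT":
            handled.append(event["id"])
    return handled


# Stand-in for `for event in node:`; a real node would yield events from the daemon.
fake_events = [
    {"type": "INPUT", "id": "message"},
    {"type": "STOP"},
    {"type": "INPUT", "id": "late"},
]
print(consume(fake_events))
```

A node written this way only exits cleanly if the STOP event is actually delivered to the iterator, which is the part the investigation could not confirm.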

Either way, the fix is on the dora-daemon side and out of scope
for a one-line example tweak. #1882 stays open and a follow-up
post on that thread documents the investigation trail.

This PR is now scoped narrowly to the ROS2 numpy fix, which IS
verifiable: the test fails today with `ModuleNotFoundError`, and the
one-line `pip install pyarrow numpy` change resolves it.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@heyong4725 heyong4725 force-pushed the fix/nightly-sender-stop-and-ros2-numpy branch from 0842178 to 5e73936 May 20, 2026 15:02
@heyong4725 heyong4725 changed the title from "fix(nightly): well-behaved python-dataflow sender + numpy for ROS2 bridge (closes #1882)" to "fix(nightly): install numpy for ROS2 bridge embedded Python test" May 20, 2026
Collaborator Author

/trunk merge

@trunk-io trunk-io Bot merged commit 06f04bc into main May 20, 2026
41 of 42 checks passed
@trunk-io trunk-io Bot deleted the fix/nightly-sender-stop-and-ros2-numpy branch May 20, 2026 17:14