Conversation
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ec3cf29d13
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af41ade474
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
This PR updates the dagrs runtime and dagrs-derive macros to the stated stable API surface (dagrs 0.8.1 / dagrs-derive 0.5.0), centered on structured errors (DagrsError/ErrorCode), updated event/termination contracts, and more correct checkpoint resume behavior (including output replay).
Changes:
- Introduces/propagates a unified structured error model across graph build, runtime, channels, hooks, and checkpoints; updates events to end with
ExecutionTerminated. - Enhances checkpoint/resume semantics by making node execution state explicit (
NodeExecStatus) and supporting replay of serialized outputs during resume; expands/updates tests accordingly. - Updates
dagrs-derivemacros and adds UI fixtures/tests; also includes broad Buck build rule refactors across multiple crates.
Reviewed changes
Copilot reviewed 75 out of 80 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| project/slayerfs/BUCK | Renames vendor/manifest targets and adds Windows-specific rustc_flags select. |
| project/rks/BUCK | Renames vendor/manifest targets and updates Windows rustc_flags wiring. |
| project/rkl/BUCK | Renames vendor/manifest targets and updates Windows rustc_flags wiring. |
| project/rkforge/BUCK | Renames vendor/manifest targets and adds Windows-specific rustc_flags select. |
| project/rfuse3/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libvault/BUCK | Renames vendor/manifest/buildscript targets and updates env/rustc_flags references. |
| project/libscheduler/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libruntime/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libnetwork/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libfuse-fs/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libcsi/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/libcni/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/distribution/BUCK | Renames vendor/manifest targets and adds Windows-specific rustc_flags select. |
| project/dagrs/tests/v2_test.rs | Updates tests to handle Result-returning build APIs. |
| project/dagrs/tests/typed_channel_test.rs | Updates tests to unwrap add_node/add_edge results. |
| project/dagrs/tests/router_reuse_test.rs | Updates build/reset calls to return/unwrap Result. |
| project/dagrs/tests/router_merge_test.rs | Updates build calls for Result-returning APIs. |
| project/dagrs/tests/reset_policy_test.rs | Adds coverage for reset environment lifecycle policy. |
| project/dagrs/tests/loop_reset_test.rs | Updates reset call to unwrap Result. |
| project/dagrs/tests/hook_test.rs | Updates hook contract (no on_error), retries, and termination events. |
| project/dagrs/tests/event_test.rs | Updates event model assertions (skip reasons, durations, termination event). |
| project/dagrs/tests/checkpoint_test.rs | Expands checkpoint/resume tests and adapts to new NodeState model. |
| project/dagrs/tests/chain_skip_test.rs | Updates build calls for Result-returning APIs. |
| project/dagrs/tests/branch_pruning_test.rs | Updates build calls for Result-returning APIs. |
| project/dagrs/src/utils/output.rs | Reworks Output to carry DagrsError and adds execution_failed helper. |
| project/dagrs/src/utils/hook.rs | Updates hook docs/contracts and retry error type to DagrsError. |
| project/dagrs/src/utils/execstate.rs | Adjusts mutex handling and exposes is_success. |
| project/dagrs/src/utils/checkpoint.rs | Introduces NodeExecStatus/StoredOutputKind, new checkpoint ordering, and DagrsError-based store APIs. |
| project/dagrs/src/node/typed_action.rs | Adapts typed channels to new InChannels internal structure. |
| project/dagrs/src/node/mod.rs | Adds restore_from_checkpoint hook; hardens EnvVar node-table lookup. |
| project/dagrs/src/node/loop_node.rs | Adds loop condition/node checkpoint restore hooks. |
| project/dagrs/src/lib.rs | Re-exports structured error types and updated checkpoint types. |
| project/dagrs/src/graph/loop_subgraph.rs | Makes LoopSubgraph build APIs fallible and avoids panics on execution. |
| project/dagrs/src/graph/event.rs | Replaces GraphFinished with ExecutionTerminated, adds typed error payloads and skip reasons. |
| project/dagrs/src/graph/error.rs | Defines ErrorCode, DagrsError, and ErrorContext as the unified error surface. |
| project/dagrs/src/graph/abstract_graph.rs | Makes abstract edge insertion fallible and de-panics missing-node paths. |
| project/dagrs/src/connection/out_channel.rs | Converts channel errors into structured DagrsError and updates APIs accordingly. |
| project/dagrs/src/connection/in_channel.rs | Converts channel errors into structured DagrsError and adds disabled-sender support for resume/control-flow. |
| project/dagrs/README.md | Updates docs for new build/runtime/error/event contracts and versions. |
| project/dagrs/examples/typed_action.rs | Updates example to unwrap add_node/add_edge results. |
| project/dagrs/examples/recv_any_example.rs | Updates example to unwrap add_node/add_edge results. |
| project/dagrs/examples/loop_dag.rs | Updates example to unwrap build calls. |
| project/dagrs/examples/hello_dagrs.rs | Updates example to unwrap add_node result. |
| project/dagrs/examples/dagrs-sklearn/tests/dag_job_test.rs | Updates expected failure type to ErrorCode-based structured errors. |
| project/dagrs/examples/dagrs-sklearn/src/parser/yaml_parser.rs | Propagates build errors from add_node/add_edge via Result. |
| project/dagrs/examples/dagrs-sklearn/src/command_action.rs | Maps command execution failures into DagrsError with details. |
| project/dagrs/examples/dagrs-sklearn/examples/sklearn.rs | Adapts output error handling to DagrsError. |
| project/dagrs/examples/dagrs-sklearn/examples/notebook.ipynb | Updates embedded code to new Output::Err(DagrsError) handling. |
| project/dagrs/examples/dagrs-sklearn/Cargo.toml | Declares a workspace boundary for the example crate. |
| project/dagrs/examples/custom_node.rs | Updates example to unwrap add_node result. |
| project/dagrs/examples/conditional_node.rs | Updates example to unwrap build calls. |
| project/dagrs/examples/compute_dag.rs | Updates example to unwrap build calls. |
| project/dagrs/examples/auto_relay.rs | Updates example main signature and propagates dependencies!/run errors. |
| project/dagrs/examples/auto_node.rs | Adjusts generics ordering/bounds for auto-node example. |
| project/dagrs/docs/CHANGELOG.md | Adds v0.8.1 changelog with migration notes. |
| project/dagrs/Cargo.toml | Bumps dagrs version to 0.8.1. |
| project/dagrs/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/dagrs-derive/tests/ui/pass/auto_node_success/src/main.rs | Adds passing UI fixture for auto_node macro behavior. |
| project/dagrs-derive/tests/ui/pass/auto_node_success/Cargo.toml | Adds pass fixture manifest for UI compilation test. |
| project/dagrs-derive/tests/ui/pass/auto_node_success/Cargo.lock | Adds lockfile for pass UI fixture. |
| project/dagrs-derive/tests/ui/fail/tuple_struct/src/main.rs | Adds failing UI fixture for tuple struct diagnostics. |
| project/dagrs-derive/tests/ui/fail/tuple_struct/Cargo.toml | Adds fail fixture manifest for tuple struct case. |
| project/dagrs-derive/tests/ui/fail/tuple_struct/Cargo.lock | Adds lockfile for tuple struct fail fixture. |
| project/dagrs-derive/tests/ui/fail/reserved_field/src/main.rs | Adds failing UI fixture for reserved field diagnostics. |
| project/dagrs-derive/tests/ui/fail/reserved_field/Cargo.toml | Adds fail fixture manifest for reserved field case. |
| project/dagrs-derive/tests/ui/fail/reserved_field/Cargo.lock | Adds lockfile for reserved field fail fixture. |
| project/dagrs-derive/tests/dependencies_macro.rs | Adds tests ensuring dependencies! propagates build errors and supports ?. |
| project/dagrs-derive/tests/auto_node_ui.rs | Adds cargo-driven UI test runner for fixtures. |
| project/dagrs-derive/src/relay.rs | Updates macro expansion to propagate build errors and avoid identifier collisions. |
| project/dagrs-derive/src/lib.rs | Updates macro docs/contracts; documents dependencies! returns Result. |
| project/dagrs-derive/src/auto_node.rs | Improves generics handling, reserved-field diagnostics, and removes unsafe Send/Sync injection. |
| project/dagrs-derive/README.md | Documents macro contract targeting dagrs 0.8.1 and updated behavior. |
| project/dagrs-derive/Cargo.toml | Bumps dagrs-derive version to 0.5.0. |
| project/dagrs-derive/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/common/BUCK | Renames vendor/manifest targets for Buck rule consistency. |
| project/Cargo.toml | Updates workspace dependency version for dagrs-derive. |
| project/Cargo.lock | Updates locked versions for dagrs and dagrs-derive. |
| project/aardvark-dns/BUCK | Renames vendor/manifest/buildscript targets and adds Windows-specific rustc_flags select. |
| buckal.snap | Updates buckal fingerprints after crate/version/buildrule changes. |
| let disabled = self.1.clone(); | ||
| let futures = self | ||
| .0 | ||
| .iter_mut() | ||
| .filter(|(id, _)| !disabled.contains(id)) | ||
| .map(|(_, c)| async { | ||
| let content: Content = c.lock().await.recv(NodeId(0)).await?; | ||
| Ok(content.into_inner()) | ||
| }); | ||
| join_all(futures).await.into_iter().map(f).collect() |
There was a problem hiding this comment.
Same issue as InChannels::map: TypedInChannels::map uses recv(NodeId(0)), so any channel errors will carry the wrong channel_id in DagrsError.context. Pass the actual sender NodeId into recv to keep error context accurate.
project/dagrs/src/utils/execstate.rs
Outdated
| /// After the task is successfully executed, set the execution result. | ||
| pub(crate) fn set_output(&self, output: Output) { | ||
| self.success.store(true, Ordering::Relaxed); | ||
| *self.output.lock().unwrap() = output; | ||
| if let Ok(mut guard) = self.output.lock() { | ||
| *guard = output; | ||
| } | ||
| } | ||
|
|
||
| /// [`Output`] for fetching internal storage. | ||
| /// This function is generally not called directly, but first uses the semaphore for synchronization control. | ||
| pub(crate) fn get_output(&self) -> Option<Content> { | ||
| self.output.lock().unwrap().get_out() | ||
| self.output.lock().ok().and_then(|guard| guard.get_out()) | ||
| } | ||
| pub(crate) fn get_full_output(&self) -> Output { | ||
| self.output.lock().unwrap().clone() | ||
| self.output | ||
| .lock() | ||
| .map(|guard| guard.clone()) | ||
| .unwrap_or_else(|_| Output::empty()) |
There was a problem hiding this comment.
ExecState::set_output silently ignores poisoned mutexes (and get_full_output falls back to Output::empty() on poison). That can mask real execution results and make downstream logic think a node produced no output. Consider recovering the poisoned lock via poisoned.into_inner() and still reading/writing the stored Output, or propagate/panic so corruption isn’t silently hidden.
| pub fn add_node(&mut self, node: impl Node + 'static) -> DagrsResult<NodeId> { | ||
| let node_id = node.id(); | ||
| for existing in &self.inner_nodes { | ||
| let existing = existing.try_lock().map_err(|_| { | ||
| DagrsError::new( | ||
| ErrorCode::DgBld0005ConcurrentBuildMutation, | ||
| "failed to acquire loop subgraph node lock while building graph", | ||
| ) | ||
| })?; | ||
| if existing.id() == node_id { | ||
| return Err(DagrsError::new( | ||
| ErrorCode::DgBld0003DuplicateNodeId, | ||
| "duplicate node id detected while building loop subgraph", | ||
| ) | ||
| .with_node_id(node_id.as_usize())); | ||
| } | ||
| } | ||
| self.inner_nodes.push(Arc::new(Mutex::new(node))); | ||
| Ok(node_id) |
There was a problem hiding this comment.
LoopSubgraph::add_node checks for duplicate IDs by try_lock()ing every existing inner node. During graph build this introduces a spurious failure mode (DgBld0005ConcurrentBuildMutation) if any node mutex is locked (even if no real concurrent build mutation is happening). A more robust approach is to track inner node IDs separately (e.g., a HashSet<NodeId>) so duplicate detection is lock-free and deterministic.
| .arg("--quiet") | ||
| .arg("--manifest-path") | ||
| .arg(case_dir.join("Cargo.toml")) | ||
| .env("CARGO_NET_OFFLINE", "true") |
There was a problem hiding this comment.
These UI tests force CARGO_NET_OFFLINE=true, but the fixtures depend on registry crates (e.g. async-trait = "0.1" in tests/ui/pass/auto_node_success). In a clean CI environment without a pre-populated Cargo registry cache, cargo check will fail because it can’t download those dependencies. Either remove the offline setting, vendor the fixture dependencies, or switch to a UI-test approach that doesn’t require registry access.
| .env("CARGO_NET_OFFLINE", "true") |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: edf6ec72e3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Review content is in PR_557_REVIEW.md. The GH_TOKE PAT in the environment is invalid (returns 401 Bad Credentials), preventing direct posting to the GitHub PR thread via API. https://claude.ai/code/session_012q6mfySUQapXeVUmL5mu7v
Signed-off-by: jackie <jackieismpc@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f8a1822a58
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| self.abstract_graph.add_edge(from_id, *to_id)?; | ||
|
|
||
| let (tx, rx) = mpsc::channel::<Content>(32); | ||
| from_channel.insert(*to_id, Arc::new(Mutex::new(OutChannel::Mpsc(tx.clone())))); | ||
| rx_map.insert(*to_id, rx); |
There was a problem hiding this comment.
Roll back sender-side edge mutations on receiver-lock errors
This mutates the abstract graph and sender-side channel map before the later try_lock_node_for_build(...) on each target's input channel. If a receiver is concurrently locked, add_edge() now returns Err(DgBld0005...) but leaves the edge half-installed: abstract_graph, from_channel, and in_degree are updated, while the new rx is dropped. Retrying the same add_edge() call then no-ops because from_channel.0.contains_key(to_id) is already true, so that edge remains permanently broken and sends fail against a closed channel.
Useful? React with 👍 / 👎.
Signed-off-by: jackie <jackieismpc@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1464917ea
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6369540a3d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if !matches!(node_state.status, NodeExecStatus::Succeeded) | ||
| || node_state.output_data.is_some() | ||
| || node_state.output_summary.is_none() | ||
| { |
There was a problem hiding this comment.
Rerun channel-only successes during checkpoint resume
Update the rerun predicate to handle succeeded nodes that sent data through OutChannels but returned Output::empty(). In this code path, save_checkpoint_with_loop_node_iterations() records status=Succeeded with both output_data and output_summary unset, and succeeded_node_requires_rerun() therefore returns false; then prune_succeeded_nodes_from_resume_span() can drop that sender from active_nodes. If a pending peer in the same resumed block still waits on that sender (a common pattern in this repo, e.g. actions that broadcast(...) then return empty), resume can deadlock because no replay packet exists and the sender is never rerun.
Useful? React with 👍 / 👎.
#556
summary
dagrs 0.8.1anddagrs-derive 0.5.0, and includes an additional review-and-fix pass on top of the local commit stack.DagrsError/ErrorCode, a single terminal event based onExecutionTerminated, and a consistent contract between graph APIs and derive macros.Graph::add_edgeiterator contract, and keeping tests/docs in sync.change
dagrsresume_from_checkpoint()can continue execution instead of only restoring status flags.NodeStatenow carries aStoredOutputKindtag so checkpoint output data can be restored safely.Graph::add_edgenow acceptsIntoIterator<Item = NodeId>, matching the upgrade plan while remaining backward-compatible with existingVec<NodeId>call sites.set_checkpoint_interval(...)API withset_checkpoint_config(...).dagrsteststest_resume_execution_basicinto a real resume test that verifies completed nodes are not rerun and pending downstream nodes receive checkpointed input.reset_policy_testto cover both defaultreset()environment preservation and explicitreset_with(ResetPolicy::ResetEnv)behavior.StoredOutputKind.dagrs-derivedependencies!type inference afterGraph::add_edgewas widened to an iterator-based signature by collecting explicitly intoVec<NodeId>.validation
cargo fmt --all -- --checkcargo clippy -p dagrs --all-targets --all-features -- -D warningscargo clippy -p dagrs-derive --all-targets --all-features -- -D warningscargo test -p dagrs --all-featurescargo test -p dagrs-derive --all-featurescargo build -p dagrs --examples --all-features