
feat: Carbide side changes to support Astra #2904

Merged
srinivasadmurthy merged 11 commits into NVIDIA:main from srinivasadmurthy:sdmagentv2
Jun 30, 2026

Conversation

@srinivasadmurthy

This PR contains the Carbide-side changes for Astra support. Changes to the Forge DPU agent, and DPF changes
to ingest Astra NICs, are still needed.

Related issues

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
@srinivasadmurthy srinivasadmurthy requested a review from a team as a code owner June 26, 2026 00:12
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Astra config and status fields are added to Forge RPCs and propagated through DPU and agent status paths. SVPC handling is added for scout commands and MLX reports. DPA interface lookups now accept explicit search filters and updated call sites use the new API.

Changes

Astra, SVPC, and DPA search updates

| Layer | File(s) | Summary |
|---|---|---|
| RPC surface and module exports | crates/api-model/src/dpa_interface/mod.rs, crates/api-model/src/instance/config/spx.rs, crates/rpc/build.rs, crates/rpc/proto/forge.proto, rest-api/flow/internal/nicoapi/nicoproto/nico.proto, crates/api-core/src/handlers/mod.rs | forge.proto and nico.proto add Astra config/status messages and fields, DpaSearchConfig is added, build.rs derives Serialize for the new generated types, and handlers/mod.rs exports the new handler modules. |
| DPA search filter plumbing | crates/api-db/src/dpa_interface.rs, crates/api-core/src/handlers/dpa.rs, crates/api-core/src/handlers/instance.rs, crates/api-core/src/instance/mod.rs, crates/dpa-manager/src/lib.rs, crates/machine-controller/src/io.rs | find_by_machine_id accepts DpaSearchConfig, filters Svpc/Astra interfaces in SQL, removes get_dpa_vni, and the updated call sites pass explicit search settings. |
| Astra config and status handlers | crates/api-core/src/handlers/astra.rs, crates/api-core/src/handlers/dpu.rs, crates/api-model/src/instance/config/spx.rs, crates/agent/src/main_loop.rs, crates/agent/src/ethernet_virtualization.rs, crates/agent/src/tests/full.rs, crates/api-core/src/tests/common/api_fixtures/mod.rs, crates/api-core/src/tests/dpu_agent_upgrade.rs, crates/api-core/src/tests/dpu_info_list.rs, crates/api-core/src/tests/machine_network.rs, crates/api-core/src/tests/network_security_group.rs, crates/machine-a-tron/src/api_client.rs, crates/rpc/src/model/machine/network.rs, crates/test-harness/src/machine_dpu.rs | get_astra_config builds AstraConfig from host interfaces, process_astra_config_status records attachment observations, record_dpu_network_status forwards astra_config_status, the DPU network-status conversion and client payloads initialize the new Astra fields, and the agent/test payloads set the new response fields. |
| SVPC scout and report handling | crates/api-core/src/api.rs, crates/api-core/src/handlers/dpa.rs, crates/api-core/src/handlers/machine_scout.rs, crates/api-core/src/handlers/svpc.rs, crates/api-core/src/tests/dpa_interfaces.rs, crates/dpa-manager/src/card_handler/svpc.rs | The API and machine-scout paths now call into the new svpc handlers, which implement scout-command generation, mlx device/observation report processing, and get_dpa_by_mac; ensure_interface is crate-visible, and the SVPC card-handler comments are updated. |

Sequence Diagram(s)

sequenceDiagram
  participant DpuHandler as api-core dpu handler
  participant AstraHandler as api-core astra handler
  participant Db as database
  participant Agent as agent/main_loop
  DpuHandler->>AstraHandler: get_astra_config(snapshot)
  AstraHandler->>Db: resolve VNI and persist status observations
  DpuHandler-->>DpuHandler: include astra_config / astra_config_status
  Agent->>DpuHandler: record DpuNetworkStatus with astra_config_status=None
sequenceDiagram
  participant Api as api-core api
  participant Svpc as api-core svpc handler
  participant Db as database
  participant Scout as machine_scout
  Api->>Svpc: publish_mlx_device_report / publish_mlx_observation_report
  Svpc->>Db: update device info / card state
  Scout->>Svpc: process_scout_req(machine_id)
  Svpc-->>Scout: fac::Action

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title is concise and accurately summarizes the main Astra support changes on the Carbide side. |
| Description check | ✅ Passed | The description is clearly related to the Astra support changes and matches the scope of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
crates/api-core/src/handlers/astra.rs (2)

226-311: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Recommended: correct the misattributed log lines, the stale comment, and adopt structured fields.

Several diagnostics here are copy-paste artifacts from another handler and will mislead operators:

  • Lines 226, 257, 282, 293, 302, 307: messages are prefixed handle_dpa_message: though the enclosing function is process_astra_config_status.
  • Line 266: the comment "We checked that pf_info is not None above, so unwrap is safe" refers to a pf_info check that does not exist in this function.
  • Line 283 references find_by_vni, but the call is db::spx_partition::find_by(...).

Additionally, per the logfmt guideline, prefer structured key/value fields over interpolation (e.g. tracing::error!(%vni, error = %e, "...")) so logs remain searchable.

As per coding guidelines: "prefer placing common fields as attributes passed to tracing functions instead of using string interpolation."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/astra.rs` around lines 226 - 311, Update the
diagnostics in process_astra_config_status so the log messages use the correct
handler name instead of the copied handle_dpa_message prefix, remove the stale
pf_info comment, and fix the find_by_vni wording to match
db::spx_partition::find_by. While touching the tracing calls around
ConfigVersion::from_str and the spx_partition lookup, switch to structured
tracing fields (for example include vni, obs, and error as attributes) instead
of interpolating values into the message text.

Source: Coding guidelines


82-88: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Recommended: avoid silently swallowing the transaction-begin failure.

When database_connection.begin() fails, the handler logs and returns Ok(None), masking an infrastructure error as a legitimate "no Astra config" result and diverging from the ?-propagation used at Line 64. Propagate the error instead so callers can distinguish "absent" from "failed".

🔧 Proposed fix
-    let mut txn = match api.database_connection.begin().await {
-        Ok(t) => t,
-        Err(e) => {
-            tracing::error!("handle_dpa_message: Unable to start txn: {:#?}", e);
-            return Ok(None);
-        }
-    };
+    let mut txn = api.txn_begin().await?;
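
The difference between "absent" and "failed" described above can be shown with a minimal, self-contained sketch. The types `Db`, `Txn`, and `DbError` and both `get_config_*` functions are hypothetical stand-ins written for illustration, not the Carbide API; only the shape of the error handling is the point.

```rust
/// Hypothetical stand-in for a database error (not a Carbide type).
#[derive(Debug, PartialEq)]
struct DbError(String);

/// Hypothetical connection whose `begin` can fail.
struct Db {
    healthy: bool,
    has_config: bool,
}

/// Hypothetical transaction handle.
struct Txn {
    has_config: bool,
}

impl Db {
    fn begin(&self) -> Result<Txn, DbError> {
        if self.healthy {
            Ok(Txn { has_config: self.has_config })
        } else {
            Err(DbError("unable to start txn".to_string()))
        }
    }
}

/// Swallowing variant: a failed `begin` is reported as "no config".
fn get_config_swallowing(db: &Db) -> Result<Option<String>, DbError> {
    let txn = match db.begin() {
        Ok(t) => t,
        Err(e) => {
            eprintln!("unable to start txn: {:?}", e);
            return Ok(None);
        }
    };
    Ok(if txn.has_config { Some("astra".to_string()) } else { None })
}

/// Propagating variant: `?` hands the failure to the caller.
fn get_config_propagating(db: &Db) -> Result<Option<String>, DbError> {
    let txn = db.begin()?;
    Ok(if txn.has_config { Some("astra".to_string()) } else { None })
}

fn main() {
    let down = Db { healthy: false, has_config: true };
    // The caller cannot tell an outage from a missing config.
    assert_eq!(get_config_swallowing(&down), Ok(None));
    // With propagation, the outage is visible as an error.
    assert!(get_config_propagating(&down).is_err());
}
```

With the swallowing variant a caller would treat an unreachable database as a host that simply has no Astra config, which is the ambiguity the comment objects to.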
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/astra.rs` around lines 82 - 88, The transaction
start failure in handle_dpa_message is being swallowed by logging and returning
Ok(None), which hides an infrastructure error as a normal “no result” case.
Update the database_connection.begin() match so the Err path propagates the
error instead of converting it to None, keeping behavior consistent with the
existing ? usage in the same handler and preserving the distinction between
“absent config” and “failed to begin txn”.
crates/rpc/proto/forge.proto (1)

7529-7553: 🗄️ Data Integrity & Integration | 🔵 Trivial | 💤 Low value

Align vni (and subnet_mask) integer types with the rest of forge.proto.

AstraAttachment.vni and AstraAttachmentStatus.vni are declared as int32, whereas every other VNI field in this file (e.g. vpc_vni, internet_l3_vni, site_global_vpc_vni) is uint32. VNIs are unsigned 24-bit values; using a signed type here is inconsistent and invites avoidable casts on the Rust side. The same observation applies to subnet_mask. As these messages are new, changing them now is wire-cost-free.

-  int32 vni = 2;
+  uint32 vni = 2;
   string subnet_ipv4 = 3;
-  int32 subnet_mask = 4;
+  uint32 subnet_mask = 4;

As per path instructions (review protobuf for clear naming and validation implications).
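
If the database column stays signed while the proto field becomes `uint32`, the conversion has to happen somewhere on the Rust side. The following is a minimal sketch of a checked conversion, assuming the stored value is an `i32`; the names `vni_from_db`, `VniError`, and `MAX_VNI` are hypothetical and do not come from the PR.

```rust
use std::convert::TryFrom;

/// Largest value that fits in a 24-bit VNI.
const MAX_VNI: u32 = 0x00FF_FFFF;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum VniError {
    Negative(i32),
    TooLarge(u32),
}

/// Converts a signed database value into an unsigned VNI,
/// rejecting negative values and values wider than 24 bits.
fn vni_from_db(raw: i32) -> Result<u32, VniError> {
    let vni = u32::try_from(raw).map_err(|_| VniError::Negative(raw))?;
    if vni > MAX_VNI {
        return Err(VniError::TooLarge(vni));
    }
    Ok(vni)
}

fn main() {
    assert_eq!(vni_from_db(5001), Ok(5001));
    assert_eq!(vni_from_db(-1), Err(VniError::Negative(-1)));
    // A plain `as` cast wraps a negative value instead of failing.
    assert_eq!(-1i32 as u32, u32::MAX);
}
```

A plain `as` cast is shorter, but it maps a corrupt negative value to a very large unsigned one without any signal, which is why the sketch uses `try_from`.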

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/rpc/proto/forge.proto` around lines 7529 - 7553, `AstraAttachment` and
`AstraAttachmentStatus` use signed `int32` for `vni` and `subnet_mask`, which is
inconsistent with the rest of `forge.proto` and the unsigned nature of these
values. Update the `AstraAttachment` and `AstraAttachmentStatus` message fields
to use `uint32` for `vni` and `subnet_mask`, keeping the existing field numbers
and names unchanged so the new messages remain wire-safe and match the other VNI
fields in this proto.

Source: Path instructions

crates/api-core/src/instance/mod.rs (1)

705-711: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Hoist the constant config and prefer Default.

The DpaSearchConfig is identical on every iteration and equals Default::default(). Construct it once outside the loop and rely on the derive:

+    let dpa_search_config = DpaSearchConfig::default();
     for mid in &machine_ids {
-        let dpa_search_config = DpaSearchConfig {
-            only_svpc: false,
-            only_astra: false,
-        };
         let dpa_interfaces =
             db::dpa_interface::find_by_machine_id(&mut txn, *mid, dpa_search_config).await?;
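
One detail of the hoisted form: passing the same config by value on every iteration only compiles if the type is `Copy` (or is cloned per call). The sketch below uses a hypothetical local mirror of `DpaSearchConfig` and a stand-in lookup function; whether the real struct derives `Copy` is an assumption, not something the PR text confirms.

```rust
/// Hypothetical mirror of the two-boolean search config.
/// `Copy` is assumed here so the value can be reused in a loop.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct DpaSearchConfig {
    only_svpc: bool,
    only_astra: bool,
}

/// Stand-in for a lookup that takes the config by value.
/// It returns one fake interface name when no filter is set.
fn find_by_machine_id(machine_id: u32, config: DpaSearchConfig) -> Vec<String> {
    if config.only_svpc || config.only_astra {
        Vec::new()
    } else {
        vec![format!("dpa-{}", machine_id)]
    }
}

/// Builds the config once and reuses it for every machine id.
fn find_all(machine_ids: &[u32]) -> Vec<String> {
    let config = DpaSearchConfig::default();
    let mut found = Vec::new();
    for mid in machine_ids {
        // `config` is copied into each call, so it stays usable.
        found.extend(find_by_machine_id(*mid, config));
    }
    found
}

fn main() {
    // The derived default is the "no filter" configuration.
    assert_eq!(
        DpaSearchConfig::default(),
        DpaSearchConfig { only_svpc: false, only_astra: false }
    );
    assert_eq!(find_all(&[1, 2, 3]).len(), 3);
}
```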
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/instance/mod.rs` around lines 705 - 711, The
`DpaSearchConfig` in the machine-id loop is constant and should not be rebuilt
on every iteration. Hoist the config construction out of the `for mid in
&machine_ids` loop in `instance/mod.rs`, and prefer using
`DpaSearchConfig::default()` since the values match the derived default. Keep
the loop body using the shared config when calling
`db::dpa_interface::find_by_machine_id`.
crates/api-model/src/dpa_interface/mod.rs (1)

79-83: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoff

Consider modelling the mutually-exclusive filter as an enum to make the contract compiler-enforced.

The downstream consumer db::dpa_interface::find_by_machine_id rejects the only_svpc == true && only_astra == true combination at runtime, returning DatabaseError::Internal. A two-boolean representation makes an invalid state representable and pushes the invariant to runtime. A small enum captures the intent precisely and removes the need for the defensive runtime check entirely:

♻️ Suggested shape
#[derive(Default, Clone, Copy)]
pub enum DpaInterfaceFilter {
    #[default]
    All,
    OnlySvpc,
    OnlyAstra,
}

Separately, since Default already yields { only_svpc: false, only_astra: false }, the several "no filter" call sites can simply use DpaSearchConfig::default() rather than spelling out both false fields.

As per coding guidelines (STYLE_GUIDE.md: "Prefer designs that are hard to misuse. The more the compiler can catch bugs, the better.").
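
To show how the enum removes the invalid state, here is a small sketch that applies the suggested `DpaInterfaceFilter` to an in-memory list. The `CardKind` enum and the `matches` and `count_matching` helpers are hypothetical; the real code filters in SQL, which this example does not attempt to reproduce.

```rust
/// The filter shape suggested above: exactly one mode at a time.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
enum DpaInterfaceFilter {
    #[default]
    All,
    OnlySvpc,
    OnlyAstra,
}

/// Hypothetical classification of an interface, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CardKind {
    Svpc,
    Astra,
    Other,
}

impl DpaInterfaceFilter {
    /// Every variant is handled, so there is no "both true" case to reject.
    fn matches(self, kind: CardKind) -> bool {
        match self {
            DpaInterfaceFilter::All => true,
            DpaInterfaceFilter::OnlySvpc => kind == CardKind::Svpc,
            DpaInterfaceFilter::OnlyAstra => kind == CardKind::Astra,
        }
    }
}

/// Counts the interfaces that pass the filter.
fn count_matching(filter: DpaInterfaceFilter, kinds: &[CardKind]) -> usize {
    kinds.iter().filter(|k| filter.matches(**k)).count()
}

fn main() {
    let kinds = [CardKind::Svpc, CardKind::Astra, CardKind::Other];
    assert_eq!(count_matching(DpaInterfaceFilter::default(), &kinds), 3);
    assert_eq!(count_matching(DpaInterfaceFilter::OnlySvpc, &kinds), 1);
    assert_eq!(count_matching(DpaInterfaceFilter::OnlyAstra, &kinds), 1);
}
```

Because the `match` is exhaustive, adding a fourth filter mode later would produce a compile error at every place that needs updating.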

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-model/src/dpa_interface/mod.rs` around lines 79 - 83, Replace the
two-boolean DpaSearchConfig struct with a single enum-based filter
so the mutually exclusive state is compiler-enforced instead of checked at
runtime. Update the DpaSearchConfig definition and any callers of
db::dpa_interface::find_by_machine_id to use the new enum (e.g. All, OnlySvpc,
OnlyAstra), and remove the defensive “both true” runtime validation once the
invalid combination can no longer be expressed. For the no-filter cases, switch
call sites to DpaSearchConfig::default() rather than spelling out both false
fields.

Source: Path instructions

crates/api-core/src/handlers/svpc.rs (1)

364-443: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Eliminate the per-iteration clone of the entire snapshot vector.

get_dpa_by_mac consumes a Vec<DpaInterface>, forcing dpa_snapshots.clone() on every observation (Line 373). For a report with m observations against n interfaces, the O(n·m) lookup is then accompanied by m deep clones of the full n-element vector. Borrow the slice and clone only the single matched entry. The subsequent dpa.clone() at Line 432 is likewise redundant, as dpa is not read after the call.

♻️ Proposed change
-        let mut dpa = match get_dpa_by_mac(&devinfo, dpa_snapshots.clone()) {
+        let mut dpa = match get_dpa_by_mac(&devinfo, &dpa_snapshots) {
-        match dpa_interface::update_card_state(&mut txn, dpa.clone()).await {
+        match dpa_interface::update_card_state(&mut txn, dpa).await {

And adjust the helper to borrow:

-fn get_dpa_by_mac(devinfo: &MlxDeviceInfo, dpas: Vec<DpaInterface>) -> CarbideResult<DpaInterface> {
-    dpas.into_iter()
-        .find(|dpa| dpa.mac_address.to_string() == devinfo.base_mac)
+fn get_dpa_by_mac(devinfo: &MlxDeviceInfo, dpas: &[DpaInterface]) -> CarbideResult<DpaInterface> {
+    dpas.iter()
+        .find(|dpa| dpa.mac_address.to_string() == devinfo.base_mac)
+        .cloned()
         .ok_or_else(|| CarbideError::NotFoundError {
             kind: "mac_addr",
             id: devinfo.base_mac.to_string(),
         })
 }

As per coding guidelines: "Avoid needless clones. Seeing .clone() frequently indicates the ownership model may need rethinking."
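
The borrowing version of the helper can be exercised in isolation. The sketch below swaps the real types for simplified stand-ins: `DpaInterface` here holds a plain `String` MAC and a numeric state, and `NotFound` replaces `CarbideError::NotFoundError`. Both are assumptions made so the example is self-contained.

```rust
/// Simplified stand-in for the real interface record.
#[derive(Debug, Clone, PartialEq)]
struct DpaInterface {
    mac_address: String,
    card_state: u32,
}

/// Simplified stand-in for the not-found error.
#[derive(Debug, PartialEq)]
struct NotFound {
    kind: &'static str,
    id: String,
}

/// Borrows the snapshot slice and clones only the matching entry.
fn get_dpa_by_mac(base_mac: &str, dpas: &[DpaInterface]) -> Result<DpaInterface, NotFound> {
    dpas.iter()
        .find(|dpa| dpa.mac_address == base_mac)
        .cloned()
        .ok_or_else(|| NotFound {
            kind: "mac_addr",
            id: base_mac.to_string(),
        })
}

fn main() {
    let snapshots = vec![
        DpaInterface { mac_address: "aa:bb:cc:dd:ee:01".to_string(), card_state: 0 },
        DpaInterface { mac_address: "aa:bb:cc:dd:ee:02".to_string(), card_state: 1 },
    ];
    // The same vector serves every observation without being cloned.
    for mac in ["aa:bb:cc:dd:ee:02", "aa:bb:cc:dd:ee:01"].iter() {
        let dpa = get_dpa_by_mac(mac, &snapshots).unwrap();
        assert_eq!(dpa.mac_address, *mac);
    }
    assert!(get_dpa_by_mac("00:00:00:00:00:00", &snapshots).is_err());
    assert_eq!(snapshots.len(), 2);
}
```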

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/svpc.rs` around lines 364 - 443, The observation
loop in process_mlx_observation is doing unnecessary full-vector and per-item
clones because get_dpa_by_mac takes ownership of dpa_snapshots and
update_card_state is called with dpa.clone(). Update get_dpa_by_mac to borrow
the snapshot list instead of consuming a Vec, then pass the borrowed slice from
the loop so only the matched DpaInterface is cloned if needed. Also remove the
redundant dpa.clone() before calling dpa_interface::update_card_state, since dpa
is not used afterward.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-core/src/handlers/astra.rs`:
- Around line 315-317: The `handle_astra` path in `astra.rs` is panicking on
untrusted input by calling `MacAddress::from_str(&obs.mac_address).unwrap()`
inside the `dpa_interfaces.iter().find(...)` lookup. Parse `obs.mac_address`
once before the search, handle the parse failure gracefully, and skip the
observation/continue processing when the MAC string is invalid instead of
aborting the request thread.
- Around line 223-352: The SPX status observation write is lost because the
transaction started in handle_dpa_message is never committed after
db::machine::update_spx_status_observation succeeds. After the update call,
explicitly commit the txn before returning Ok(()) so the changes persist, and
keep the existing error handling around begin/update paths intact. Use the
handle_dpa_message flow and update_spx_status_observation call as the key spots
to verify the transaction is finalized.
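
For the first inline comment, the parse-once-and-skip pattern can be sketched as follows. The parser is hand-written so the example needs no external crate; `parse_mac` and `match_observations` are hypothetical names, and the real code would keep using its existing `MacAddress` type.

```rust
/// Parses "aa:bb:cc:dd:ee:ff" into six bytes.
/// Returns `None` for any malformed input instead of panicking.
fn parse_mac(s: &str) -> Option<[u8; 6]> {
    let mut out = [0u8; 6];
    let mut parts = s.split(':');
    for byte in out.iter_mut() {
        let part = parts.next()?;
        if part.len() != 2 || !part.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        *byte = u8::from_str_radix(part, 16).ok()?;
    }
    // Reject trailing groups such as a seventh octet.
    if parts.next().is_some() {
        return None;
    }
    Some(out)
}

/// Matches observed MAC strings against known interfaces.
/// Invalid strings are counted and skipped, never unwrapped.
fn match_observations(observed: &[&str], known: &[[u8; 6]]) -> (Vec<[u8; 6]>, usize) {
    let mut matched = Vec::new();
    let mut skipped = 0;
    for raw in observed {
        // Parse once, before the lookup.
        let mac = match parse_mac(raw) {
            Some(m) => m,
            None => {
                skipped += 1;
                continue;
            }
        };
        if known.contains(&mac) {
            matched.push(mac);
        }
    }
    (matched, skipped)
}

fn main() {
    let known = [[0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0x01]];
    let observed = ["aa:bb:cc:dd:ee:01", "not-a-mac", "aa:bb:cc:dd:ee:02"];
    let (matched, skipped) = match_observations(&observed, &known);
    assert_eq!(matched, vec![[0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0x01]]);
    assert_eq!(skipped, 1);
}
```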


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ae9d3091-e3cf-422a-8b3b-de8e1cb3fb88

📥 Commits

Reviewing files that changed from the base of the PR and between 2acb7c3 and 2582735.

📒 Files selected for processing (29)
  • crates/agent/src/ethernet_virtualization.rs
  • crates/agent/src/main_loop.rs
  • crates/agent/src/tests/full.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/handlers/astra.rs
  • crates/api-core/src/handlers/dpa.rs
  • crates/api-core/src/handlers/dpu.rs
  • crates/api-core/src/handlers/instance.rs
  • crates/api-core/src/handlers/machine_scout.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/handlers/svpc.rs
  • crates/api-core/src/instance/mod.rs
  • crates/api-core/src/tests/common/api_fixtures/mod.rs
  • crates/api-core/src/tests/dpa_interfaces.rs
  • crates/api-core/src/tests/dpu_agent_upgrade.rs
  • crates/api-core/src/tests/dpu_info_list.rs
  • crates/api-core/src/tests/machine_network.rs
  • crates/api-core/src/tests/network_security_group.rs
  • crates/api-db/src/dpa_interface.rs
  • crates/api-model/src/dpa_interface/mod.rs
  • crates/api-model/src/instance/config/spx.rs
  • crates/dpa-manager/src/card_handler/svpc.rs
  • crates/dpa-manager/src/lib.rs
  • crates/machine-a-tron/src/api_client.rs
  • crates/machine-controller/src/io.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto
  • crates/rpc/src/model/machine/network.rs
  • crates/test-harness/src/machine_dpu.rs

Comment thread crates/api-core/src/handlers/astra.rs Outdated
Comment thread crates/api-core/src/handlers/astra.rs Outdated
@github-actions

github-actions Bot commented Jun 26, 2026

🔍 Container Scan Summary

| Service | Total | Critical | High | Medium | Low | Other |
|---|---|---|---|---|---|---|
| boot-artifacts-aarch64 | 3 | 0 | 0 | 3 | 0 | 0 |
| boot-artifacts-x86_64 | 3 | 0 | 0 | 3 | 0 | 0 |
| forge-admin-cli-x86_64 | 288 | 6 | 26 | 105 | 7 | 144 |
| machine-validation-runner | 751 | 30 | 190 | 274 | 36 | 221 |
| machine_validation | 751 | 30 | 190 | 274 | 36 | 221 |
| machine_validation-aarch64 | 751 | 30 | 190 | 274 | 36 | 221 |
| nvmetal-carbide | 751 | 30 | 190 | 274 | 36 | 221 |
| TOTAL | 3298 | 126 | 786 | 1207 | 151 | 1028 |

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/api-core/src/handlers/svpc.rs (1)

467-471: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Propagate update_card_state failures instead of committing partial success.

If this DB update fails, the handler logs the error, commits, and returns success, so Scout can drop a report that never advanced the DPA card state.

Proposed fix
-        match dpa_interface::update_card_state(&mut txn, dpa).await {
-            Ok(_id) => (),
-            Err(e) => {
-                tracing::error!("process_mlx_observation update_card_state error: {e}");
-            }
-        }
+        dpa_interface::update_card_state(&mut txn, dpa)
+            .await
+            .map_err(|e| {
+                tracing::error!(error = %e, "process_mlx_observation update_card_state failed");
+                e
+            })?;

As per path instructions, review crates/api*/** changes for “transaction safety” and “SQLx/query correctness.”
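
The effect of propagating the error can be seen with a toy transaction that only applies staged writes on commit, similar in spirit to how an sqlx transaction rolls back when dropped without a commit. Everything in this sketch (`Txn`, `DbError`, the two `process_*` functions, and the rule that states above 10 are invalid) is a hypothetical stand-in, not code from the PR.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct DbError(&'static str);

/// Toy transaction: writes are staged and reach `store` only on commit.
/// Dropping it without committing discards the staged writes.
struct Txn<'a> {
    store: &'a mut Vec<u32>,
    staged: Vec<u32>,
}

impl<'a> Txn<'a> {
    fn begin(store: &'a mut Vec<u32>) -> Txn<'a> {
        Txn { store, staged: Vec::new() }
    }

    /// Fails for states above 10 to simulate a database error.
    fn update_card_state(&mut self, state: u32) -> Result<(), DbError> {
        if state > 10 {
            return Err(DbError("invalid card state"));
        }
        self.staged.push(state);
        Ok(())
    }

    fn commit(self) -> Result<(), DbError> {
        let Txn { store, staged } = self;
        store.extend(staged);
        Ok(())
    }
}

/// Swallowing variant: logs the failure, then commits the rest anyway.
fn process_swallowing(store: &mut Vec<u32>, states: &[u32]) -> Result<(), DbError> {
    let mut txn = Txn::begin(store);
    for s in states {
        if let Err(e) = txn.update_card_state(*s) {
            eprintln!("update_card_state error: {:?}", e);
        }
    }
    txn.commit()
}

/// Propagating variant: the first failure returns early and nothing is committed.
fn process_propagating(store: &mut Vec<u32>, states: &[u32]) -> Result<(), DbError> {
    let mut txn = Txn::begin(store);
    for s in states {
        txn.update_card_state(*s)?;
    }
    txn.commit()
}

fn main() {
    let mut a = Vec::new();
    assert_eq!(process_swallowing(&mut a, &[1, 99, 2]), Ok(()));
    assert_eq!(a, vec![1, 2]); // partial success was committed

    let mut b = Vec::new();
    assert!(process_propagating(&mut b, &[1, 99, 2]).is_err());
    assert!(b.is_empty()); // nothing committed, caller sees the error
}
```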

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/svpc.rs` around lines 467 - 471, The transaction
in process_mlx_observation is swallowing failures from
dpa_interface::update_card_state, which lets the handler commit and return
success after a failed state update. Change the match in process_mlx_observation
to propagate the error instead of only logging it, so the surrounding
transaction aborts and the caller sees the failure. Use the existing
update_card_state call site and the transaction/return path in svpc.rs to ensure
no partial success is committed when this DB update fails.

Source: Path instructions

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 97537660-1bf4-44a7-9111-c59938a46e35

📥 Commits

Reviewing files that changed from the base of the PR and between bf542a8 and 6734b9e.

📒 Files selected for processing (3)
  • crates/api-core/src/handlers/astra.rs
  • crates/api-core/src/handlers/svpc.rs
  • crates/api-core/src/instance/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/api-core/src/instance/mod.rs
  • crates/api-core/src/handlers/astra.rs

@github-actions

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🕐 Last updated: 2026-06-29 17:18:07 UTC | Commit: 6c522ab

Comment thread crates/rpc/proto/forge.proto
Comment thread crates/api-core/src/handlers/astra.rs Outdated
Comment thread crates/api-core/src/handlers/astra.rs
Comment thread crates/api-core/src/handlers/astra.rs
Comment thread crates/rpc/proto/forge.proto
Comment thread crates/api-core/src/handlers/astra.rs
Comment thread crates/api-core/src/instance/mod.rs
Comment thread crates/api-model/src/dpa_interface/mod.rs
Comment thread crates/rpc/proto/forge.proto Outdated
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Comment thread crates/api-core/src/handlers/astra.rs
Comment thread crates/rpc/proto/forge.proto Outdated
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Re-run `make -C rest-api core-proto` and `make -C rest-api/flow gen-nicoapi-pb`
so REST workflow-schema and Flow nicoapi generated code matches the current
Core protos and CI protoc plugin versions.

Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Change AstraAttachment.vni to uint32 in forge.proto, regenerate REST Core
protobuf artifacts, and cast the DB i32 VNI when building AstraAttachment.

Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
@srinivasadmurthy srinivasadmurthy merged commit 18af75a into NVIDIA:main Jun 30, 2026
117 checks passed
@srinivasadmurthy srinivasadmurthy deleted the sdmagentv2 branch June 30, 2026 04:28