Skip to content

Feat/night shade#400

Open
distributedstatemachine wants to merge 2 commits into
mainfrom
feat/night_shade
Open

Feat/night shade#400
distributedstatemachine wants to merge 2 commits into
mainfrom
feat/night_shade

Conversation

@distributedstatemachine
Copy link
Copy Markdown
Contributor

@distributedstatemachine distributedstatemachine commented Apr 1, 2026

Summary

Brief description of what this PR does.

Related Issues

Closes #(issue number)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring

Changes Made

List the main changes in this PR:

Testing

How Has This Been Tested?

Describe the tests you ran to verify your changes.

  • Unit tests pass (cargo test)
  • Integration tests pass
  • Manual testing completed

Test Configuration

  • OS:
  • Rust version:
  • Bittensor version (if applicable):

Checklist

  • My code follows the project's style guidelines
  • I have run cargo fmt to format my code
  • I have run cargo clippy and addressed all warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Context

Add any other context about the PR here.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added end-to-end encryption support for confidential compute deployments.
    • CLI now supports --confidential and --trust-tier flags for encrypted deployments (Enterprise or Community tiers).
    • SDK includes new methods to invoke confidential workloads with automatic encryption/decryption handling.
    • Streaming support for encrypted responses from confidential compute instances.

Introduce the Night Shade ML-KEM primitives and protocol definitions so clients, APIs, and enclaves share a single encrypted wire format.

Made-with: Cursor
Add Night Shade support to the SDK, Python bindings, and CLI so users can invoke encrypted workloads and request confidential deployments from the public surface area.

Made-with: Cursor
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 1, 2026

Walkthrough

This pull request introduces end-to-end encryption (E2EE) support for confidential compute workloads. It adds a new basilica-e2ee crate implementing ML-KEM-based key encapsulation and AES-256-GCM encryption, corresponding Protocol Buffer definitions, new CLI flags for confidential deployment configuration, and SDK methods for encrypted request/response handling and streaming.

Changes

Cohort / File(s) Summary
Workspace & Dependencies
Cargo.toml, crates/basilica-sdk/Cargo.toml
Added basilica-e2ee as workspace member and dependency; expanded workspace dependencies with crypto/security packages (hkdf, kem, ml-kem, sharks).
E2EE Core Implementation
crates/basilica-e2ee/Cargo.toml, crates/basilica-e2ee/src/lib.rs
New E2EE crate providing ML-KEM key encapsulation, AES-GCM encryption/decryption, Shamir secret sharing, attestation verification, and streaming envelope handling with 643 lines of crypto logic and public APIs.
Protocol Definitions
crates/basilica-protocol/proto/e2e.proto, crates/basilica-protocol/build.rs, crates/basilica-protocol/src/lib.rs
Added e2e.proto with ConfidentialComputeService RPC definitions, message types for attestation/E2EE descriptors, and protobuf code generation with serde derives.
CLI Confidential Compute Support
crates/basilica-cli/src/cli/commands.rs, crates/basilica-cli/src/cli/handlers/deploy/create.rs
Added TrustTierArg enum and ConfidentialOptions struct to CLI; flattened confidential options into DeployCommand; updated create handler to map CLI flags to request fields.
Deployment Template Handlers
crates/basilica-cli/src/cli/handlers/deploy/templates/openclaw.rs, crates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rs, crates/basilica-cli/src/cli/handlers/deploy/templates/tau.rs, crates/basilica-cli/src/cli/handlers/deploy/templates/vllm.rs
Updated all four deployment templates to include explicit confidential: false and trust_tier: None fields in CreateDeploymentRequest.
SDK Client Integration
crates/basilica-sdk/src/client.rs, crates/basilica-sdk/src/types.rs
Added four new public methods (get_confidential_instance, invoke_confidential_json, invoke_confidential_bytes, invoke_confidential_stream) and private helper post_bytes; extended CreateDeploymentRequest with confidential and trust_tier fields; re-exported E2EE types from basilica-e2ee.
Python SDK Bindings
crates/basilica-sdk-python/Cargo.toml, crates/basilica-sdk-python/src/lib.rs, crates/basilica-sdk-python/src/types.rs
Added serde_json dependency; implemented two new Python methods (get_confidential_instance, invoke_confidential_json) with error mapping; updated CreateDeploymentRequest bindings to accept confidential and trust_tier parameters.

Sequence Diagrams

sequenceDiagram
    participant Client
    participant SDK
    participant E2EE
    participant Server
    participant Response as Response Decrypt

    Client->>SDK: invoke_confidential_bytes(workload_id, path, payload)
    SDK->>SDK: get_confidential_instance(workload_id)
    Server-->>SDK: E2eeInstanceDescriptor
    SDK->>SDK: verify attestation
    SDK->>E2EE: seal_request(server_pubkey, envelope)
    E2EE->>E2EE: ML-KEM encapsulate<br/>AES-256-GCM encrypt
    E2EE-->>SDK: encrypted blob + client_secret
    SDK->>Server: POST /v1/e2e/invoke (encrypted blob)
    Server-->>SDK: encrypted response blob
    SDK->>Response: open_response(client_secret, blob)
    Response->>Response: ML-KEM decapsulate<br/>AES-256-GCM decrypt
    Response-->>Client: plaintext payload
Loading
sequenceDiagram
    participant Client
    participant SDK
    participant E2EE
    participant Server
    participant SSE as SSE Handler

    Client->>SDK: invoke_confidential_stream(workload_id, path, payload)
    SDK->>E2EE: seal_request (stream: true)
    E2EE-->>SDK: encrypted blob + client_secret
    SDK->>Server: POST /v1/e2e/invoke (encrypted blob)
    Server-->>SDK: SSE response stream
    SDK->>SSE: read SSE lines
    loop for each chunk
        SSE->>E2EE: open_stream_chunk(stream_key, chunk)
        E2EE-->>SDK: decrypted chunk
        SDK->>Client: buffer line
    end
    SDK-->>Client: Vec<String> (all decrypted lines)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Suggested reviewers

  • epappas

🐰 A rabbit hops through gates of secret keys,

ML-KEM whispers, AES-GCM ensures it's sealed,

Attestations verified, trust tiers high,

Confidential bytes now dance—encrypted, revealed!

Night Shade whispers bloom where none could spy. 🌙

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 52.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The PR title 'Feat/night shade' is vague and generic—it uses a feature branch naming convention rather than a descriptive summary. While it references 'night shade,' it fails to convey what was actually implemented (E2EE primitives, protocol definitions, SDK/CLI support for confidential compute) and provides no meaningful information beyond the branch name. Use a more descriptive title like 'Add Night Shade E2EE protocol with ML-KEM primitives and confidential compute support' to clearly summarize the main changes and intent of this substantial feature addition.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/night_shade

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (3)
crates/basilica-e2ee/src/lib.rs (2)

499-510: HKDF salt derivation from public ciphertext is acceptable but worth documenting.

The salt is derived from the first 16 bytes of the ML-KEM ciphertext (line 504). Since the ciphertext is freshly generated per encapsulation and the shared secret provides the entropy, this pattern is cryptographically sound for domain separation. However, a brief comment explaining this design choice would aid future maintainers.

📝 Suggested documentation
 fn derive_aead_key(
     ciphertext: &[u8],
     shared_secret: &[u8],
     info: &[u8],
 ) -> Result<[u8; AEAD_KEY_LEN], E2eeError> {
+    // Use ciphertext prefix as HKDF salt for domain separation.
+    // The shared secret provides cryptographic entropy; the salt ensures
+    // different keys are derived even if the same shared secret were reused.
     let salt = &ciphertext[..ciphertext.len().min(HKDF_SALT_LEN)];
     let hkdf = Hkdf::<Sha256>::new(Some(salt), shared_secret);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-e2ee/src/lib.rs` around lines 499 - 510, Add a short
explanatory comment inside derive_aead_key explaining why the HKDF salt is taken
from the first HKDF_SALT_LEN bytes of the ciphertext (ML-KEM ciphertext): note
that the ciphertext is fresh per encapsulation, the shared_secret provides
entropy, and using the ciphertext prefix as salt is intended for domain
separation (not secrecy) and is cryptographically acceptable; reference the salt
variable, HKDF_SALT_LEN, and the use of Hkdf::<Sha256>::new so
reviewers/maintainers understand the design choice.

464-476: Stream chunk nonce reuse risk with same key.

seal_stream_chunk generates a random 12-byte nonce per chunk (via seal_payload). With AES-256-GCM's 96-bit nonce and birthday bound, nonce collision becomes probable after ~2^32 chunks with the same stream key. For high-throughput streams, consider a counter-based nonce or document the expected chunk limits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-e2ee/src/lib.rs` around lines 464 - 476, seal_stream_chunk
currently relies on seal_payload to generate a random 12-byte nonce per chunk
which risks AES-GCM nonce collisions at high chunk counts; change the API to use
a deterministic per-chunk nonce derived from a per-stream random salt plus a
monotonic counter (e.g., 64-bit sequence) instead of pure randomness: add a
nonce parameter to seal_payload/open_payload (or a new
seal_payload_with_nonce/open_payload_with_nonce), update seal_stream_chunk and
open_stream_chunk to accept/consume a sequence number (or produce a wrapper that
takes stream_salt and seq_num and derives a 12-byte nonce by combining
salt||seq_num and truncating or HKDF/AEAD-appropriate derivation), and ensure
AEAD_KEY_LEN/BASE64 usage is unchanged while keeping open_stream_chunk
derivation symmetric so sealing and opening use the exact same nonce derivation.
crates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rs (1)

91-104: Deployment will persist indefinitely unless --ttl is specified.

The SGLang template passes through common.ttl which defaults to None. Users should be aware the deployment persists after creation. Consider documenting this in the help text or defaulting to a reasonable TTL for template deployments.

The explicit confidential: false and trust_tier: None additions are correct for non-confidential template deployments.

As per coding guidelines: "Set --ttl / ttl_seconds for deployments unless the user explicitly wants persistence; call out whether the resource will persist after the current task."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rs` around lines
91 - 104, The template currently passes through ttl_seconds: common.ttl which
defaults to None, causing deployments to persist indefinitely; update the
handler that constructs this template so ttl_seconds is set to a sensible
default (e.g., DEFAULT_TTL_SECONDS) when common.ttl is None or explicitly
document in the CLI help for the deploy command that omitting --ttl creates a
persistent deployment; modify the code path that sets ttl_seconds (the place
using ttl_seconds and common.ttl) to use Some(DEFAULT_TTL_SECONDS) or ensure the
CLI flag parsing for --ttl forces an explicit value, and keep the existing
explicit fields like confidential: false and trust_tier: None unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/basilica-protocol/proto/e2e.proto`:
- Around line 24-32: The proto field E2eeInstanceDescriptor.trust_tier is a
string but the Rust TrustTier enum (in crates/basilica-e2ee/src/lib.rs) only
supports "enterprise" and "community", so add a fallback variant to TrustTier
(e.g., Unknown) and annotate it with serde attributes (rename_all = "snake_case"
and serde(other) plus a default) so deserialization of unknown trust_tier values
won’t fail; update TrustTier accordingly and ensure any uses of TrustTier handle
the Unknown case.

In `@crates/basilica-sdk-python/src/types.rs`:
- Around line 1417-1420: The current conversion silently maps any
non-"community" trust_tier to SdkTrustTier::Enterprise which hides typos/invalid
values; update the conversion for trust_tier (where req.trust_tier is mapped to
SdkTrustTier) to explicitly handle known strings (e.g., "community" ->
SdkTrustTier::Community, "enterprise" -> SdkTrustTier::Enterprise) and
return/propagate an error for any other value instead of defaulting, or
implement TryFrom<String> for SdkTrustTier and use that to validate and produce
a proper error type; ensure callers handle the Result/Option accordingly so
invalid inputs surface instead of silently becoming Enterprise.

In `@crates/basilica-sdk/src/client.rs`:
- Around line 323-326: The current invoke_confidential_stream function buffers
the entire SSE via response.text().await and returns Vec<String>, which prevents
incremental delivery; change it to an async streaming API (e.g., return an impl
Stream<Item = Result<String, Error>> or a tokio mpsc::Receiver) and parse the
HTTP body incrementally using response.bytes_stream() / chunked reads instead of
response.text().await, decode/decrypt each SSE payload as it arrives and yield
each decrypted line immediately; update invoke_confidential_stream signature and
its callers to consume the stream and keep the TODO comment's intent.
- Around line 275-278: Thread a required minimum TrustTier parameter into
invoke_confidential_bytes and invoke_confidential_stream signatures, then after
calling self.get_confidential_instance(workload_id).await? and before calling
descriptor.verify(...) or constructing/sealing the confidential envelope,
compare the retrieved descriptor's trust_tier (e.g. descriptor.trust_tier or
accessor) against the required TrustTier and return an error (mapped via
Self::map_e2ee_error or a new appropriate error variant) if it is lower than
required; ensure this check runs before any sealing/usage of the descriptor so
E2eeInstanceDescriptor::verify(...) remains for attestation/age while the
explicit trust-tier validation enforces caller-provided minimums.

In `@crates/basilica-sdk/src/types.rs`:
- Around line 795-797: The field `confidential` in the `CreateDeploymentRequest`
struct is currently annotated only with `#[serde(default)]`, causing
`"confidential": false` to be emitted on serialization and breaking
backward-compatible wire format; add a `skip_serializing_if` attribute and
helper `is_false` predicate so false values are omitted on serialize (keep
`#[serde(default)]` for deserialization), i.e., define a function
`is_false(value: &bool) -> bool` and annotate the `confidential` field with
`#[serde(default, skip_serializing_if = "is_false")]` to restore the previous
implicit-false behavior used in tests like
`test_create_deployment_with_resources`.

---

Nitpick comments:
In `@crates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rs`:
- Around line 91-104: The template currently passes through ttl_seconds:
common.ttl which defaults to None, causing deployments to persist indefinitely;
update the handler that constructs this template so ttl_seconds is set to a
sensible default (e.g., DEFAULT_TTL_SECONDS) when common.ttl is None or
explicitly document in the CLI help for the deploy command that omitting --ttl
creates a persistent deployment; modify the code path that sets ttl_seconds (the
place using ttl_seconds and common.ttl) to use Some(DEFAULT_TTL_SECONDS) or
ensure the CLI flag parsing for --ttl forces an explicit value, and keep the
existing explicit fields like confidential: false and trust_tier: None
unchanged.

In `@crates/basilica-e2ee/src/lib.rs`:
- Around line 499-510: Add a short explanatory comment inside derive_aead_key
explaining why the HKDF salt is taken from the first HKDF_SALT_LEN bytes of the
ciphertext (ML-KEM ciphertext): note that the ciphertext is fresh per
encapsulation, the shared_secret provides entropy, and using the ciphertext
prefix as salt is intended for domain separation (not secrecy) and is
cryptographically acceptable; reference the salt variable, HKDF_SALT_LEN, and
the use of Hkdf::<Sha256>::new so reviewers/maintainers understand the design
choice.
- Around line 464-476: seal_stream_chunk currently relies on seal_payload to
generate a random 12-byte nonce per chunk which risks AES-GCM nonce collisions
at high chunk counts; change the API to use a deterministic per-chunk nonce
derived from a per-stream random salt plus a monotonic counter (e.g., 64-bit
sequence) instead of pure randomness: add a nonce parameter to
seal_payload/open_payload (or a new
seal_payload_with_nonce/open_payload_with_nonce), update seal_stream_chunk and
open_stream_chunk to accept/consume a sequence number (or produce a wrapper that
takes stream_salt and seq_num and derives a 12-byte nonce by combining
salt||seq_num and truncating or HKDF/AEAD-appropriate derivation), and ensure
AEAD_KEY_LEN/BASE64 usage is unchanged while keeping open_stream_chunk
derivation symmetric so sealing and opening use the exact same nonce derivation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bc63ff24-850b-41dc-8168-ad884227e655

📥 Commits

Reviewing files that changed from the base of the PR and between f08ae91 and 7a48a8a.

⛔ Files ignored due to path filters (3)
  • Cargo.lock is excluded by !**/*.lock
  • crates/basilica-protocol/src/gen/basilica.e2e.v1.rs is excluded by !**/gen/**
  • crates/basilica-protocol/src/gen/mod.rs is excluded by !**/gen/**
📒 Files selected for processing (18)
  • Cargo.toml
  • crates/basilica-cli/src/cli/commands.rs
  • crates/basilica-cli/src/cli/handlers/deploy/create.rs
  • crates/basilica-cli/src/cli/handlers/deploy/templates/openclaw.rs
  • crates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rs
  • crates/basilica-cli/src/cli/handlers/deploy/templates/tau.rs
  • crates/basilica-cli/src/cli/handlers/deploy/templates/vllm.rs
  • crates/basilica-e2ee/Cargo.toml
  • crates/basilica-e2ee/src/lib.rs
  • crates/basilica-protocol/build.rs
  • crates/basilica-protocol/proto/e2e.proto
  • crates/basilica-protocol/src/lib.rs
  • crates/basilica-sdk-python/Cargo.toml
  • crates/basilica-sdk-python/src/lib.rs
  • crates/basilica-sdk-python/src/types.rs
  • crates/basilica-sdk/Cargo.toml
  • crates/basilica-sdk/src/client.rs
  • crates/basilica-sdk/src/types.rs

Comment on lines +592 to +601
#[derive(clap::Args, Debug, Clone, Default)]
pub struct ConfidentialOptions {
/// Request Night Shade confidential compute for this deployment.
#[arg(long)]
pub confidential: bool,

/// Night Shade trust tier (defaults to enterprise when --confidential is set).
#[arg(long, value_enum)]
pub trust_tier: Option<TrustTierArg>,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Help text claims default but code doesn't enforce it.

Line 598 states "defaults to enterprise when --confidential is set", but looking at create.rs:104-105, when --confidential is set without --trust-tier, the trust_tier field remains None. If the intent is to default to Enterprise, the logic should be updated:

🔧 Option A: Update create.rs to apply default

In crates/basilica-cli/src/cli/handlers/deploy/create.rs:

-        confidential: cmd.confidential.confidential || cmd.confidential.trust_tier.is_some(),
-        trust_tier: cmd.confidential.trust_tier.map(Into::into),
+        confidential: cmd.confidential.confidential || cmd.confidential.trust_tier.is_some(),
+        trust_tier: if cmd.confidential.confidential || cmd.confidential.trust_tier.is_some() {
+            Some(cmd.confidential.trust_tier.map(Into::into).unwrap_or(TrustTier::Enterprise))
+        } else {
+            None
+        },
🔧 Option B: Update help text to match current behavior
-    /// Night Shade trust tier (defaults to enterprise when --confidential is set).
+    /// Night Shade trust tier (optional, enables --confidential implicitly when set).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[derive(clap::Args, Debug, Clone, Default)]
pub struct ConfidentialOptions {
/// Request Night Shade confidential compute for this deployment.
#[arg(long)]
pub confidential: bool,
/// Night Shade trust tier (defaults to enterprise when --confidential is set).
#[arg(long, value_enum)]
pub trust_tier: Option<TrustTierArg>,
}
#[derive(clap::Args, Debug, Clone, Default)]
pub struct ConfidentialOptions {
/// Request Night Shade confidential compute for this deployment.
#[arg(long)]
pub confidential: bool,
/// Night Shade trust tier (optional, enables --confidential implicitly when set).
#[arg(long, value_enum)]
pub trust_tier: Option<TrustTierArg>,
}

Comment on lines +24 to +32
message E2eeInstanceDescriptor {
string workload_id = 1;
string instance_id = 2;
string trust_tier = 3;
string e2ee_public_key = 4;
AttestationEvidence attestation = 5;
repeated string nonces = 6;
repeated string capabilities = 7;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential deserialization failure for unknown trust_tier values.

The trust_tier field is a string in proto (line 27), but the Rust TrustTier enum (in crates/basilica-e2ee/src/lib.rs:52-57) only accepts "enterprise" or "community" with #[serde(rename_all = "snake_case")]. If the API returns an unknown value, deserialization will fail.

Consider adding a fallback variant to TrustTier:

#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Default)]
#[serde(rename_all = "snake_case")]
pub enum TrustTier {
    Enterprise,
    Community,
    #[default]
    #[serde(other)]
    Unknown,
}

Alternatively, document and validate that only known values are ever sent from the server.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-protocol/proto/e2e.proto` around lines 24 - 32, The proto
field E2eeInstanceDescriptor.trust_tier is a string but the Rust TrustTier enum
(in crates/basilica-e2ee/src/lib.rs) only supports "enterprise" and "community",
so add a fallback variant to TrustTier (e.g., Unknown) and annotate it with
serde attributes (rename_all = "snake_case" and serde(other) plus a default) so
deserialization of unknown trust_tier values won’t fail; update TrustTier
accordingly and ensure any uses of TrustTier handle the Unknown case.

Comment on lines +1417 to +1420
trust_tier: req.trust_tier.as_deref().map(|value| match value {
"community" => SdkTrustTier::Community,
_ => SdkTrustTier::Enterprise,
}),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent fallback to Enterprise for unrecognized trust tier values.

The conversion maps any non-"community" string to SdkTrustTier::Enterprise, including typos like "enterprize" or invalid values like "premium". Consider validating the input and raising an error for unrecognized values to prevent user confusion.

🛡️ Proposed fix with validation
             trust_tier: req.trust_tier.as_deref().map(|value| match value {
                 "community" => SdkTrustTier::Community,
-                _ => SdkTrustTier::Enterprise,
+                "enterprise" => SdkTrustTier::Enterprise,
+                other => panic!("Invalid trust_tier '{}', expected 'community' or 'enterprise'", other),
             }),

Alternatively, consider using TryFrom with a proper error type instead of panicking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-sdk-python/src/types.rs` around lines 1417 - 1420, The
current conversion silently maps any non-"community" trust_tier to
SdkTrustTier::Enterprise which hides typos/invalid values; update the conversion
for trust_tier (where req.trust_tier is mapped to SdkTrustTier) to explicitly
handle known strings (e.g., "community" -> SdkTrustTier::Community, "enterprise"
-> SdkTrustTier::Enterprise) and return/propagate an error for any other value
instead of defaulting, or implement TryFrom<String> for SdkTrustTier and use
that to validate and produce a proper error type; ensure callers handle the
Result/Option accordingly so invalid inputs surface instead of silently becoming
Enterprise.

Comment on lines +275 to +278
let descriptor = self.get_confidential_instance(workload_id).await?;
descriptor
.verify(DEFAULT_ATTESTATION_MAX_AGE_SECS)
.map_err(Self::map_e2ee_error)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd crates/basilica-sdk && wc -l src/client.rs

Repository: one-covenant/basilica

Length of output: 83


🏁 Script executed:

# Check the specific code sections mentioned in the review
sed -n '275,278p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 259


🏁 Script executed:

# Also check the second location (333-335)
sed -n '333,335p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 182


🏁 Script executed:

# Check the E2eeInstanceDescriptor::verify() in basilica-e2ee
sed -n '119,135p' crates/basilica-e2ee/src/lib.rs

Repository: one-covenant/basilica

Length of output: 611


🏁 Script executed:

# Search for TrustTier usage in the SDK to understand the API
rg "TrustTier" crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 47


🏁 Script executed:

# Search for trust_tier in the SDK files
rg "trust_tier" crates/basilica-sdk/src/

Repository: one-covenant/basilica

Length of output: 708


🏁 Script executed:

# Check the function signatures around those lines to understand context
sed -n '260,280p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 807


🏁 Script executed:

# Check context around line 333-335
sed -n '320,345p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 1238


🏁 Script executed:

# Check the method signatures for invoke_confidential_bytes and invoke_confidential_stream
sed -n '265,278p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 570


🏁 Script executed:

# Check invoke_confidential_stream signature
sed -n '324,340p' crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 756


🏁 Script executed:

# Search for any trust_tier validation or checks in client.rs
rg "trust_tier" crates/basilica-sdk/src/client.rs -A 3 -B 3

Repository: one-covenant/basilica

Length of output: 1122


🏁 Script executed:

# Check if there's a higher-level invoke method that might validate trust_tier
rg "invoke_confidential" crates/basilica-sdk/src/client.rs -B 2 -A 10 | head -60

Repository: one-covenant/basilica

Length of output: 1711


🏁 Script executed:

# Search for TrustTier definition and usage in e2ee
rg "trust_tier" crates/basilica-e2ee/src/ -A 2 -B 2

Repository: one-covenant/basilica

Length of output: 1412


Enforce trust tier validation before sealing confidential requests.

The descriptor.verify(DEFAULT_ATTESTATION_MAX_AGE_SECS) calls in both invoke_confidential_bytes (lines 275–278) and invoke_confidential_stream (lines 333–335) only validate attestation binding and age; they do not check the trust_tier field. Since E2eeInstanceDescriptor::verify() omits trust_tier validation, a lower-tier but otherwise valid descriptor is still accepted and used to seal secrets.

Thread a required/minimum TrustTier parameter into these invoke methods, retrieve it from the caller, and reject descriptor mismatches before constructing or sealing the confidential request envelope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-sdk/src/client.rs` around lines 275 - 278, Thread a required
minimum TrustTier parameter into invoke_confidential_bytes and
invoke_confidential_stream signatures, then after calling
self.get_confidential_instance(workload_id).await? and before calling
descriptor.verify(...) or constructing/sealing the confidential envelope,
compare the retrieved descriptor's trust_tier (e.g. descriptor.trust_tier or
accessor) against the required TrustTier and return an error (mapped via
Self::map_e2ee_error or a new appropriate error variant) if it is lower than
required; ensure this check runs before any sealing/usage of the descriptor so
E2eeInstanceDescriptor::verify(...) remains for attestation/age while the
explicit trust-tier validation enforces caller-provided minimums.

Comment on lines +323 to +326
/// Invoke a confidential workload in streaming mode and return decrypted SSE payload lines.
///
/// TODO: switch this to an async stream so clients do not need to buffer the full response.
pub async fn invoke_confidential_stream(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n crates/basilica-sdk/src/client.rs | sed -n '320,330p'

Repository: one-covenant/basilica

Length of output: 577


🏁 Script executed:

cat -n crates/basilica-sdk/src/client.rs | sed -n '375,425p'

Repository: one-covenant/basilica

Length of output: 2134


🏁 Script executed:

cat -n crates/basilica-sdk/src/client.rs | sed -n '323,380p'

Repository: one-covenant/basilica

Length of output: 2803


This "stream" API still buffers the entire SSE response.

response.text().await (line 379) waits for EOF before parsing, preventing clients from receiving decrypted chunks incrementally. The function returns Vec<String>, requiring all results to be buffered in memory before returning. For long-lived streams, this causes unbounded memory usage and client timeout risks. The codebase already recognizes this limitation in the TODO comment (line 325). This needs an actual incremental streaming API, not a function that collects all results into a Vec.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-sdk/src/client.rs` around lines 323 - 326, The current
invoke_confidential_stream function buffers the entire SSE via
response.text().await and returns Vec<String>, which prevents incremental
delivery; change it to an async streaming API (e.g., return an impl Stream<Item
= Result<String, Error>> or a tokio mpsc::Receiver) and parse the HTTP body
incrementally using response.bytes_stream() / chunked reads instead of
response.text().await, decode/decrypt each SSE payload as it arrives and yield
each decrypted line immediately; update invoke_confidential_stream signature and
its callers to consume the stream and keep the TODO comment's intent.

Comment on lines +795 to +797
/// Request Night Shade confidential compute for this deployment.
#[serde(default)]
pub confidential: bool,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the CreateDeploymentRequest struct around lines 795-800
sed -n '785,810p' crates/basilica-sdk/src/types.rs

Repository: one-covenant/basilica

Length of output: 952


🏁 Script executed:

# Check the broader context of CreateDeploymentRequest to see all serde attributes
sed -n '770,810p' crates/basilica-sdk/src/types.rs

Repository: one-covenant/basilica

Length of output: 1548


🏁 Script executed:

# Look at the test_create_deployment_with_resources test
rg -A 30 "test_create_deployment_with_resources" crates/basilica-sdk/src/client.rs

Repository: one-covenant/basilica

Length of output: 1220


🏁 Script executed:

# Check if trust_tier uses skip_serializing_if
rg -B 2 "trust_tier" crates/basilica-sdk/src/types.rs | head -20

Repository: one-covenant/basilica

Length of output: 606


🏁 Script executed:

# Search for other bool fields with serde attributes to understand the pattern
rg "pub.*bool" crates/basilica-sdk/src/types.rs -B 1 | head -40

Repository: one-covenant/basilica

Length of output: 993


Add skip_serializing_if to maintain backward-compatible serialization for confidential=false.

#[serde(default)] only affects deserialization. On serialization, this field is always emitted, so every non-confidential CreateDeploymentRequest now sends "confidential": false. This changes the wire contract and makes the exact body matcher in crates/basilica-sdk/src/client.rs::test_create_deployment_with_resources incomplete (the test is already including "suspended": false and "publicMetadata": false, confirming this pattern, but is missing "confidential": false).

To keep false values implicit, add:

Serde attribute fix
-    #[serde(default)]
+    #[serde(default, skip_serializing_if = "is_false")]
     pub confidential: bool,
fn is_false(value: &bool) -> bool {
    !*value
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/basilica-sdk/src/types.rs` around lines 795 - 797, The field
`confidential` in the `CreateDeploymentRequest` struct is currently annotated
only with `#[serde(default)]`, causing `"confidential": false` to be emitted on
serialization and breaking backward-compatible wire format; add a
`skip_serializing_if` attribute and helper `is_false` predicate so false values
are omitted on serialize (keep `#[serde(default)]` for deserialization), i.e.,
define a function `is_false(value: &bool) -> bool` and annotate the
`confidential` field with `#[serde(default, skip_serializing_if = "is_false")]`
to restore the previous implicit-false behavior used in tests like
`test_create_deployment_with_resources`.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant