feat: enhance subscription management with metrics and reconnection #621
base: master
Conversation
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
- Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
- Cargo.toml (3 hunks)
- magicblock-processor/src/executor/processing.rs (4 hunks)
- test-integration/Cargo.toml (2 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-processor/src/executor/processing.rs
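The capacity trade-off described in this learning can be illustrated with a minimal sketch. The `Contention` type alias and the `clear_empty_entries` helper are hypothetical stand-ins, not the coordinator's actual `account_contention` map or `clear_account_contention` method; the sketch only assumes standard `HashMap` behavior.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a contention map: account id -> queued transaction ids.
type Contention = HashMap<u64, Vec<u32>>;

/// Drops entries that have no queued transactions but leaves the allocation
/// alone, i.e. there is deliberately no `shrink_to_fit()` call afterwards.
fn clear_empty_entries(map: &mut Contention) {
    map.retain(|_, queued| !queued.is_empty());
}
```

Because the backing table is kept, the next burst of insertions can reuse it instead of growing the map again from scratch.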
📚 Learning: 2025-11-13T09:38:43.804Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/locks.rs:64-102
Timestamp: 2025-11-13T09:38:43.804Z
Learning: In magicblock-processor's TransactionScheduler (scheduler/mod.rs line 59), the executor count is clamped to MAX_SVM_EXECUTORS (63) at initialization time, and executor IDs are assigned sequentially from 0 to count-1. This architectural guarantee ensures that executor IDs used in the bitmask-based AccountLock (scheduler/locks.rs) will always be within valid bounds for bit shifting operations, making runtime bounds checks unnecessary.
Applied to files:
magicblock-processor/src/executor/processing.rs
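A rough sketch of why clamping the executor count makes the shift safe, assuming a `u64` bitmask. `AccountLockSketch` and both helper functions are illustrative names only and are not taken from `scheduler/locks.rs`.

```rust
/// Assumed to mirror the limit named in the learning above.
const MAX_SVM_EXECUTORS: u32 = 63;

/// Clamp the requested executor count once, at initialization time.
fn clamp_executor_count(requested: u32) -> u32 {
    requested.min(MAX_SVM_EXECUTORS)
}

/// Bit for one executor id. No bounds check is needed as long as ids are
/// assigned sequentially from 0 to count-1 and count <= 63.
fn executor_bit(id: u32) -> u64 {
    1u64 << id
}

/// Hypothetical bitmask lock: one bit per executor holding the account.
#[derive(Default)]
struct AccountLockSketch {
    holders: u64,
}

impl AccountLockSketch {
    fn acquire(&mut self, id: u32) {
        self.holders |= executor_bit(id);
    }

    fn release(&mut self, id: u32) {
        self.holders &= !executor_bit(id);
    }

    fn is_held_by(&self, id: u32) -> bool {
        self.holders & executor_bit(id) != 0
    }
}
```

The highest id that can ever be issued is 62, so `1u64 << id` never overflows.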
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-26T16:54:39.084Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: test-manual/Cargo.toml:0-0
Timestamp: 2025-10-26T16:54:39.084Z
Learning: In the magicblock-validator repository, use git branch references (not commit hashes or tags) for the helius-laserstream dependency to allow automatic updates when the branch is pushed to.
Applied to files:
Cargo.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_lint
- GitHub Check: run_make_ci_test
- GitHub Check: Build Project
🔇 Additional comments (6)
test-integration/Cargo.toml (2)
100-101: Approve: ureq and url dependencies for test metrics collection.
The addition of `ureq 2.9.6` and `url 2.5.0` as workspace dependencies is appropriate for the new metrics collection capability in test integration (as referenced in the PR summary for fetching monitored account counts). ureq 2.9.6 upgrades rustls to 0.22 and ring to 0.17, without introducing breaking changes. While ureq 2.9.6 is not the latest version (3.x or 2.12.x exist), it is suitable for test-only usage and maintains backward compatibility.
77-77: Use a stable released version or main branch of solana-account, not a feature branch.
The revision `8f7050a` is on the feature branch `bmuddha/fix/pre-cow-check` rather than a stable release or main branch. The commit was made today, allowing no time for testing or validation. Feature branch dependencies introduce significant risk and should be reserved for development only, not production or CI/CD test suites.
Replace this with either:
- A tagged release from the solana-account repository, or
- The `main` branch revision if the fix is already merged
Lines affected: test-integration/Cargo.toml 77, 109
Cargo.toml (3)
94-94: Confirm intent: log feature removal enables runtime logging in release builds.
Removing the `release_max_level_info` feature from the `log` dependency (line 94) changes compile-time behavior. Libraries should avoid using the max level features because they're global and can't be changed once they're set. This change allows all log levels (debug, trace) to be available at runtime in release builds, rather than being stripped at compile time.
Confirm:
- Is this intentional to support enhanced observability for the new reconnection and metrics features described in the PR?
- Are there any performance or binary-size implications in production that should be documented?
218-218: Confirm url dependency is shared across workspaces.
The `url = "2.5.0"` dependency is already declared in the main workspace dependencies (line 218), so test-integration will correctly reuse it. This is consistent with the dependency management pattern and avoids duplication.
154-154: Solana-account revision 8f7050a is consistently referenced across all manifests and exists in the repository.
Verification confirms that revision 8f7050a is uniformly specified in:
- Main Cargo.toml workspace dependencies (line 154)
- Main Cargo.toml patch.crates-io (line 230)
- test-integration/Cargo.toml (lines 77 and 109)
The revision exists in the solana-account repository on branch `bmuddha/fix/pre-cow-check` and is associated with PR #20. All references are consistent across manifests.
magicblock-processor/src/executor/processing.rs (1)
10-21: New imports align with downstream usage
Importing `ReadableAccount` for `lamports()` and `TransactionError` for the new gasless guard is consistent with later code; no issues here.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (2)
- Cargo.lock is excluded by !**/*.lock
- test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
- magicblock-chainlink/src/chainlink/mod.rs (6 hunks)
- magicblock-processor/Cargo.toml (1 hunks)
- magicblock-processor/src/executor/processing.rs (5 hunks)
- magicblock-processor/tests/fees.rs (2 hunks)
- test-kit/src/lib.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-processor/tests/fees.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
📚 Learning: 2025-11-13T09:38:43.804Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/locks.rs:64-102
Timestamp: 2025-11-13T09:38:43.804Z
Learning: In magicblock-processor's TransactionScheduler (scheduler/mod.rs line 59), the executor count is clamped to MAX_SVM_EXECUTORS (63) at initialization time, and executor IDs are assigned sequentially from 0 to count-1. This architectural guarantee ensures that executor IDs used in the bitmask-based AccountLock (scheduler/locks.rs) will always be within valid bounds for bit shifting operations, making runtime bounds checks unnecessary.
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-28T13:15:42.706Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 596
File: magicblock-processor/src/scheduler.rs:1-1
Timestamp: 2025-10-28T13:15:42.706Z
Learning: In magicblock-processor, transaction indexes were always set to 0 even before the changes in PR #596. The proper transaction indexing within slots will be addressed during the planned ledger rewrite.
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-21T10:34:59.140Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-accounts-db/src/lib.rs:63-72
Timestamp: 2025-10-21T10:34:59.140Z
Learning: In magicblock-validator, the AccountsDb "stop-the-world" synchronizer is managed at the processor/executor level, not at the AccountsDb API level. Transaction executors in magicblock-processor hold a read lock (sync.read()) for the duration of each slot and release it only at slot boundaries, ensuring all account writes happen under the read lock. Snapshot operations acquire a write lock, blocking until all executors release their read locks. This pattern ensures mutual exclusion between writes and snapshots without requiring read guards in AccountsDb write APIs.
Applied to files:
magicblock-processor/src/executor/processing.rs
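The read-lock-per-slot pattern from this learning can be sketched with the standard library alone. `World`, `execute_slot`, and `snapshot` are hypothetical names chosen for illustration; the real synchronizer and AccountsDb types are not shown in this review.

```rust
use std::sync::{Mutex, RwLock};

/// Hypothetical stand-in for validator state; not the real AccountsDb API.
struct World {
    /// The "stop-the-world" synchronizer: executors read-lock, snapshots write-lock.
    sync: RwLock<()>,
    /// Stand-in for account storage.
    accounts: Mutex<Vec<u64>>,
}

impl World {
    fn new() -> Self {
        World {
            sync: RwLock::new(()),
            accounts: Mutex::new(Vec::new()),
        }
    }

    /// Executor side: hold the read lock for the whole slot and write under it.
    fn execute_slot(&self, writes: &[u64]) {
        let _slot_guard = self.sync.read().unwrap();
        for w in writes {
            self.accounts.lock().unwrap().push(*w);
        }
        // `_slot_guard` is dropped here, which models the slot boundary.
    }

    /// Snapshot side: the write lock cannot be taken until every executor
    /// has released its read lock.
    fn snapshot(&self) -> Vec<u64> {
        let _stop_the_world = self.sync.write().unwrap();
        self.accounts.lock().unwrap().clone()
    }
}
```

Because the write APIs never take the read guard themselves, correctness depends on every executor following the protocol, which is the design choice the learning records.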
📚 Learning: 2025-11-07T13:09:52.253Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: test-kit/src/lib.rs:275-0
Timestamp: 2025-11-07T13:09:52.253Z
Learning: In test-kit, the transaction scheduler in ExecutionTestEnv is not expected to shut down during tests. Therefore, using `.unwrap()` in test helper methods like `schedule_transaction` is acceptable and will not cause issues in the test environment.
Applied to files:
test-kit/src/lib.rs
magicblock-processor/tests/fees.rs
🧬 Code graph analysis (2)
magicblock-chainlink/src/chainlink/mod.rs (5)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
new(134-158)
magicblock-chainlink/src/chainlink/blacklisted_accounts.rs (1)
blacklisted_accounts(6-30)
programs/magicblock/src/mutate_accounts/account_mod_data.rs (1)
id(128-135)
magicblock-processor/tests/fees.rs (1)
ephemeral_balance_pda_from_payer(20-26)
magicblock-metrics/src/metrics/mod.rs (1)
inc_undelegation_requested(454-456)
magicblock-processor/tests/fees.rs (1)
test-kit/src/lib.rs (3)
new_with_fee(98-146)
new(80-82)
new_with_payer_and_fees(84-88)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_lint
- GitHub Check: Build Project
- GitHub Check: run_make_ci_test
🔇 Additional comments (10)
magicblock-processor/Cargo.toml (1)
44-44: Dependency is justified and actively used in tests.
The verification confirms `solana-keypair` is imported and actively used in `magicblock-processor/tests/fees.rs`. The `Keypair::new()` calls at lines 329 and 368 demonstrate legitimate usage for creating test keypairs, confirming this is not a leftover or unnecessary dependency.
magicblock-processor/tests/fees.rs (3)
4-4: LGTM: Imports are necessary for the new tests.
Both imports are used in the new test functions below.
Also applies to: 6-6
316-361: LGTM: Test logic is correct.
This test properly verifies that gasless mode can handle transactions with non-existent accounts in the instruction's account metas. The test structure follows the established pattern and correctly verifies both success and zero fee charging.
366-408: LGTM: Test correctly verifies gasless mode with non-existent fee payer.
This test provides valuable edge case coverage by verifying that gasless mode can handle transactions where the fee payer account doesn't exist in the accounts database. The use of `unwrap_or_default()` correctly handles the missing account case, and the test structure follows established patterns.
magicblock-chainlink/src/chainlink/mod.rs (6)
1-4: LGTM: Import consolidation.
The consolidated import of atomic primitives and Arc is clean and appropriate for the new counter functionality.
143-147: LGTM: Atomic counters for account classification.
The AtomicU64 counters appropriately track different account categories during the removal operation, providing detailed instrumentation for the reset operation.
149-173: LGTM: Account classification logic is correct.
The removal logic properly categorizes accounts into mutually exclusive groups with appropriate atomic counter increments. The special handling for feature-owned accounts (excluding them from the empty count) appears intentional for system account considerations.
325-340: LGTM: Enhanced trace logging with mark_empty_if_not_found details.
The trace logging correctly includes the `mark_empty_if_not_found` parameter details, improving observability. The log level guard and macro are properly aligned.
367-380: LGTM: Undelegation tracking with metrics and proper logging.
The undelegation flow correctly increments metrics at the request point and uses appropriate debug-level logging for operational visibility. The subscription setup and success confirmation are well-structured.
250-259: No issues found: function signature verification confirms correct usage.
The verification confirms that `dlp::pda::ephemeral_balance_pda_from_payer` takes two parameters (payer and an index), and line 255 correctly calls it with `(feepayer, 0)`. Multiple usages across the codebase confirm this pattern. The fee payer handling logic for gasless transactions is correct.
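For readers without the diff at hand, the counter-per-category pattern praised in the 143-173 comments might look roughly like the sketch below. The categories, the `Account` fields, and the rule for which accounts are kept are assumptions made for illustration and do not reproduce the PR's actual classification.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical account view; the real code works on bank accounts.
#[derive(Clone, Copy)]
struct Account {
    lamports: u64,
    delegated: bool,
    blacklisted: bool,
}

/// One counter per mutually exclusive category (assumed categories).
#[derive(Default)]
struct RemovalStats {
    blacklisted: AtomicU64,
    delegated: AtomicU64,
    empty: AtomicU64,
    other: AtomicU64,
}

/// Bumps exactly one counter and reports whether the account should be removed.
fn should_remove(acc: &Account, stats: &RemovalStats) -> bool {
    if acc.blacklisted {
        stats.blacklisted.fetch_add(1, Ordering::Relaxed);
        false
    } else if acc.delegated {
        stats.delegated.fetch_add(1, Ordering::Relaxed);
        false
    } else if acc.lamports == 0 {
        stats.empty.fetch_add(1, Ordering::Relaxed);
        true
    } else {
        stats.other.fetch_add(1, Ordering::Relaxed);
        true
    }
}

/// Removes accounts in place while the shared counters record what was seen.
fn reset_accounts(accounts: &mut Vec<Account>, stats: &RemovalStats) {
    accounts.retain(|a| !should_remove(a, stats));
}
```

The `if`/`else if` chain is what keeps the groups mutually exclusive: each account increments one counter and no more.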
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs(4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
🧬 Code graph analysis (1)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (2)
magicblock-metrics/src/metrics/mod.rs (1)
inc_evicted_accounts_count(403-405)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
new(275-330)
🔇 Additional comments (2)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (2)
5-5: LGTM: Metric tracking on eviction.
The eviction counter provides useful observability for subscription management. The metric increment is correctly placed after the lock is released, avoiding any performance impact.
Also applies to: 83-83
119-137: LGTM: Cache state accessors for metrics.
These three methods (`len()`, `never_evicted_accounts()`, `pubkeys()`) correctly expose cache state for subscription metrics. The `pubkeys()` method holds the lock while iterating, which is acceptable since it's only called periodically by the metrics updater (when `enable_subscription_metrics` is true), not in hot paths.
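The two points above (increment the metric only after the lock is released, and copy keys out under the lock for periodic reporting) can be sketched as follows. `TinyCache` is a simplified, hypothetical LRU over `u64` keys; it is not the `lru_cache.rs` implementation, and the `evicted` field merely stands in for `inc_evicted_accounts_count`.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Simplified, hypothetical LRU over u64 keys (least recently used at the front).
struct TinyCache {
    capacity: usize,
    order: Mutex<VecDeque<u64>>,
    /// Stand-in for the eviction metric.
    evicted: AtomicU64,
}

impl TinyCache {
    fn new(capacity: usize) -> Self {
        TinyCache {
            capacity,
            order: Mutex::new(VecDeque::new()),
            evicted: AtomicU64::new(0),
        }
    }

    /// Adds or promotes `key`; returns the evicted key, if any.
    fn add(&self, key: u64) -> Option<u64> {
        let evicted = {
            let mut order = self.order.lock().unwrap();
            if let Some(pos) = order.iter().position(|k| *k == key) {
                // Already cached: promote to most recently used.
                order.remove(pos);
                order.push_back(key);
                return None;
            }
            order.push_back(key);
            if order.len() > self.capacity {
                order.pop_front()
            } else {
                None
            }
        }; // the lock is released here

        // Metric update happens outside the critical section.
        if evicted.is_some() {
            self.evicted.fetch_add(1, Ordering::Relaxed);
        }
        evicted
    }

    fn len(&self) -> usize {
        self.order.lock().unwrap().len()
    }

    /// Holds the lock while copying keys out; fine for periodic metrics.
    fn pubkeys(&self) -> Vec<u64> {
        self.order.lock().unwrap().iter().copied().collect()
    }
}
```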
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
961-1015: Waiters may return Ok even when the actual fetch failed (error is dropped).
If `fetch_new.is_empty()`, this call only waits on `await_pending`, sets `result` to Ok(empty), and returns it; any error from the in-flight fetch isn’t propagated to waiters.
Minimal, safe fix: after waiting, recompute the result against the full filtered set (bank hits resolve locally; no duplicate network fetches), so waiters see the same outcome.
```diff
-        let pubkeys = pubkeys
-            .iter()
-            .filter(|p| !self.blacklisted_accounts.contains(p))
-            .collect::<Vec<_>>();
+        // Work with owned keys to allow recomputation for waiters
+        let filtered_pubkeys: Vec<Pubkey> = pubkeys
+            .iter()
+            .filter(|p| !self.blacklisted_accounts.contains(p))
+            .copied()
+            .collect();

         let mut await_pending = vec![];
         let mut fetch_new = vec![];
-        {
+        {
             let mut pending = self
                 .pending_requests
                 .lock()
                 .expect("pending_requests lock poisoned");

-            for pubkey in pubkeys {
+            for pubkey in &filtered_pubkeys {
                 // Check synchronously if account is in bank and subscribed when it should be
                 if let Some(account_in_bank) = self.accounts_bank.get_account(pubkey) {
@@
-                if let Some(requests) = pending.get_mut(pubkey) {
+                if let Some(requests) = pending.get_mut(pubkey) {
                     let (sender, receiver) = oneshot::channel();
                     requests.push(sender);
-                    await_pending.push((*pubkey, receiver));
+                    await_pending.push((*pubkey, receiver));
                     continue;
                 }
@@
-                fetch_new.push(*pubkey);
+                fetch_new.push(*pubkey);
             }
         }

         // If we have accounts to fetch, delegate to the existing implementation
         // but notify all pending requests when done
-        let result = if !fetch_new.is_empty() {
-            self.fetch_and_clone_accounts(
-                &fetch_new,
-                mark_empty_if_not_found,
-                slot,
-            )
-            .await
-        } else {
-            Ok(FetchAndCloneResult {
-                not_found_on_chain: vec![],
-                missing_delegation_record: vec![],
-            })
-        };
+        let result = if !fetch_new.is_empty() {
+            self.fetch_and_clone_accounts(&fetch_new, mark_empty_if_not_found, slot).await
+        } else {
+            // Another task fetched these accounts; recompute outcome cheaply against the full set
+            self.fetch_and_clone_accounts(&filtered_pubkeys, mark_empty_if_not_found, slot).await
+        };
@@
-        for (pubkey, receiver) in await_pending {
+        for (pubkey, receiver) in await_pending {
             joinset.spawn(async move {
                 if let Err(err) = receiver
                     .await
                     .inspect_err(|err| {
                         warn!("FetchCloner::clone_accounts - RecvError occurred while awaiting account {}: {err:?}. 
This indicates the account fetch sender was dropped without sending a value.", pubkey);
                     })
                 {
                     // The sender was dropped, likely due to an error in the other request
                     error!(
                         "Failed to receive account from pending request: {err}"
                     );
                 }
             });
         }
```
Follow-up (optional, future-proof): carry a small status in the oneshot (e.g., `Result<(), ()>`) so waiters can short-circuit on error without recomputation.
Also applies to: 1023-1037, 1062-1076
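The optional follow-up (carrying a status through the channel) could be sketched as below. This uses `std::sync::mpsc` in place of the async oneshot channel so that it runs without extra crates, and it carries a `String` error rather than `()` purely for readability; `FetchStatus`, `notify_waiters`, and `await_pending` are hypothetical names, not functions from `fetch_cloner.rs`.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Status sent to each waiter; the review suggests `Result<(), ()>`,
/// a String error is used here only to make the sketch easier to read.
type FetchStatus = Result<(), String>;

/// Fetcher side: tell every waiter how the in-flight fetch ended.
fn notify_waiters(waiters: Vec<Sender<FetchStatus>>, outcome: &FetchStatus) {
    for waiter in waiters {
        // A waiter that has gone away is not an error for the fetcher.
        let _ = waiter.send(outcome.clone());
    }
}

/// Waiter side: surface the fetch error instead of assuming success.
fn await_pending(receiver: Receiver<FetchStatus>) -> FetchStatus {
    match receiver.recv() {
        Ok(status) => status,
        // The sender was dropped without sending a value.
        Err(_) => Err("sender dropped without sending a value".to_string()),
    }
}
```

With this shape a waiter can return early on `Err` and skip the recomputation that the minimal fix relies on.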
♻️ Duplicate comments (2)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (2)
984-1003: Redundant else-if: condition is tautological.
Once the OR chain didn’t continue, `!self.is_watching(pubkey)` is guaranteed; simplify to `else` with the debug log.
```diff
-                } else if !self.is_watching(pubkey) {
+                } else {
                     debug!("Account {pubkey} should be watched but wasn't");
                 }
```
219-237: Tighten undelegation-completed detection to avoid false positives.
Also require the update to be non-delegated before incrementing, for correctness under edge cases where ownership flips but delegation flag wasn’t cleared yet.
Apply:
```diff
-    if let Some(in_bank) =
-        self.accounts_bank.get_account(&pubkey)
-    {
-        if in_bank.delegated()
-            && in_bank.owner().eq(&dlp::id())
-            && !account.owner().eq(&dlp::id())
-        {
+    if let Some(in_bank) = self.accounts_bank.get_account(&pubkey) {
+        if in_bank.delegated()
+            && in_bank.owner().eq(&dlp::id())
+            && !account.delegated()
+            && !account.owner().eq(&dlp::id())
+        {
             debug!(
                 "Undelegation completed for account: {pubkey}"
             );
             magicblock_metrics::metrics::inc_undelegation_completed();
         }
     }
```
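The tightened condition can be isolated as a pure predicate, which makes the edge case easy to test. `ProgramId`, `DLP_ID`, and `AccountView` are hypothetical stand-ins for `Pubkey`, `dlp::id()`, and the account types used in the diff.

```rust
/// Hypothetical stand-in for a program id.
#[derive(Clone, Copy, PartialEq, Eq)]
struct ProgramId(u8);

/// Hypothetical stand-in for `dlp::id()`.
const DLP_ID: ProgramId = ProgramId(1);

/// Only the two fields the check looks at.
struct AccountView {
    owner: ProgramId,
    delegated: bool,
}

/// True only when the bank copy was delegated and owned by the delegation
/// program, and the incoming update is both non-delegated and owned elsewhere.
fn undelegation_completed(in_bank: &AccountView, update: &AccountView) -> bool {
    in_bank.delegated
        && in_bank.owner == DLP_ID
        && !update.delegated
        && update.owner != DLP_ID
}
```

An update whose owner has already flipped but whose delegation flag is still set returns `false` here, which is the false positive the comment wants to avoid.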
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (9)
- magicblock-aperture/src/tests.rs (1 hunks)
- magicblock-aperture/tests/setup.rs (1 hunks)
- magicblock-api/src/magic_validator.rs (1 hunks)
- magicblock-chainlink/src/chainlink/fetch_cloner.rs (7 hunks)
- magicblock-chainlink/src/chainlink/mod.rs (11 hunks)
- magicblock-chainlink/tests/utils/test_context.rs (1 hunks)
- test-integration/test-chainlink/src/ixtest_context.rs (1 hunks)
- test-integration/test-chainlink/src/test_context.rs (2 hunks)
- test-integration/test-config/tests/auto_airdrop_feepayer.rs (0 hunks)
💤 Files with no reviewable changes (1)
- test-integration/test-config/tests/auto_airdrop_feepayer.rs
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.
Applied to files:
test-integration/test-chainlink/src/ixtest_context.rs
magicblock-aperture/src/tests.rs
magicblock-aperture/tests/setup.rs
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
test-integration/test-chainlink/src/test_context.rs
magicblock-chainlink/src/chainlink/mod.rs
magicblock-chainlink/src/chainlink/fetch_cloner.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
test-integration/test-chainlink/src/test_context.rs
magicblock-chainlink/src/chainlink/mod.rs
magicblock-chainlink/src/chainlink/fetch_cloner.rs
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.
Applied to files:
magicblock-chainlink/src/chainlink/fetch_cloner.rs
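The `split_once("?api-key=")` approach from this learning, sketched as a free function. The function name and the `(String, Option<String>)` return type are assumptions; the actual `Endpoint::separate_pubsub_url_and_api_key()` signature is not shown in this review, and the URLs in the usage note are placeholders.

```rust
/// Splits a pubsub URL into its base and an optional API key, assuming
/// `api-key` is the only query parameter and directly follows `?`.
fn split_url_and_api_key(url: &str) -> (String, Option<String>) {
    match url.split_once("?api-key=") {
        Some((base, key)) => (base.to_string(), Some(key.to_string())),
        None => (url.to_string(), None),
    }
}
```

For example, `split_url_and_api_key("wss://rpc.example.com?api-key=abc")` yields the base `wss://rpc.example.com` and the key `abc`, while a URL without the parameter is returned unchanged with `None`.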
🧬 Code graph analysis (3)
test-integration/test-chainlink/src/test_context.rs (2)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
try_new_with_metrics(27-42)
lifecycle_mode(51-53)
magicblock-chainlink/src/remote_account_provider/mod.rs (3)
try_from_clients_and_mode(179-198)
rpc_client(1060-1062)
rpc_client(1072-1074)
magicblock-chainlink/src/chainlink/mod.rs (4)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
new(135-159)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
new(275-330)
magicblock-chainlink/src/chainlink/blacklisted_accounts.rs (1)
blacklisted_accounts(6-30)
magicblock-metrics/src/metrics/mod.rs (1)
inc_undelegation_requested(454-456)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (2)
magicblock-metrics/src/metrics/mod.rs (1)
inc_undelegation_completed(458-460)
magicblock-chainlink/src/remote_account_provider/config.rs (1)
try_new_with_metrics(27-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_lint
- GitHub Check: run_make_ci_test
- GitHub Check: Build Project
🔇 Additional comments (14)
magicblock-aperture/tests/setup.rs (1)
65-66: Constructor arity update looks correct.
Passing 0 for the new auto_airdrop_lamports keeps tests deterministic. No issues.
magicblock-chainlink/tests/utils/test_context.rs (2)
70-76: Good: explicit RemoteAccountProviderConfig with metrics flag.
Capacity > 0 and metrics disabled are appropriate for tests. LGTM.
104-110: Chainlink::try_new arity change applied correctly.
Zero airdrop in tests is appropriate. LGTM.
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
1566-1573: Tests: config via try_new_with_metrics is good.
Explicit capacity and metrics flag improve clarity. LGTM.
magicblock-chainlink/src/chainlink/mod.rs (5)
64-70: Constructor arity and field wiring: LGTM.
New auto_airdrop_lamports is cleanly plumbed and stored.
Also applies to: 87-88
91-101: Endpoints constructor update: LGTM.
Forwarding auto_airdrop_lamports keeps both constructors consistent.
Also applies to: 134-141
367-375: Trace includes mark_empty set: LGTM.
The guard avoids string building at lower levels. Good hygiene.
403-406: Undelegation instrumentation/log levels: LGTM.
Switch to debug and metric increment is appropriate.
Also applies to: 415-416
258-267: Gate auto-airdrop on lifecycle mode to prevent unintended funding in non-ephemeral clusters. The concern is valid: auto-airdrop with only an `auto_airdrop_lamports > 0` guard risks funding feepayers in Offline or Replica modes. However, there are two issues to resolve:
Location claim is incorrect: The suggested fix applies to ONE location (lines 275–295), not "268–273, 274–299". Lines 268–273 handle empty-if-not-found marking for the fee payer itself, which is separate.
Lifecycle accessibility is unclear: The current code doesn't expose `LifecycleMode` from the `fetch_cloner`. You'll need to verify:
- Can `RemoteAccountProvider` expose its `lifecycle_mode` through a public getter?
- Or should `LifecycleMode` be stored directly in the `Chainlink` struct?
- Or embedded in `ChainlinkConfig` and passed during construction?
Confirm the implementation path, then apply the guard at the single airdrop location (lines 275–295): add `&& matches!(lifecycle_from_fetch_cloner(), LifecycleMode::Ephemeral)` (or equivalent per your architecture decision).
magicblock-aperture/src/tests.rs (1)
45-46: Constructor arity update looks correct. Tests keep airdrop disabled. LGTM.
test-integration/test-chainlink/src/ixtest_context.rs (1)
139-145: Constructor update: LGTM. Passing 0 keeps behavior unchanged in these tests.
test-integration/test-chainlink/src/test_context.rs (2)
70-76: Explicit RAP config with metrics flag: LGTM. Clearer than default path; capacity is sane; metrics disabled for tests.
108-113: Chainlink::try_new arity aligned. Zero airdrop keeps test behavior stable. LGTM.
magicblock-api/src/magic_validator.rs (1)
418-426: Propagating auto_airdrop_lamports: verified safe. Verification confirms defaults are correctly 0 for non-Ephemeral deployments. The field has `#[serde(default)]` defaulting to 0, production configurations explicitly set it to 0, and the chainlink code guards activation with `if self.auto_airdrop_lamports > 0`. No risk of unintended airdrop in persistent deployments. Wiring at line 425 is correct.
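The activation guard mentioned above, combined with the lifecycle gating raised earlier in this review, could look roughly like the sketch below. It is an illustration only, not the validator's code: the `LifecycleMode` enum here models just the three modes named in the review, and `should_auto_airdrop` with its parameters is a hypothetical helper.

```rust
/// Hypothetical stand-in for the validator's lifecycle mode; only the
/// modes named in the review are modeled here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LifecycleMode {
    Ephemeral,
    Offline,
    Replica,
}

/// Hypothetical helper: decides whether a best-effort airdrop should run.
/// All three conditions must hold:
/// - the feature is enabled (a configured amount of 0 disables it),
/// - the validator runs in ephemeral mode,
/// - the local fee-payer balance is still zero.
fn should_auto_airdrop(
    auto_airdrop_lamports: u64,
    mode: LifecycleMode,
    local_fee_payer_lamports: u64,
) -> bool {
    auto_airdrop_lamports > 0
        && matches!(mode, LifecycleMode::Ephemeral)
        && local_fee_payer_lamports == 0
}
```

With a default of 0 for the configured amount, the first condition alone keeps the airdrop off; the mode check is a second line of defense in case a non-ephemeral deployment sets a non-zero value by mistake.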
Actionable comments posted: 5
♻️ Duplicate comments (2)
magicblock-metrics/src/metrics/mod.rs (1)
178-209: Past review comment remains unaddressed. A previous review comment suggested adding clarifying documentation for the `ACCOUNT_FETCHES_*` metrics to explain the relationship between SUCCESS/FAILED and FOUND/NOT_FOUND dimensions. This would help metric consumers understand that a successful RPC call can increment both SUCCESS and NOT_FOUND.
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
737-747: Consider restoring the evicted entry when unsubscribe fails. The rollback correctly removes the newly added `pubkey` from the LRU cache when `unsubscribe(evicted)` fails. However, the evicted entry is lost from the cache even though we didn't successfully unsubscribe from it.
As suggested in a past review comment, consider re-adding the evicted entry to restore the previous LRU state:
     if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
         warn!(
             "Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}");
         // Rollback the LRU add since eviction failed
         self.lrucache_subscribed_accounts.remove(pubkey);
    +    self.lrucache_subscribed_accounts.add(evicted);
         return Err(err);
     }
Because we just removed `pubkey`, capacity is available, so re-adding `evicted` won't cause another eviction and restores the cache contents.
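The rollback suggested above can be exercised in isolation with a toy cache. The sketch below is a simplified, synchronous model: `ToyLru`, `register`, the integer keys, and the `unsubscribe` closure are all hypothetical stand-ins, not the API of the real LRU cache or pubsub client.

```rust
use std::collections::VecDeque;

/// Toy LRU of keys, most recently used at the back. A stand-in for the
/// real cache, whose API is not shown in this review.
struct ToyLru {
    capacity: usize,
    keys: VecDeque<u32>,
}

impl ToyLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, keys: VecDeque::new() }
    }

    /// Adds `key` as most recently used and returns the evicted key, if any.
    fn add(&mut self, key: u32) -> Option<u32> {
        self.keys.retain(|k| *k != key);
        self.keys.push_back(key);
        if self.keys.len() > self.capacity {
            self.keys.pop_front()
        } else {
            None
        }
    }

    fn remove(&mut self, key: u32) {
        self.keys.retain(|k| *k != key);
    }

    fn contains(&self, key: u32) -> bool {
        self.keys.contains(&key)
    }
}

/// Registers `key`. If unsubscribing the evicted key fails, both LRU
/// changes are undone so the cache matches the still-live subscriptions.
fn register(
    lru: &mut ToyLru,
    key: u32,
    unsubscribe: impl Fn(u32) -> Result<(), String>,
) -> Result<(), String> {
    if let Some(evicted) = lru.add(key) {
        if let Err(err) = unsubscribe(evicted) {
            lru.remove(key); // roll back the new key
            lru.add(evicted); // a slot is free again, so nothing is evicted
            return Err(err);
        }
    }
    Ok(())
}
```

Note that in this toy model the restored key comes back as the most recently used entry, so the cache contents are restored but not the original recency order.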
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (2)
Cargo.lock is excluded by !**/*.lock
test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
magicblock-chainlink/src/remote_account_provider/mod.rs (28 hunks)
magicblock-committor-service/src/intent_executor/task_info_fetcher.rs (2 hunks)
magicblock-metrics/src/metrics/mod.rs (9 hunks)
magicblock-table-mania/Cargo.toml (1 hunks)
magicblock-table-mania/src/lookup_table_rc.rs (2 hunks)
magicblock-table-mania/src/manager.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
magicblock-table-mania/src/lookup_table_rc.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-chainlink/src/remote_account_provider/mod.rs
magicblock-metrics/src/metrics/mod.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.
Applied to files:
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/remote_account_provider/mod.rs
🧬 Code graph analysis (5)
magicblock-table-mania/src/manager.rs (1)
magicblock-metrics/src/metrics/mod.rs (1)
inc_table_mania_a_count(491-493)
magicblock-committor-service/src/intent_executor/task_info_fetcher.rs (1)
magicblock-metrics/src/metrics/mod.rs (1)
inc_task_info_fetcher_a_count(487-489)
magicblock-table-mania/src/lookup_table_rc.rs (1)
magicblock-metrics/src/metrics/mod.rs (1)
inc_table_mania_cloase_a_count(495-497)
magicblock-chainlink/src/remote_account_provider/mod.rs (6)
magicblock-metrics/src/metrics/mod.rs (6)
inc_account_fetches_failed(463-465)
inc_account_fetches_found(467-469)
inc_account_fetches_not_found(471-473)
inc_account_fetches_success(459-461)
set_monitored_accounts_count(421-423)
inc_remote_account_provider_a_count(483-485)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4)
new(26-35)
pubkeys(39-42)
pubkeys(131-137)
len(119-125)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (3)
new(46-54)
new(311-322)
try_new_from_url(165-180)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (1)
new(126-158)
magicblock-chainlink/src/submux/mod.rs (1)
new(147-158)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
default(65-72)
try_new_with_metrics(27-42)
magicblock-metrics/src/metrics/mod.rs (4)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
new(278-333)
magicblock-committor-service/src/intent_executor/task_info_fetcher.rs (1)
new(52-60)
magicblock-table-mania/src/lookup_table_rc.rs (1)
new(48-55)
magicblock-table-mania/src/manager.rs (1)
new(64-88)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build Project
- GitHub Check: run_make_ci_lint
🔇 Additional comments (10)
magicblock-table-mania/Cargo.toml (1)
17-17: LGTM! The addition of the `magicblock-metrics` workspace dependency enables metrics collection for table-mania components, aligning with the PR's observability objectives.
magicblock-committor-service/src/intent_executor/task_info_fetcher.rs (1)
11-11: LGTM! The metrics instrumentation is correctly placed before the `get_multiple_accounts` RPC call, providing visibility into fetch operations.
Also applies to: 120-120
magicblock-table-mania/src/manager.rs (1)
11-11: LGTM! The metrics instrumentation correctly tracks `get_multiple_accounts_with_commitment` calls in the remote table fetch loop, providing visibility into table lookup operations.
Also applies to: 530-530
magicblock-chainlink/src/remote_account_provider/mod.rs (7)
1-77: LGTM: Imports and type definitions are well-structured. The addition of `hash_map::Entry` for the Entry API, metrics imports for observability, and the `FetchResult` type alias for error propagation align well with the PR's objectives.
203-273: LGTM: Background metrics updater provides valuable reconciliation. The periodic task correctly tracks subscription count discrepancies between the LRU cache and pubsub client, with detailed debug logging for troubleshooting.
A past review comment suggested adding a `CancellationToken` for graceful shutdown (currently the task is aborted on drop). This remains an optional future enhancement if clean shutdown becomes a requirement.
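The reconciliation step of such an updater boils down to a set difference between the two views. The following is a minimal sketch under stated assumptions: integer ids stand in for pubkeys, and `SubscriptionDiff` and `diff_subscriptions` are hypothetical names; the periodic task, logging, and handling of never-evicted accounts are left out.

```rust
use std::collections::HashSet;

/// Keys present in one view but missing from the other.
#[derive(Debug, PartialEq)]
struct SubscriptionDiff {
    only_in_lru: Vec<u32>,
    only_in_pubsub: Vec<u32>,
}

/// Compares the LRU cache's view with the pubsub client's view.
/// The results are sorted so that log output is stable between runs.
fn diff_subscriptions(lru: &HashSet<u32>, pubsub: &HashSet<u32>) -> SubscriptionDiff {
    let mut only_in_lru: Vec<u32> = lru.difference(pubsub).copied().collect();
    let mut only_in_pubsub: Vec<u32> = pubsub.difference(lru).copied().collect();
    only_in_lru.sort_unstable();
    only_in_pubsub.sort_unstable();
    SubscriptionDiff { only_in_lru, only_in_pubsub }
}
```

Both lists being empty means the two views agree; a non-empty `only_in_pubsub` would point at subscriptions that outlived their cache entry.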
637-644: Excellent fix: Entry API prevents dropping concurrent fetch waiters. This correctly addresses the previous critical issue where `insert` was replacing in-flight entries and dropping earlier waiters. Using the Entry API ensures all concurrent `try_get_multi` callers waiting for the same account will receive the fetch result.
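The difference between `insert` and the Entry API is easiest to see in a reduced model. The sketch below is not the provider's code: it uses `std::sync::mpsc` channels in place of async oneshot channels to stay dependency-free, integer keys in place of pubkeys, and the hypothetical helpers `add_waiter` and `resolve`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Stand-in for the fetch result; the real type carries account data.
type FetchResult = Result<u64, String>;

/// Waiters keyed by account id. Integer keys replace pubkeys here.
type Waiters = HashMap<u32, Vec<Sender<FetchResult>>>;

/// Registers a waiter for `key`. Returns the receiver and whether the
/// caller is the first waiter (and therefore should start the fetch).
fn add_waiter(waiters: &mut Waiters, key: u32) -> (Receiver<FetchResult>, bool) {
    let (tx, rx) = channel();
    let is_first = match waiters.entry(key) {
        // A fetch is already in flight: append instead of replacing.
        Entry::Occupied(mut entry) => {
            entry.get_mut().push(tx);
            false
        }
        Entry::Vacant(entry) => {
            entry.insert(vec![tx]);
            true
        }
    };
    (rx, is_first)
}

/// Resolves every waiter registered for `key` with a clone of `result`.
fn resolve(waiters: &mut Waiters, key: u32, result: FetchResult) {
    if let Some(senders) = waiters.remove(&key) {
        for sender in senders {
            // A dropped receiver is not an error for the remaining waiters.
            let _ = sender.send(result.clone());
        }
    }
}
```

Had `add_waiter` used `waiters.insert(key, vec![tx])`, the second call for the same key would drop the first sender, and the first receiver would only ever observe a disconnect.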
664-682: Nested Result handling is correct; consider formatting the long warning. The logic correctly handles both the receiver error (`RecvError`) and the inner fetch error (`RemoteAccountProviderError`).
As noted in a past review comment, the warning at lines 674-675 is very long. Consider splitting it across multiple lines for readability (optional improvement).
852-1020: LGTM: Metrics integration and error handling are well-implemented. The `notify_error` helper and retry macro provide clean error handling. Metrics are correctly updated for both success and failure paths:
- `inc_account_fetches_success`, `inc_account_fetches_found`, `inc_account_fetches_not_found` on success
- `inc_account_fetches_failed` on error
The handling of both `JSON_RPC_SERVER_ERROR_MIN_CONTEXT_SLOT_NOT_REACHED` and `HELIUS_CONTEXT_SLOT_NOT_REACHED` error codes is appropriate.
804-827: LGTM: Unsubscribe now maintains LRU cache and pubsub client consistency. The function correctly removes from the LRU cache only after successful pubsub unsubscribe. If the pubsub unsubscribe fails, the LRU entry is preserved, ensuring the two stay in sync.
1157-1497: LGTM: Test configurations correctly disable metrics. All tests properly use `try_new_with_metrics` with `enable_subscription_metrics: false` to avoid spawning background metric updater tasks during testing.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
839-872: Fetch retry/error handling and metrics updates are coherent. The refactored `fetch` helper now: (1) centralizes failure fan‑out via `notify_error`, (2) uses a bounded retry loop with `MAX_RETRIES`, (3) explicitly handles both standard and HELIUS “min context slot not reached” error codes, and (4) updates `inc_account_fetches_success/failed/found/not_found` consistently. The use of `fetching_accounts.remove` in both success and failure paths ensures all waiters are always resolved or cleaned up, avoiding hangs. One minor consideration: `notify_error` logs while holding the `fetching_accounts` lock; if logs ever become very heavy here, it might be worth logging before/after the lock, but given this is an error path it’s acceptable as-is.
Also applies to: 873-891, 892-977, 979-1022, 1034-1058
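The retry shape described in this comment can be sketched without any RPC machinery. Everything below is illustrative: the numeric error codes are placeholders rather than the constants used by the validator or any provider, the retry limit of 3 is arbitrary, and `FetchError`, `is_retriable`, and `fetch_with_retry` are hypothetical names.

```rust
/// Placeholder error codes; NOT the real JSON-RPC or provider values.
const MIN_CONTEXT_SLOT_NOT_REACHED: i64 = -1;
const PROVIDER_CONTEXT_SLOT_NOT_REACHED: i64 = -2;
/// Arbitrary limit chosen for the example.
const MAX_RETRIES: u32 = 3;

#[derive(Debug, PartialEq)]
enum FetchError {
    /// A transient error persisted until the retry budget ran out.
    RetriesExhausted(i64),
    /// A non-transient error; retrying would not help.
    Fatal(i64),
}

/// Only the two "context slot not reached" codes count as transient.
fn is_retriable(code: i64) -> bool {
    code == MIN_CONTEXT_SLOT_NOT_REACHED || code == PROVIDER_CONTEXT_SLOT_NOT_REACHED
}

/// Calls `attempt` until it succeeds, fails fatally, or has been retried
/// `MAX_RETRIES` times (so at most `MAX_RETRIES + 1` calls in total).
fn fetch_with_retry<F>(mut attempt: F) -> Result<u64, FetchError>
where
    F: FnMut() -> Result<u64, i64>,
{
    let mut retries = 0;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(code) if is_retriable(code) => {
                if retries >= MAX_RETRIES {
                    return Err(FetchError::RetriesExhausted(code));
                }
                retries += 1;
                // A real implementation would back off (sleep) here.
            }
            Err(code) => return Err(FetchError::Fatal(code)),
        }
    }
}
```

Keeping the transient check in a single predicate makes it cheap to narrow later, for example if only some user-facing errors turn out to be worth retrying.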
♻️ Duplicate comments (2)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
730-761: LRU + pubsub subscription/unsubscription sequencing is mostly solid; consider restoring evicted key on unsubscribe failure. The new `register_subscription` flow (LRU add → unsubscribe evicted → send removal → subscribe new key) together with `subscribe`/`unsubscribe` now keeps the LRU and pubsub client in sync in the happy path and rolls back the new key when unsubscribe/subscribe fails, which is a big improvement over the previous race. One remaining edge case: when `unsubscribe(evicted)` fails, you remove the newly-added `pubkey` from the LRU but leave the previously-evicted key out of the LRU even though its pubsub subscription likely remains. That can leave the LRU’s view and the pubsub client (and metrics) out of sync. Consider, after removing `pubkey`, re-adding `evicted` to the LRU to fully restore the prior state before returning the error.
Also applies to: 773-800, 801-829
magicblock-chainlink/src/chainlink/mod.rs (1)
148-199: Account-removal categorization and logging look correct; log label could better reflect feature-owned empties. The new `reset_accounts_bank` logic correctly: keeps blacklisted and delegated accounts, removes DLP-owned accounts, and counts remaining non-delegated/non-blacklisted accounts into “empty” vs “non-empty” buckets using `saturating_sub` to avoid underflow. Note that `non_empty` includes accounts with `lamports == 0` when owned by `feature::id()` (since those are excluded from `remaining_empty`). If you want the log message to match the actual set precisely, consider tweaking the label for the `non_empty` bucket to mention that it also includes feature-owned zero‑lamport accounts.
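The labeling point is easier to see with numbers. The sketch below is one possible reading of the bucketing described above, not the actual `reset_accounts_bank` implementation: the `Account` struct with boolean flags, `RemovalCounts`, and `categorize` are hypothetical simplifications.

```rust
/// Simplified account view; the boolean fields are hypothetical stand-ins
/// for the ownership and delegation checks described in the review.
struct Account {
    lamports: u64,
    delegated: bool,
    blacklisted: bool,
    dlp_owned: bool,
    feature_owned: bool,
}

#[derive(Debug, PartialEq)]
struct RemovalCounts {
    kept: usize,
    dlp_owned: usize,
    empty: usize,
    non_empty: usize,
}

/// An account is removed unless it is delegated or blacklisted.
fn is_removed(account: &Account) -> bool {
    !account.delegated && !account.blacklisted
}

fn categorize(accounts: &[Account]) -> RemovalCounts {
    let kept = accounts.iter().filter(|a| !is_removed(a)).count();
    let removed = accounts.len().saturating_sub(kept);
    let dlp_owned = accounts
        .iter()
        .filter(|a| is_removed(a) && a.dlp_owned)
        .count();
    // Feature-owned zero-lamport accounts are excluded from "empty"...
    let empty = accounts
        .iter()
        .filter(|a| is_removed(a) && !a.dlp_owned && a.lamports == 0 && !a.feature_owned)
        .count();
    // ...so they end up in the remainder bucket computed here.
    let non_empty = removed.saturating_sub(dlp_owned).saturating_sub(empty);
    RemovalCounts { kept, dlp_owned, empty, non_empty }
}
```

Because `non_empty` is computed as a remainder, anything not matched by the earlier filters lands there, which is why a feature-owned account with zero lamports is counted as "non-empty" in this model.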
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
magicblock-chainlink/src/chainlink/mod.rs (11 hunks)
magicblock-chainlink/src/remote_account_provider/mod.rs (28 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/chainlink/mod.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.
Applied to files:
magicblock-chainlink/src/remote_account_provider/mod.rs
🧬 Code graph analysis (2)
magicblock-chainlink/src/chainlink/mod.rs (3)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
new(135-159)
magicblock-chainlink/src/chainlink/blacklisted_accounts.rs (1)
blacklisted_accounts(6-30)
magicblock-metrics/src/metrics/mod.rs (1)
inc_undelegation_requested(475-477)
magicblock-chainlink/src/remote_account_provider/mod.rs (5)
magicblock-metrics/src/metrics/mod.rs (6)
inc_account_fetches_failed(463-465)
inc_account_fetches_found(467-469)
inc_account_fetches_not_found(471-473)
inc_account_fetches_success(459-461)
set_monitored_accounts_count(421-423)
inc_remote_account_provider_a_count(483-485)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4)
new(26-35)
pubkeys(39-42)
pubkeys(131-137)
len(119-125)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (3)
new(46-54)
new(311-322)
try_new_from_url(165-180)
magicblock-chainlink/src/submux/mod.rs (1)
new(147-158)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
default(65-72)
try_new_with_metrics(27-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_lint
- GitHub Check: run_make_ci_test
- GitHub Check: Build Project
🔇 Additional comments (6)
magicblock-chainlink/src/remote_account_provider/mod.rs (2)
410-503: FetchResult channel model and fetch-waiter deduplication look correct. Using `FetchResult = Result<RemoteAccount, RemoteAccountProviderError>` across both subscription overrides and RPC fetch results is a nice cleanup and ensures errors propagate to callers. The switch to `HashMap::entry` with `Vec<oneshot::Sender<FetchResult>>` in `try_get_multi` correctly appends additional waiters instead of clobbering in-flight entries, and `listen_for_account_updates` now resolves all pending senders with `Ok(remote_account)` when a newer subscription update arrives. Error handling on `RecvError` is also sensible (warn + aggregate into `AccountResolutionsFailed`). I don’t see correctness issues in this flow.
Also applies to: 625-647, 660-681
1159-1164: Test updates to use `try_new_with_metrics(..., false)` look appropriate. Switching the tests to construct `RemoteAccountProviderConfig` via `try_new_with_metrics(capacity, LifecycleMode::Ephemeral, false)` keeps the LRU capacity validation while explicitly disabling subscription metrics for unit tests. This aligns tests with the new config API without changing their behavior.
Also applies to: 1210-1215, 1287-1293, 1494-1498
magicblock-chainlink/src/chainlink/mod.rs (4)
57-59: Auto-airdrop configuration plumbing is consistent. Adding `auto_airdrop_lamports` to `Chainlink` and threading it through `try_new` and `try_new_from_endpoints` cleanly exposes the new behavior without impacting existing call sites beyond the extra argument. The field is stored on the struct and only read where needed, so there’s no unused state, and the `#[allow(clippy::too_many_arguments)]` on `try_new_from_endpoints` is reasonable given the constructor’s role.
Also applies to: 64-70, 91-101, 134-140
259-301: Fee-payer mark_empty behavior and auto-airdrop flow look reasonable; ensure airdrop helper is idempotent. Treating the fee payer (and its balance PDA when `clone_escrow` is true) as `mark_empty_if_not_found` is a good fit for gasless flows and ensures remote fetches don’t fail just because those accounts don’t yet exist. The best-effort auto-airdrop that only triggers when `auto_airdrop_lamports > 0`, a `FetchCloner` is available, and the local fee-payer balance is still zero is a sensible safety net. Just make sure `FetchCloner::airdrop_account_if_empty` itself guards against races (e.g., re-checks remote balance) so repeated calls under load can’t accidentally over-airdrop.
363-377: Enhanced fetch logging with `mark_empty_if_not_found` context is helpful. Extending `fetch_accounts_common` trace logging to include the set of `mark_empty_if_not_found` pubkeys makes it much easier to debug why some accounts are treated as “empty but acceptable.” The guard uses `log::Level::Trace` and the macro is `trace!`, so the enabled-level check and log macro are aligned.
400-417: Undelegation instrumentation is aligned with metrics and log levels. Upgrading `undelegation_requested` to `debug!` logs and incrementing `magicblock_metrics::metrics::inc_undelegation_requested()` on each call matches the semantics of that metric and keeps trace noise down. The subsequent subscription via `fetch_cloner.subscribe_to_account` is unchanged, so behavior is preserved while observability improves.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test-integration/test-cloning/tests/04_escrow_transfer.rs (1)
91-106: Critical: Contradictory assertions on `escrow_balance`. Lines 102 and 104 assert contradictory conditions on the same variable:
- Line 102: `escrow_balance` must be in range 0.4–0.5 SOL
- Line 104: `escrow_balance` must be >= 1.0 SOL
These cannot both be true. Based on the comment at line 104 ("Airdropped 2 SOL - escrowed half"), it appears line 104 should check the payer balance (which received 2 SOL and escrowed ~1 SOL), not the escrow balance. However, the payer balance is currently discarded with `_` on line 91.
Apply this diff to capture and check the payer balance instead:
    -    let (counter_balance, _, escrow_balance) = log_accounts_balances(
    +    let (counter_balance, payer_balance, escrow_balance) = log_accounts_balances(
             &ctx,
             "After transfer from escrow to counter",
             &counter_pda,
             &kp_escrowed.pubkey(),
             &ephemeral_balance_pda,
         );
         let escrow_balance = escrow_balance as f64 / LAMPORTS_PER_SOL as f64;
         let counter_balance = counter_balance as f64 / LAMPORTS_PER_SOL as f64;
    +    let payer_balance = payer_balance as f64 / LAMPORTS_PER_SOL as f64;
         // Received 1 SOL then transferred 0.5 SOL + tx fee
         assert!((0.4..=0.5).contains(&escrow_balance));
         // Airdropped 2 SOL - escrowed half
    -    assert!(escrow_balance >= 1.0);
    +    assert!(payer_balance >= 1.0);
         // Received 0.5 SOL
         assert!((0.5..0.6).contains(&counter_balance));
Minor: Inconsistent range operators. Line 102 uses an inclusive end (`..=`) while line 106 uses an exclusive end (`..`). Consider using consistent operators unless the difference is intentional.
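The practical effect of the two range operators shows up only at the boundary values. The sketch below mirrors the two assertions from the diff; the helper names `to_sol`, `escrow_in_range`, and `counter_in_range` are made up for the example, and `LAMPORTS_PER_SOL` is redefined locally so the snippet needs no Solana crates.

```rust
/// One SOL expressed in lamports, redefined locally for the example.
const LAMPORTS_PER_SOL: u64 = 1_000_000_000;

/// Converts lamports to SOL as a float, as the test in the diff does.
fn to_sol(lamports: u64) -> f64 {
    lamports as f64 / LAMPORTS_PER_SOL as f64
}

/// Inclusive upper bound: exactly 0.5 SOL is accepted.
fn escrow_in_range(sol: f64) -> bool {
    (0.4..=0.5).contains(&sol)
}

/// Exclusive upper bound: exactly 0.6 SOL is rejected.
fn counter_in_range(sol: f64) -> bool {
    (0.5..0.6).contains(&sol)
}
```

Both ranges include their lower bound, so a balance of exactly 0.5 SOL satisfies either check; only the treatment of the upper bound differs.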
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
test-integration/test-cloning/tests/01_program-deploy.rs (1 hunks)
test-integration/test-cloning/tests/04_escrow_transfer.rs (3 hunks)
test-integration/test-cloning/tests/05_parallel-cloning.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.
Applied to files:
test-integration/test-cloning/tests/04_escrow_transfer.rs
test-integration/test-cloning/tests/05_parallel-cloning.rs
📚 Learning: 2025-11-07T13:09:52.253Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: test-kit/src/lib.rs:275-0
Timestamp: 2025-11-07T13:09:52.253Z
Learning: In test-kit, the transaction scheduler in ExecutionTestEnv is not expected to shut down during tests. Therefore, using `.unwrap()` in test helper methods like `schedule_transaction` is acceptable and will not cause issues in the test environment.
Applied to files:
test-integration/test-cloning/tests/05_parallel-cloning.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run_make_ci_test
🔇 Additional comments (1)
test-integration/test-cloning/tests/05_parallel-cloning.rs (1)
239-239: No actionable changes needed: escrow test disabling is intentional and tracked. Verification confirms this change is part of a coordinated commit (3b9496a, dated Nov 18) titled "chore: ignore tests that depend on transfers from escrow accounts." This is not a PR side effect but the explicit purpose of this commit, which also modified `04_escrow_transfer.rs`. The escrow functionality evaluation is tracked via issue #579 (as noted in the learnings).
The ignore decision is intentional and properly scoped to escrow-dependent tests across multiple test files. While linking the commit or ignore messages to issue #579 would improve traceability, the change itself is appropriately coordinated and documented via commit message.
2e691d9 to 4d433fb
* master:
  chore: dependency update in Cargo.lock file
  fix: set caps for bincode and serde dependencies
  fix: capping solana program version due to transit dependency issue
Actionable comments posted: 7
♻️ Duplicate comments (4)
magicblock-processor/src/executor/processing.rs (1)
320-338: Verify fee-payer persistence and address zero-lamport deletion bug.The persistence filter at line 336 has two significant issues:
Zero-lamport deletions (confirmed bug): The condition
account.lamports() == 0 && !privilegedcauses bothinsert_accountandaccounts_tx.sendto be skipped. However, the testtest_zero_lamports_account()inmagicblock-accounts-db/src/tests.rs(lines 388–405) explicitly shows that zero-lamport accounts must be persisted as empty/escrow account markers. Non-privileged transactions that close accounts by setting lamports to 0 will lose this marker state, potentially causing the validator to re-fetch from chain on every access. This must be fixed: either remove the zero-lamport check for all transactions, or document why non-privileged closures should be treated differently.Notification vs. storage separation: The same condition gates both persistence (
insert_account) and observer notification (accounts_tx.send). External consumers relying onaccounts_txto detect account closures won't receive updates for non-privileged zero-lamport accounts. Clarify whether this asymmetry is intentional.Fee-payer dirty flag (requires verification): In the
FeesOnlybranch (lines 310–316), confirm thatRollbackAccounts::FeePayerOnlyfrom the Solana SVM crate guaranteesfee_payer_account.is_dirty()istruewhen fees are charged. If it's not guaranteed, line 336 will incorrectly skip persisting the fee deduction. Check the Solana SVM documentation or implementation for this guarantee.magicblock-chainlink/src/remote_account_provider/mod.rs (2)
840-873: Fetch retry/error path and metrics integration are well-structured (with one retriable-error scope caveat)

Positives:

- `notify_error` centralizes:
  - logging of fatal RPC issues,
  - incrementing `inc_account_fetches_failed(pubkeys.len() as u64)`,
  - and sending a consistent `Err(RemoteAccountProviderError::AccountResolutionsFailed(..))` to all pending waiters for each pubkey.
- The `retry!` macro cleanly handles transient conditions with a bounded `MAX_RETRIES` and backoff, never holding the `fetching_accounts` lock across `.await` boundaries.
- Treating both `JSON_RPC_SERVER_ERROR_MIN_CONTEXT_SLOT_NOT_REACHED` and `HELIUS_CONTEXT_SLOT_NOT_REACHED` as retriable is a good improvement for heterogeneous providers.
- Success metrics (`inc_account_fetches_success`, `inc_account_fetches_found`, `inc_account_fetches_not_found`) align with the actual RPC outcome and are only emitted once per successful response, while failures are counted via `inc_account_fetches_failed` in `notify_error`, so you don’t double-count.

One caveat (unchanged from earlier reviews): the `RpcError::ForUser(rpc_user_err)` arm unconditionally calls `retry!` for all ForUser errors. The surrounding comment specifically talks about the “AccountNotFound / Minimum context slot has not been reached” case; retrying on every ForUser (including unrelated user-facing errors) could lead to unnecessary retries. If you want to tighten this without adding complex parsing, you could at least log the actual `rpc_user_err` at warn/error level before deciding whether it looks like a min-context-slot transient.

Functionally this block is correct; the above is mainly about avoiding pointless retries on clearly non-transient ForUser errors.
Also applies to: 875-893, 898-899, 919-977, 984-1024, 1036-1061
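To make the caveat concrete, here is a minimal sketch of a bounded retry loop that classifies errors before retrying. It is not the provider's `retry!` macro: `FetchError`, `PLACEHOLDER_RETRIABLE_CODE`, `is_retriable`, and `fetch_with_retries` are hypothetical names, and the substring match on the user-facing message is an assumption about how a min-context-slot transient might be recognized.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for the RPC error variants discussed in the review.
#[derive(Debug, Clone, PartialEq)]
enum FetchError {
    /// Numeric server error code.
    Server(i64),
    /// Free-form, user-facing error message.
    ForUser(String),
}

/// Placeholder value for illustration only; not a real RPC error code.
const PLACEHOLDER_RETRIABLE_CODE: i64 = -1;

/// Decides whether an error looks transient enough to retry.
fn is_retriable(err: &FetchError) -> bool {
    match err {
        FetchError::Server(code) => *code == PLACEHOLDER_RETRIABLE_CODE,
        // Assumption: only retry user-facing errors that mention the
        // min-context-slot condition, instead of every ForUser error.
        FetchError::ForUser(msg) => msg.contains("Minimum context slot"),
    }
}

/// Runs `attempt` until it succeeds, fails with a non-retriable error,
/// or the retry budget is exhausted.
fn fetch_with_retries<T>(
    max_retries: u32,
    backoff: Duration,
    mut attempt: impl FnMut() -> Result<T, FetchError>,
) -> Result<T, FetchError> {
    let mut remaining = max_retries;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(err) if is_retriable(&err) && remaining > 0 => {
                remaining -= 1;
                sleep(backoff);
            }
            // Non-retriable error, or budget exhausted: surface it once.
            Err(err) => return Err(err),
        }
    }
}
```

With this shape, an unrelated `ForUser` error is returned after a single attempt, while the transient case still gets the bounded retries.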
49-56: Active-subscription metrics wiring is solid; minor logging guard nit

The background updater correctly:

- samples `lrucache_subscribed_accounts.len()` vs `pubsub_client.subscription_count(Some(&never_evicted))`,
- computes diffs between the LRU and pubsub view, gated behind a debug-level check,
- and updates `set_monitored_accounts_count(pubsub_total)` at a reasonable 60s interval.

This is a good fit for reconciling client vs. cache state without adding per-request overhead.

One small nit: inside `start_active_subscriptions_updater` you guard expensive work with `log::log_enabled!(log::Level::Debug)` but still emit `trace!` logs (`All pubsub subscriptions: ...`) under that guard. When Debug is enabled but Trace is not, you still pay the cost of collecting `all_pubsub_subs` and formatting the message that will never be logged at trace level. Consider either:

- switching the guard to `Level::Trace` where you emit `trace!`, or
- changing those particular `trace!` calls to `debug!` so the guard and macro level match.

This is purely a perf/readability nit; behavior is otherwise correct.
Also applies to: 61-67, 203-273
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (1)
267-275: `test_never_evicted_accounts` validates clock sysvar is protected

The test correctly asserts that the clock sysvar is present in `never_evicted_accounts()`, ensuring we won’t accidentally evict it via the LRU cache. This gives good coverage for the never‑evict set.

You may still want a small unit test for `len()` and `pubkeys()` (initial length, growth as keys are added, and verifying returned pubkeys) as previously suggested; not strictly required but would round out the API coverage.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4 hunks)
magicblock-chainlink/src/remote_account_provider/mod.rs (27 hunks)
magicblock-chainlink/src/testing/mod.rs (1 hunks)
magicblock-processor/Cargo.toml (1 hunks)
magicblock-processor/src/executor/processing.rs (6 hunks)
test-integration/test-chainlink/tests/ix_06_redeleg_us_separate_slots.rs (4 hunks)
test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs (3 hunks)
test-integration/test-cloning/tests/01_program-deploy.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
magicblock-processor/src/executor/processing.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-07T14:20:31.457Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.457Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs
test-integration/test-chainlink/tests/ix_06_redeleg_us_separate_slots.rs
magicblock-processor/src/executor/processing.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-18T08:47:39.681Z
Learnt from: Dodecahedr0x
Repo: magicblock-labs/magicblock-validator PR: 639
File: magicblock-chainlink/tests/04_redeleg_other_separate_slots.rs:158-165
Timestamp: 2025-11-18T08:47:39.681Z
Learning: In magicblock-chainlink tests involving compressed accounts, `set_remote_slot()` sets the slot of the `AccountSharedData`, while `compressed_account_shared_with_owner_and_slot()` sets the slot of the delegation record. These are two different fields and both calls are necessary.
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs
magicblock-chainlink/src/testing/mod.rs
test-integration/test-chainlink/tests/ix_06_redeleg_us_separate_slots.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs
test-integration/test-chainlink/tests/ix_06_redeleg_us_separate_slots.rs
magicblock-processor/src/executor/processing.rs
magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-13T09:38:43.804Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/locks.rs:64-102
Timestamp: 2025-11-13T09:38:43.804Z
Learning: In magicblock-processor's TransactionScheduler (scheduler/mod.rs line 59), the executor count is clamped to MAX_SVM_EXECUTORS (63) at initialization time, and executor IDs are assigned sequentially from 0 to count-1. This architectural guarantee ensures that executor IDs used in the bitmask-based AccountLock (scheduler/locks.rs) will always be within valid bounds for bit shifting operations, making runtime bounds checks unnecessary.
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-28T13:15:42.706Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 596
File: magicblock-processor/src/scheduler.rs:1-1
Timestamp: 2025-10-28T13:15:42.706Z
Learning: In magicblock-processor, transaction indexes were always set to 0 even before the changes in PR #596. The proper transaction indexing within slots will be addressed during the planned ledger rewrite.
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-21T10:34:59.140Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-accounts-db/src/lib.rs:63-72
Timestamp: 2025-10-21T10:34:59.140Z
Learning: In magicblock-validator, the AccountsDb "stop-the-world" synchronizer is managed at the processor/executor level, not at the AccountsDb API level. Transaction executors in magicblock-processor hold a read lock (sync.read()) for the duration of each slot and release it only at slot boundaries, ensuring all account writes happen under the read lock. Snapshot operations acquire a write lock, blocking until all executors release their read locks. This pattern ensures mutual exclusion between writes and snapshots without requiring read guards in AccountsDb write APIs.
Applied to files:
magicblock-processor/src/executor/processing.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.
Applied to files:
magicblock-chainlink/src/remote_account_provider/mod.rs
🧬 Code graph analysis (2)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (1)
magicblock-metrics/src/metrics/mod.rs (1)
inc_evicted_accounts_count (424-426)
test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs (1)
magicblock-validator/src/main.rs (1)
init_logger (13-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build Project
- GitHub Check: run_make_ci_test
🔇 Additional comments (10)
magicblock-processor/Cargo.toml (1)
44-44: LGTM!

The `solana-keypair` dev-dependency is properly formatted with consistent spacing (`{ workspace = true }`) matching the rest of the file. The addition supports testing in the processor crate, and workspace version management is appropriately applied.

test-integration/test-cloning/tests/01_program-deploy.rs (1)
186-186: LGTM! Bump calculation fix properly applied.

The fix from the previous review has been correctly implemented. The calculation now produces incrementing bump values (1, 2, 3, ..., 51) as retries progress, ensuring unique messages ("Hola Mundo 1", "Hola Mundo 2", etc.) for each retry attempt.

test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs (1)
8-9: LGTM: Import additions support new test functionality.

The imports correctly add `assert_not_subscribed` for subscription assertions and `sleep_ms` for timing delays used below.

magicblock-processor/src/executor/processing.rs (1)
353-367: LGTM! Clean helper for extracting rollback lamports.

The `rollback_feepayer_lamports` helper correctly handles all three `RollbackAccounts` variants and cleanly extracts the feepayer's rollback lamports for comparison in the gasless mutation guard.

magicblock-chainlink/src/remote_account_provider/lru_cache.rs (1)
5-5: Eviction metric placement looks correct

`inc_evicted_accounts_count()` is only called when `LruCache::push` actually evicts an entry (after we’ve ruled out the never‑evict set and promotion case), so the counter should reflect real evictions and not mere accesses or never‑evict keys. This matches the intended semantics for an eviction metric.

Also applies to: 82-89
magicblock-chainlink/src/remote_account_provider/mod.rs (5)
63-67: Fetch waiters and Result-based channels now behave correctly under concurrency

The switch to:

- `type FetchResult = Result<RemoteAccount, RemoteAccountProviderError>;`
- `FetchingAccounts = Mutex<HashMap<Pubkey, (u64, Vec<oneshot::Sender<FetchResult>>)>>;`
- and the Entry-based insertion in `try_get_multi`,

fixes the earlier issue where a second caller would overwrite the first caller’s waiter for the same account. Using `match fetching.entry(pubkey)` to append new senders ensures all in-flight callers for a given `Pubkey` share the same fetch result.

Coupled with:

- `listen_for_account_updates` now sending `Ok(remote_account.clone())` into all pending senders when a fresh subscription update wins the race, and
- `fetch()` sending either `Ok(remote_account.clone())` or `Err(...)` to all pending senders,

every waiter now receives exactly one `FetchResult` (success, not-found, or error). The `try_get_multi` loop cleanly distinguishes:

- `Ok(Ok(remote_account))` → happy path,
- `Ok(Err(err))` → logical fetch failure (e.g., RPC problems),
- `Err(recv_err)` → channel-level failure, wrapped as `RemoteAccountProviderError::RecvrError`.

This looks correct and should eliminate dropped waiters under concurrent load.
Also applies to: 410-503, 633-647, 660-682
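The Entry-based registration described above can be sketched as follows. This is a simplified model, not the provider's code: `u64` stands in for `Pubkey`, `String` for both `RemoteAccount` and the error type, std `mpsc` channels for `oneshot`, and `register_waiter` / `resolve` are hypothetical helper names.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical stand-ins for the real types (see lead-in).
type FetchResult = Result<String, String>;
type FetchingAccounts = HashMap<u64, (u64, Vec<Sender<FetchResult>>)>;

/// Registers a waiter for `pubkey`. Returns the receiver plus a flag telling
/// the caller whether it is the first waiter and should start the fetch.
fn register_waiter(
    fetching: &mut FetchingAccounts,
    pubkey: u64,
    slot: u64,
) -> (Receiver<FetchResult>, bool) {
    let (tx, rx) = channel();
    let is_first = match fetching.entry(pubkey) {
        // A fetch is already in flight: append instead of overwriting.
        Entry::Occupied(mut entry) => {
            entry.get_mut().1.push(tx);
            false
        }
        Entry::Vacant(entry) => {
            entry.insert((slot, vec![tx]));
            true
        }
    };
    (rx, is_first)
}

/// Delivers one result to every pending waiter and clears the entry.
/// Returns how many waiters were still listening.
fn resolve(fetching: &mut FetchingAccounts, pubkey: u64, result: FetchResult) -> usize {
    let senders = match fetching.remove(&pubkey) {
        Some((_, senders)) => senders,
        None => return 0,
    };
    senders
        .into_iter()
        .filter(|sender| sender.send(result.clone()).is_ok())
        .count()
}
```

A plain `insert` in place of the `entry` match would replace the existing sender list, which is the dropped-waiter behavior the review says was fixed.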
286-303: LRU cache construction and metrics gating in `new` / `try_new_from_urls` look correct

- `AccountsLruCache` is now constructed from `config.subscribed_accounts_lru_capacity()` using `NonZeroUsize::new(cap).expect("non-zero capacity")`, relying on the config constructor to reject zero (consistent with `RemoteAccountProviderConfig::try_new_with_metrics`).
- The background active-subscriptions updater is only started when `config.enable_subscription_metrics()` is true, and the resulting `JoinHandle` is stored in `_active_subscriptions_task_handle` so it lives for the provider’s lifetime. Tests pass `false` to avoid spawning this task.
- `try_new_from_urls` wires up `(Arc<ChainPubsubClientImpl>, mpsc::Receiver<()>)` pairs and passes them into `SubMuxClient::new`, which is then used to construct the provider with the same config, so the metrics flag affects both URL-based and client-based constructors uniformly.
- `promote_accounts` now delegates to `lrucache_subscribed_accounts.promote_multi`, which is consistent with the new LRU wrapper API.

All of this is coherent with the new LRU + metrics design.
Also applies to: 311-316, 335-380, 382-384
861-873: Fetch metrics and high-level eviction tests provide good end-to-end coverage

The fetch path now records:

- failures via `inc_account_fetches_failed` when `notify_error` is called,
- successes via `inc_account_fetches_success(pubkeys.len() as u64)`,
- and splits found/not-found counts with `inc_account_fetches_found` and `inc_account_fetches_not_found`.

This gives a clear metric picture of how RPC fetches behave without changing the functional contract.

The new high-level tests:

- `test_add_accounts_up_to_limit_no_eviction`,
- `test_eviction_order`,
- and `test_multiple_evictions_in_sequence`

exercise end-to-end behavior of the LRU capacity, eviction order, and removal notifications via `removed_account_rx`, not just the raw cache wrapper. They mirror the lower-level tests in `lru_cache.rs` and validate that the RemoteAccountProvider’s subscription machinery and eviction signals behave as expected.

Using `RemoteAccountProviderConfig::try_new_with_metrics(..., false)` in these tests neatly disables the metrics updater while still using the same config path as production.

Also applies to: 984-1016, 1520-1617
1159-1174: Tests switched to `try_new_with_metrics` keep behavior while making metrics explicit

All the unit tests that construct a `RemoteAccountProvider` now use:

    RemoteAccountProviderConfig::try_new_with_metrics(
        1000, // or accounts_capacity
        LifecycleMode::Ephemeral,
        false, // disable subscription metrics
    )?

and pass `&config` into `RemoteAccountProvider::new`. This:
- keeps the previous functional behavior (no background metrics updater during tests),
- exercises the same constructor path that production uses for metrics-enabled setups,
- and makes the LRU capacity explicit in the tests that depend on eviction semantics.
This is a clean way to adapt the tests to the new configuration surface.
Also applies to: 1209-1226, 1289-1303, 1491-1501
731-756: Subscription/LRU atomicity improved, but eviction-unsubscribe failures can desync LRU vs pubsub

The changes here make an important improvement:

- `register_subscription` now calls `self.pubsub_client.subscribe(*pubkey).await?` before mutating the LRU, so we no longer mark an account as “watched” in the LRU if the upstream subscription fails.

However, in the eviction branch:

    if let Some(evicted) = self.lrucache_subscribed_accounts.add(*pubkey) {
        trace!("Evicting {pubkey}");

        // 1. Unsubscribe from the account directly (LRU has already removed it)
        if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
            warn!("Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}");
        }

        // 2. Inform upstream so it can remove it from the store
        self.send_removal_update(evicted).await?;
    }

if `unsubscribe(evicted)` fails, we still:

- keep `pubkey` in the LRU,
- and send a removal update for `evicted`,

even though the pubsub layer may still be subscribed to `evicted`. That can permanently desync:

- the LRU (and upstream store + metrics) which think `evicted` is gone, from
- the pubsub client, which may continue streaming updates and contributing to `subscription_count`.

Given you now have an active-subscriptions reconciler that expects LRU and pubsub to converge, this inconsistency will show up as persistent mismatches with no self-healing.

Consider treating a failed eviction-unsubscribe as an atomicity violation and rolling back:

- remove `pubkey` from the LRU again,
- optionally attempt `unsubscribe(pubkey)` to revert the new subscription,
- and return an error instead of sending a removal update for `evicted`.

For example:

     if let Some(evicted) = self.lrucache_subscribed_accounts.add(*pubkey) {
         trace!("Evicting {pubkey}");

    -    // 1. Unsubscribe from the account directly (LRU has already removed it)
    -    if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
    -        // Should we retry here?
    -        warn!(
    -            "Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}");
    -    }
    -
    -    // 2. Inform upstream so it can remove it from the store
    -    self.send_removal_update(evicted).await?;
    +    // 1. Unsubscribe from the account directly (LRU has already removed it)
    +    if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
    +        warn!(
    +            "Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}. \
    +             Rolling back eviction of {evicted} and subscription for {pubkey}."
    +        );
    +        // Roll back: drop the newly inserted key from LRU so our view matches pubsub_client
    +        self.lrucache_subscribed_accounts.remove(pubkey);
    +        // Best-effort revert for the newly subscribed account; ignore failures here
    +        let _ = self.pubsub_client.unsubscribe(*pubkey).await;
    +        return Err(err);
    +    }
    +
    +    // 2. Inform upstream so it can remove it from the store
    +    self.send_removal_update(evicted).await?;
     }

That keeps the LRU, upstream store, and pubsub client aligned even in the rare case of an unsubscribe failure during eviction.

The `subscribe` / `unsubscribe` public methods themselves look good: `subscribe` promotes existing entries in the LRU, and `unsubscribe`:

- refuses to touch never‑evict accounts, and
- only removes from the LRU + sends a removal update after a successful pubsub unsubscribe, leaving state unchanged on failure.
Also applies to: 771-794, 797-832
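The rollback idea can be exercised in isolation with a small synchronous model. Everything here is a hypothetical stand-in: `FakePubsub` and `FakeLru` (a FIFO queue rather than a real LRU) replace the actual client and cache, and `u64` replaces `Pubkey`. One assumption goes beyond the suggested diff: the sketch also puts the evicted key back into the cache, since otherwise the cache would no longer list a key that pubsub is still subscribed to.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical pubsub stand-in; keys in `failing` simulate unsubscribe errors.
struct FakePubsub {
    subscribed: HashSet<u64>,
    failing: HashSet<u64>,
}

impl FakePubsub {
    fn subscribe(&mut self, key: u64) -> Result<(), String> {
        self.subscribed.insert(key);
        Ok(())
    }

    fn unsubscribe(&mut self, key: u64) -> Result<(), String> {
        if self.failing.contains(&key) {
            return Err(format!("unsubscribe failed for {key}"));
        }
        self.subscribed.remove(&key);
        Ok(())
    }
}

/// Tiny FIFO stand-in for the LRU wrapper (oldest key at the front).
struct FakeLru {
    capacity: usize,
    keys: VecDeque<u64>,
}

impl FakeLru {
    /// Adds `key` and returns the evicted key if capacity was exceeded.
    fn add(&mut self, key: u64) -> Option<u64> {
        self.keys.push_back(key);
        if self.keys.len() > self.capacity {
            self.keys.pop_front()
        } else {
            None
        }
    }

    fn remove(&mut self, key: u64) {
        self.keys.retain(|k| *k != key);
    }
}

fn register_subscription(
    lru: &mut FakeLru,
    pubsub: &mut FakePubsub,
    removed: &mut Vec<u64>,
    key: u64,
) -> Result<(), String> {
    // Subscribe upstream first so a failure leaves the cache untouched.
    pubsub.subscribe(key)?;
    if let Some(evicted) = lru.add(key) {
        if let Err(err) = pubsub.unsubscribe(evicted) {
            // Roll back the cache: drop the new key, restore the evicted one
            // (the restore is this sketch's assumption, see lead-in).
            lru.remove(key);
            lru.keys.push_front(evicted);
            // Best-effort revert of the new subscription; ignore failures.
            let _ = pubsub.unsubscribe(key);
            return Err(err);
        }
        // Only announce the removal once pubsub has really let go.
        removed.push(evicted);
    }
    Ok(())
}
```

After a failed eviction the cache and the pubsub set contain the same keys again, which is the convergence the reconciler relies on.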
pub fn len(&self) -> usize {
    let subs = self
        .subscribed_accounts
        .lock()
        .expect("subscribed_accounts lock poisoned");
    subs.len()
}

pub fn never_evicted_accounts(&self) -> Vec<Pubkey> {
    self.accounts_to_never_evict.iter().cloned().collect()
}

pub fn can_evict(&self, pubkey: &Pubkey) -> bool {
    !self.accounts_to_never_evict.contains(pubkey)
}

pub fn pubkeys(&self) -> Vec<Pubkey> {
    let subs = self
        .subscribed_accounts
        .lock()
        .expect("subscribed_accounts lock poisoned");
    subs.iter().map(|(k, _)| *k).collect()
}
🧹 Nitpick | 🔵 Trivial
New LRU cache accessors are simple and thread‑safe
len(), never_evicted_accounts(), can_evict(), and pubkeys() are straightforward wrappers over the internal Mutex<LruCache<Pubkey, ()>> / HashSet<Pubkey>. The locking and cloning behavior is correct, and given that pubkeys() and never_evicted_accounts() are primarily used in metrics/debug paths, the extra allocations are acceptable.
If these methods ever end up in a hot path, consider returning iterators or borrowing views instead of materializing Vec<Pubkey> on each call.
🤖 Prompt for AI Agents
magicblock-chainlink/src/remote_account_provider/lru_cache.rs around lines
119-141: The new accessors (len, never_evicted_accounts, can_evict, pubkeys) are
thread-safe and correct as written, so no code changes are required; leave them
as-is, but if these methods later appear in a hot path replace Vec allocations
with iterator/borrowed views (or return impl Iterator / slices) to avoid
repeated cloning.
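If the accessors ever do land in a hot path, one allocation-free shape is a borrowing iterator for the lock-free set plus a closure-based visitor for the mutex-guarded keys. The sketch below is a simplified assumption, not the crate's API: `SubscribedAccounts`, `never_evicted`, and `with_pubkeys` are hypothetical names, `u64` stands in for `Pubkey`, and a `Vec` stands in for the `LruCache`.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the LRU wrapper.
struct SubscribedAccounts {
    subscribed: Mutex<Vec<u64>>,
    never_evict: HashSet<u64>,
}

impl SubscribedAccounts {
    /// Borrowing view over the never-evict set: no lock, no allocation.
    fn never_evicted(&self) -> impl Iterator<Item = &u64> + '_ {
        self.never_evict.iter()
    }

    /// Runs `f` over the subscribed keys while the lock is held, so the
    /// caller can count or search without materializing a `Vec`.
    fn with_pubkeys<R>(&self, f: impl FnOnce(std::slice::Iter<'_, u64>) -> R) -> R {
        let subs = self
            .subscribed
            .lock()
            .expect("subscribed lock poisoned");
        f(subs.iter())
    }
}
```

The trade-off is that the closure runs under the lock, so it should stay short and must not call back into the cache.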
#[macro_export]
macro_rules! assert_cloned_as_delegated_with_retries {
    ($cloner:expr, $pubkeys:expr, $retries:expr) => {{
        for pubkey in $pubkeys {
            let mut account_opt = None;
            for _ in 0..$retries {
                account_opt = $cloner.get_account(pubkey);
                if let Some(account) = &account_opt {
                    if account.delegated() {
                        break;
                    }
                }
                ::std::thread::sleep(::std::time::Duration::from_millis(100));
            }
            let account = account_opt
                .expect(&format!("Expected account {} to be cloned", pubkey));
            assert!(
                account.delegated(),
                "Expected account {} to be delegated",
                pubkey
            );
        }
    }};
    ($cloner:expr, $pubkeys:expr, $slot:expr, $retries:expr) => {{
        for pubkey in $pubkeys {
            let mut account_opt = None;
            for _ in 0..$retries {
                account_opt = $cloner.get_account(pubkey);
                if let Some(account) = &account_opt {
                    if account.delegated() && account.remote_slot() == $slot {
                        break;
                    }
                }
                ::std::thread::sleep(::std::time::Duration::from_millis(100));
            }
            let account = account_opt
                .expect(&format!("Expected account {} to be cloned", pubkey));
            assert!(
                account.delegated(),
                "Expected account {} to be delegated",
                pubkey
            );
            assert_eq!(
                account.remote_slot(),
                $slot,
                "Expected account {} to have remote slot {}",
                pubkey,
                $slot
            );
        }
    }};
    ($cloner:expr, $pubkeys:expr, $slot:expr, $owner:expr, $retries:expr) => {{
        use solana_account::ReadableAccount;
        for pubkey in $pubkeys {
            let mut account_opt = None;
            for _ in 0..$retries {
                account_opt = $cloner.get_account(pubkey);
                if let Some(account) = &account_opt {
                    if account.delegated()
                        && account.remote_slot() == $slot
                        && account.owner() == &$owner
                    {
                        break;
                    }
                }
                ::std::thread::sleep(::std::time::Duration::from_millis(100));
            }
            let account = account_opt
                .expect(&format!("Expected account {} to be cloned", pubkey));
            assert!(
                account.delegated(),
                "Expected account {} to be delegated",
                pubkey
            );
            assert_eq!(
                account.remote_slot(),
                $slot,
                "Expected account {} to have remote slot {}",
                pubkey,
                $slot
            );
            assert_eq!(
                account.owner(),
                &$owner,
                "Expected account {} to have owner {}",
                pubkey,
                $owner
            );
        }
    }};
}
Capture $slot/$owner once to avoid degenerate self-comparisons and repeated evaluation
The retry helpers are a good fit for eventual-consistency tests, but the way $slot and $owner are used can subtly weaken the assertions:
- At call sites you often pass expressions like `account.remote_slot()` and `program_flexi_counter::id()`.
- Inside the macro, you also bind a local `account` (e.g., `if let Some(account) = &account_opt` and later `let account = account_opt.expect(...)`) and then use `$slot` and `$owner` in conditions.
- After expansion, `account` from the macro body shadows the caller’s `account`, so expressions like `account.remote_slot() == $slot` can collapse into `account.remote_slot() == account.remote_slot()`, and the final `assert_eq!(account.remote_slot(), $slot, ...)` likewise becomes a tautology when `$slot` references `account` from the invocation context.
- This means that when callers pass something like `account.remote_slot()`, the “slot must equal X” part of the assertion no longer really constrains anything; you’re effectively only checking `delegated()`.
To ensure the macros actually assert “matches the slot/owner as seen at the call site” and to avoid repeated evaluation of slot/owner expressions, capture them once at the top of each overload and use local expected_* variables inside the loops and assertions:
#[macro_export]
macro_rules! assert_cloned_as_delegated_with_retries {
@@
($cloner:expr, $pubkeys:expr, $slot:expr, $retries:expr) => {{
- for pubkey in $pubkeys {
+ let expected_slot = $slot;
+ for pubkey in $pubkeys {
let mut account_opt = None;
for _ in 0..$retries {
account_opt = $cloner.get_account(pubkey);
if let Some(account) = &account_opt {
- if account.delegated() && account.remote_slot() == $slot {
+ if account.delegated() && account.remote_slot() == expected_slot {
break;
}
}
::std::thread::sleep(::std::time::Duration::from_millis(100));
}
let account = account_opt
.expect(&format!("Expected account {} to be cloned", pubkey));
assert!(
account.delegated(),
"Expected account {} to be delegated",
pubkey
);
assert_eq!(
account.remote_slot(),
- $slot,
+ expected_slot,
"Expected account {} to have remote slot {}",
pubkey,
- $slot
+ expected_slot
);
}
}};
($cloner:expr, $pubkeys:expr, $slot:expr, $owner:expr, $retries:expr) => {{
use solana_account::ReadableAccount;
- for pubkey in $pubkeys {
+ let expected_slot = $slot;
+ let expected_owner = $owner;
+ for pubkey in $pubkeys {
let mut account_opt = None;
for _ in 0..$retries {
account_opt = $cloner.get_account(pubkey);
if let Some(account) = &account_opt {
if account.delegated()
- && account.remote_slot() == $slot
- && account.owner() == &$owner
+ && account.remote_slot() == expected_slot
+ && account.owner() == &expected_owner
{
break;
}
}
::std::thread::sleep(::std::time::Duration::from_millis(100));
}
let account = account_opt
.expect(&format!("Expected account {} to be cloned", pubkey));
assert!(
account.delegated(),
"Expected account {} to be delegated",
pubkey
);
assert_eq!(
account.remote_slot(),
- $slot,
+ expected_slot,
"Expected account {} to have remote slot {}",
pubkey,
- $slot
+ expected_slot
);
assert_eq!(
account.owner(),
- &$owner,
+ &expected_owner,
"Expected account {} to have owner {}",
pubkey,
- $owner
+ expected_owner
);
}
}};
}

You may also want to apply the same “capture once” pattern to the non-retrying `assert_cloned_as_delegated!` / `assert_cloned_as_undelegated!` macros for consistency, but the above change is the minimum to make the new retry helpers’ slot/owner checks meaningful and robust.
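The repeated-evaluation half of this concern is easy to demonstrate with two reduced macros. `wait_for_naive!` and `wait_for_captured!` are hypothetical, cut-down versions of the retry helpers (no cloner, no sleep), written only to show how often `$expected` is evaluated. Note that `macro_rules!` hygiene keeps a `let` inside the macro body distinct from a same-named binding at the call site, so this sketch does not rely on any shadowing.

```rust
/// Uses `$expected` directly: it is re-evaluated on every loop iteration
/// and once more in the final assertion.
macro_rules! wait_for_naive {
    ($get:expr, $expected:expr, $retries:expr) => {{
        let mut last = $get;
        for _ in 0..$retries {
            if last == $expected {
                break;
            }
            last = $get;
        }
        assert_eq!(last, $expected);
    }};
}

/// Captures `$expected` once, so the loop and the assertion compare
/// against the same value no matter what the expression does.
macro_rules! wait_for_captured {
    ($get:expr, $expected:expr, $retries:expr) => {{
        let expected = $expected;
        let mut last = $get;
        for _ in 0..$retries {
            if last == expected {
                break;
            }
            last = $get;
        }
        assert_eq!(last, expected);
    }};
}
```

When the expected-value expression is cheap and pure the two behave the same; capturing matters once it has side effects or can change between polls.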
@@ -1,4 +1,4 @@
-use log::error;
+use log::*;
🧹 Nitpick | 🔵 Trivial
Replace wildcard import with specific log macros.
The wildcard import use log::*; is non-idiomatic and can cause namespace pollution. Prefer explicit imports of the specific macros you need (e.g., error, warn, info, debug, trace).
Apply this diff:
-use log::*;
+use log::{error, warn};

Adjust the list to include any additional log macros actually used in this file.
🤖 Prompt for AI Agents
In magicblock-processor/src/executor/processing.rs around line 1, replace the
wildcard import `use log::*;` with explicit imports of only the logging macros
this file uses (for example: `error`, `warn`, `info`, `debug`, `trace`); edit
the use statement to list those macros explicitly (and add/remove macros to
match actual usages in the file) to avoid namespace pollution and follow
idiomatic Rust.
let mut result = output.processing_results.pop().expect(
    "single transaction result is always present in the output",
);

let undelegated_feepayer_was_modified = result
    .as_ref()
    .ok()
    .and_then(|r| r.executed_transaction())
    .and_then(|txn| {
        let first_acc = txn.loaded_transaction.accounts.first();
        let rollback_lamports = rollback_feepayer_lamports(
            &txn.loaded_transaction.rollback_accounts,
        );
        first_acc.map(|acc| (acc, rollback_lamports))
    })
    .map(|(acc, rollback_lamports)| {
        // The check logic: if we have an undelegated feepayer, then
        // it cannot have been mutated. The only exception is the
        // privileged feepayer (internal validator operations), for
        // which we do allow the mutations, since it can be used to
        // fund other accounts.
        (acc.1.is_dirty()
            && (acc.1.lamports() != 0 || rollback_lamports != 0))
            && !acc.1.delegated()
            && !acc.1.privileged()
    })
    .unwrap_or_default();
let gasless = self.environment.fee_lamports_per_signature == 0;
// If we are running in the gasless mode, we should not allow
// any mutation of the feepayer account, since that would make
// it possible for malicious actors to peform transfer operations
// from undelegated feepayers to delegated accounts, which would
// result in validator loosing funds upon balance settling.
if gasless && undelegated_feepayer_was_modified {
    result = Err(TransactionError::InvalidAccountForFee);
};
(result, output.balances)
Move gasless check before computing feepayer mutation to avoid hot-path overhead.
The undelegated_feepayer_was_modified logic (lines 160-182) runs for every transaction, even when gasless is false. Since this guard only matters in gasless mode, checking gasless first and short-circuiting will avoid unnecessary work on the hot path.
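The short-circuit can be modeled with plain data to show that the inspection is skipped outside gasless mode. This is a sketch under assumptions: `FeePayer` is a hypothetical flattened view of the account flags used in the guard, and `must_reject` takes the inspection as a closure purely so the laziness is observable.

```rust
/// Hypothetical, simplified view of the fee payer flags used by the guard.
#[derive(Clone, Copy)]
struct FeePayer {
    dirty: bool,
    lamports: u64,
    delegated: bool,
    privileged: bool,
}

/// Mirrors the predicate from the code under review.
fn undelegated_feepayer_was_modified(acc: &FeePayer, rollback_lamports: u64) -> bool {
    acc.dirty
        && (acc.lamports != 0 || rollback_lamports != 0)
        && !acc.delegated
        && !acc.privileged
}

/// Returns true when the transaction should be rejected.
/// `inspect` stands in for walking the executed transaction's accounts.
fn must_reject(
    fee_lamports_per_signature: u64,
    inspect: impl FnOnce() -> Option<(FeePayer, u64)>,
) -> bool {
    let gasless = fee_lamports_per_signature == 0;
    // `&&` short-circuits, so `inspect` only runs in gasless mode.
    gasless
        && inspect()
            .map(|(acc, rollback)| undelegated_feepayer_was_modified(&acc, rollback))
            .unwrap_or_default()
}
```

The decision is unchanged for every input; only the amount of work done when fees are enabled differs.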
Additionally, fix the typos in the comment:
- Line 186: "peform" → "perform"
- Line 188: "loosing" → "losing"
Apply this diff:
let mut result = output.processing_results.pop().expect(
"single transaction result is always present in the output",
);
+ let gasless = self.environment.fee_lamports_per_signature == 0;
let undelegated_feepayer_was_modified = result
.as_ref()
.ok()
.and_then(|r| r.executed_transaction())
.and_then(|txn| {
let first_acc = txn.loaded_transaction.accounts.first();
let rollback_lamports = rollback_feepayer_lamports(
&txn.loaded_transaction.rollback_accounts,
);
first_acc.map(|acc| (acc, rollback_lamports))
})
.map(|(acc, rollback_lamports)| {
// The check logic: if we have an undelegated feepayer, then
// it cannot have been mutated. The only exception is the
// privileged feepayer (internal validator operations), for
// which we do allow the mutations, since it can be used to
// fund other accounts.
(acc.1.is_dirty()
&& (acc.1.lamports() != 0 || rollback_lamports != 0))
&& !acc.1.delegated()
&& !acc.1.privileged()
})
.unwrap_or_default();
- let gasless = self.environment.fee_lamports_per_signature == 0;
// If we are running in the gasless mode, we should not allow
// any mutation of the feepayer account, since that would make
- // it possible for malicious actors to peform transfer operations
+ // it possible for malicious actors to perform transfer operations
// from undelegated feepayers to delegated accounts, which would
- // result in validator loosing funds upon balance settling.
+ // result in validator losing funds upon balance settling.
if gasless && undelegated_feepayer_was_modified {
result = Err(TransactionError::InvalidAccountForFee);
};
Better still, wrap the entire computation in an if gasless { ... } block:
let mut result = output.processing_results.pop().expect(
"single transaction result is always present in the output",
);
- let undelegated_feepayer_was_modified = result
- .as_ref()
- .ok()
- .and_then(|r| r.executed_transaction())
- .and_then(|txn| {
- let first_acc = txn.loaded_transaction.accounts.first();
- let rollback_lamports = rollback_feepayer_lamports(
- &txn.loaded_transaction.rollback_accounts,
- );
- first_acc.map(|acc| (acc, rollback_lamports))
- })
- .map(|(acc, rollback_lamports)| {
- // The check logic: if we have an undelegated feepayer, then
- // it cannot have been mutated. The only exception is the
- // privileged feepayer (internal validator operations), for
- // which we do allow the mutations, since it can be used to
- // fund other accounts.
- (acc.1.is_dirty()
- && (acc.1.lamports() != 0 || rollback_lamports != 0))
- && !acc.1.delegated()
- && !acc.1.privileged()
- })
- .unwrap_or_default();
let gasless = self.environment.fee_lamports_per_signature == 0;
// If we are running in the gasless mode, we should not allow
// any mutation of the feepayer account, since that would make
- // it possible for malicious actors to peform transfer operations
+ // it possible for malicious actors to perform transfer operations
// from undelegated feepayers to delegated accounts, which would
- // result in validator loosing funds upon balance settling.
- if gasless && undelegated_feepayer_was_modified {
- result = Err(TransactionError::InvalidAccountForFee);
- };
+ // result in validator losing funds upon balance settling.
+ if gasless {
+ let undelegated_feepayer_was_modified = result
+ .as_ref()
+ .ok()
+ .and_then(|r| r.executed_transaction())
+ .and_then(|txn| {
+ let first_acc = txn.loaded_transaction.accounts.first();
+ let rollback_lamports = rollback_feepayer_lamports(
+ &txn.loaded_transaction.rollback_accounts,
+ );
+ first_acc.map(|acc| (acc, rollback_lamports))
+ })
+ .map(|(acc, rollback_lamports)| {
+ (acc.1.is_dirty()
+ && (acc.1.lamports() != 0 || rollback_lamports != 0))
+ && !acc.1.delegated()
+ && !acc.1.privileged()
+ })
+ .unwrap_or(false);
+
+ if undelegated_feepayer_was_modified {
+ result = Err(TransactionError::InvalidAccountForFee);
+ }
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let mut result = output.processing_results.pop().expect( | |
| "single transaction result is always present in the output", | |
| ); | |
| let undelegated_feepayer_was_modified = result | |
| .as_ref() | |
| .ok() | |
| .and_then(|r| r.executed_transaction()) | |
| .and_then(|txn| { | |
| let first_acc = txn.loaded_transaction.accounts.first(); | |
| let rollback_lamports = rollback_feepayer_lamports( | |
| &txn.loaded_transaction.rollback_accounts, | |
| ); | |
| first_acc.map(|acc| (acc, rollback_lamports)) | |
| }) | |
| .map(|(acc, rollback_lamports)| { | |
| // The check logic: if we have an undelegated feepayer, then | |
| // it cannot have been mutated. The only exception is the | |
| // privileged feepayer (internal validator operations), for | |
| // which we do allow the mutations, since it can be used to | |
| // fund other accounts. | |
| (acc.1.is_dirty() | |
| && (acc.1.lamports() != 0 || rollback_lamports != 0)) | |
| && !acc.1.delegated() | |
| && !acc.1.privileged() | |
| }) | |
| .unwrap_or_default(); | |
| let gasless = self.environment.fee_lamports_per_signature == 0; | |
| // If we are running in the gasless mode, we should not allow | |
| // any mutation of the feepayer account, since that would make | |
| // it possible for malicious actors to peform transfer operations | |
| // from undelegated feepayers to delegated accounts, which would | |
| // result in validator loosing funds upon balance settling. | |
| if gasless && undelegated_feepayer_was_modified { | |
| result = Err(TransactionError::InvalidAccountForFee); | |
| }; | |
| (result, output.balances) | |
| let mut result = output.processing_results.pop().expect( | |
| "single transaction result is always present in the output", | |
| ); | |
| let gasless = self.environment.fee_lamports_per_signature == 0; | |
| // If we are running in the gasless mode, we should not allow | |
| // any mutation of the feepayer account, since that would make | |
| // it possible for malicious actors to perform transfer operations | |
| // from undelegated feepayers to delegated accounts, which would | |
| // result in validator losing funds upon balance settling. | |
| if gasless { | |
| let undelegated_feepayer_was_modified = result | |
| .as_ref() | |
| .ok() | |
| .and_then(|r| r.executed_transaction()) | |
| .and_then(|txn| { | |
| let first_acc = txn.loaded_transaction.accounts.first(); | |
| let rollback_lamports = rollback_feepayer_lamports( | |
| &txn.loaded_transaction.rollback_accounts, | |
| ); | |
| first_acc.map(|acc| (acc, rollback_lamports)) | |
| }) | |
| .map(|(acc, rollback_lamports)| { | |
| (acc.1.is_dirty() | |
| && (acc.1.lamports() != 0 || rollback_lamports != 0)) | |
| && !acc.1.delegated() | |
| && !acc.1.privileged() | |
| }) | |
| .unwrap_or(false); | |
| if undelegated_feepayer_was_modified { | |
| result = Err(TransactionError::InvalidAccountForFee); | |
| } | |
| } | |
| (result, output.balances) |
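To unit-test the guard in isolation, the predicate can be pulled out into a pure function. The following is a minimal sketch, not code from this PR: `FeepayerState`, `undelegated_feepayer_was_modified` (as a free function), and `should_reject` are hypothetical names that stand in for the flags the real code reads through `acc.1.is_dirty()`, `acc.1.lamports()`, `acc.1.delegated()`, and `acc.1.privileged()`.

```rust
/// Hypothetical snapshot of the feepayer flags used by the guard.
/// The real code reads these from the loaded account (`acc.1`).
#[derive(Clone, Copy, Debug, Default)]
pub struct FeepayerState {
    pub dirty: bool,
    pub lamports: u64,
    pub rollback_lamports: u64,
    pub delegated: bool,
    pub privileged: bool,
}

/// Mirrors the boolean expression from the suggested diff: a dirty,
/// non-empty feepayer that is neither delegated nor privileged.
pub fn undelegated_feepayer_was_modified(fp: FeepayerState) -> bool {
    fp.dirty
        && (fp.lamports != 0 || fp.rollback_lamports != 0)
        && !fp.delegated
        && !fp.privileged
}

/// Returns true when the transaction should be rejected. Because `&&`
/// short-circuits, the predicate is only evaluated in gasless mode.
pub fn should_reject(fee_lamports_per_signature: u64, fp: FeepayerState) -> bool {
    let gasless = fee_lamports_per_signature == 0;
    gasless && undelegated_feepayer_was_modified(fp)
}
```

With this shape, each exemption (delegated, privileged, empty account, non-gasless mode) can be covered by a one-line assertion without constructing a full transaction result.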
| assert_cloned_as_delegated_with_retries, assert_cloned_as_undelegated, | ||
| assert_not_subscribed, assert_subscribed_without_delegation_record, | ||
| testing::init_logger, | ||
| }; |
Test updates correctly use the retry macro, but depend on accurate slot/owner checks in the helper
- Switching to assert_cloned_as_delegated_with_retries!(..., account.remote_slot(), program_flexi_counter::id(), 30) in both delegated phases, plus the added sleep_ms(1_500).await delays, is a good way to harden this flow against eventual-consistency/pubsub lag.
- With the retry interval fixed at 100ms, your worst-case wait per phase is ~4.5s (1.5s initial sleep + 30 * 100ms), which is acceptable for this integration test but worth keeping in mind if you extend this pattern to many tests.
One caveat: the correctness of the slot/owner checks here hinges on the helper macro capturing the slot/owner expressions as they are at the call site. As written, the macro can accidentally compare values against themselves when $slot / $owner expressions reference account, so please apply the expected_slot / expected_owner capture refactor in assert_cloned_as_delegated_with_retries! (see previous comment) to ensure these assertions genuinely validate the remote slot and owner rather than only delegated().
Also applies to: 40-46, 62-62, 80-80, 84-90
🤖 Prompt for AI Agents
In test-integration/test-chainlink/tests/ix_06_redeleg_us_separate_slots.rs
around lines 8-11 (and also affecting lines 40-46, 62, 80, 84-90), the test
switches to using assert_cloned_as_delegated_with_retries! with
account.remote_slot()/account.owner() but the macro currently evaluates those
expressions in place causing it to potentially compare values against
themselves; refactor the macro to capture the caller expressions into local
bindings like expected_slot and expected_owner at the start of the macro
(evaluate and bind $slot and $owner once), then use those bindings in all
subsequent comparisons and retries so the assertions validate the remote
slot/owner against the originally passed expected values; ensure the macro still
uses the retry loop and sleep delays as before and update all call sites listed
to rely on the new captured names.
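The capture refactor described above could look roughly like the sketch below. It is a simplified, synchronous illustration under stated assumptions: the `Account` struct, its fields, and the macro name `assert_delegated_with_retries!` are hypothetical stand-ins, and the real helper is async and operates on the chainlink test context rather than on a fetch expression.

```rust
/// Hypothetical stand-in for the cloned account under test.
#[derive(Clone, Debug)]
pub struct Account {
    pub remote_slot: u64,
    pub owner: u8,
    pub delegated: bool,
}

/// Re-evaluates `$fetch` on every attempt until the fetched account
/// matches. `$slot` and `$owner` are evaluated exactly once, before
/// the loop, and only the bound values are used in the comparisons.
macro_rules! assert_delegated_with_retries {
    ($fetch:expr, $slot:expr, $owner:expr, $retries:expr) => {{
        // Bind the caller's expressions once, up front.
        let expected_slot: u64 = $slot;
        let expected_owner: u8 = $owner;
        let mut remaining: u32 = $retries;
        loop {
            let fetched: Option<Account> = $fetch;
            let matches = fetched.as_ref().is_some_and(|a| {
                a.delegated
                    && a.remote_slot == expected_slot
                    && a.owner == expected_owner
            });
            if matches {
                break;
            }
            // Fail with the last observed state once retries run out.
            assert!(remaining > 0, "account never matched: {:?}", fetched);
            remaining -= 1;
            // Fixed 100ms interval, as in the reviewed tests.
            std::thread::sleep(std::time::Duration::from_millis(100));
        }
    }};
}
```

A call such as `assert_delegated_with_retries!(fetch(), expected.remote_slot, expected.owner, 30)` then checks every freshly fetched account against values fixed at the call site, and waits at most 30 * 100ms before failing.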
| use solana_sdk::{signature::Keypair, signer::Signer}; | ||
| use test_chainlink::ixtest_context::IxtestContext; | ||
|
|
||
| #[ignore = "Started failing when fixing excessive subs, last time passing ded9c50a"] |
Critical: Test disabled rather than fixed after subscription management changes.
Ignoring this test means the "redelegate to us in same slot" flow is no longer validated. The failure reason explicitly ties to the subscription fixes in this PR, suggesting the new subscription management may have introduced a regression in this flow.
Action required:
- Investigate why the subscription changes broke this test
- Either fix the test expectations to match new behavior, or
- Fix the subscription management code to handle this flow correctly
Disabled tests provide zero value and can hide production bugs.
| info!("1. Account delegated to us"); | ||
|
|
||
| ctx.chainlink.ensure_accounts(&pubkeys, None).await.unwrap(); | ||
| sleep_ms(500).await; |
Major: Hard-coded sleep introduces fragile timing dependency.
The 500ms sleep calls are a brittle workaround for asynchronous pubsub updates. This approach has several problems:
- Flakiness risk: 500ms may be insufficient on slower CI systems or under load
- Inefficiency: May be excessive on fast systems, unnecessarily slowing test suite
- Logical inconsistency: The test asserts accounts delegated to "us" should NOT be subscribed (assert_not_subscribed!), yet adds waits for "pubsub subscription handler". If these accounts aren't subscribed, why wait for subscription events?
Recommended approach:
Instead of arbitrary sleeps, use event-driven synchronization:
- Add a method to poll/wait for subscription state to stabilize
- Use condition variables or channels to signal when pubsub updates complete
- Consider exposing a "wait_for_subscriptions_settled()" test helper
Example pattern:
// Instead of:
ctx.chainlink.ensure_accounts(&pubkeys, None).await.unwrap();
sleep_ms(500).await;
// Use:
ctx.chainlink.ensure_accounts(&pubkeys, None).await.unwrap();
ctx.chainlink.wait_for_subscription_state_stable().await;
This would provide deterministic, reliable tests without arbitrary timing.
Also applies to: 63-65
🤖 Prompt for AI Agents
In test-integration/test-chainlink/tests/ix_07_redeleg_us_same_slot.rs around
lines 37 and 63-65, the test uses hard-coded sleep_ms(500).await which creates
fragile timing dependencies; replace these sleeps with an event-driven wait: add
or use a helper like
wait_for_subscription_state_stable()/wait_for_pubsub_updates() on ctx.chainlink
that polls the subscription state or awaits a channel/condition variable
signaled by the pubsub handler, call that helper immediately after
ensure_accounts(...) and any operations that trigger pubsub updates, and remove
the fixed sleep calls so the test deterministically waits until subscriptions
have settled before asserting with assert_not_subscribed!.
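If a fully event-driven signal is not available, a bounded polling helper is a smaller step away from fixed sleeps. The sketch below is a polling variant, not the channel-based approach recommended above: `wait_until` is a hypothetical name, it is synchronous for the sake of a self-contained example, and an async test would presumably swap `std::thread::sleep` for `tokio::time::sleep`.

```rust
use std::time::{Duration, Instant};

/// Polls `condition` every `interval` until it returns true or
/// `timeout` elapses. Returns whether the condition was met.
pub fn wait_until<F>(mut condition: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        // Check first so an already-satisfied condition returns at once.
        if condition() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(interval);
    }
}
```

Unlike `sleep_ms(500)`, this returns as soon as the condition holds on fast machines and can be given a generous timeout for slow CI runners, although it still cannot prove that no further pubsub update will arrive later.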
Summary by CodeRabbit
New Features
Bug Fixes
Improvements
Enhances subscription management and reliability by adding metrics and robust reconnection logic with automatic resubscription.
Details
Subscription Management Improvements
Metrics and Monitoring
subscription_count method to ChainPubsubClient trait for tracking active subscriptions across all clients
RemoteAccountProviderConfig
Reconnection and Reliability
ReconnectableClient trait
Testing and Integration
07_subscription_limits.rs to test large-scale subscription scenarios (400 accounts)
Code Quality
ReconnectableClient trait for better abstraction
ureq and url for metrics fetching in integration tests