@flemzord
Member

Optimize query performance for single-ledger buckets by conditionally skipping the WHERE ledger = ? clause when a bucket contains only one ledger. This reduces unnecessary filtering and can provide a 5-15% performance improvement in single-ledger deployments.

Implementation:
- Add singleLedgerOptimization cache to ledger Store
- Add CountLedgersInBucket to system store
- Detect single-ledger state on CreateLedger and OpenLedger
- Refactor all query builders to use conditional filtering

Changes:
- internal/storage/ledger/store.go: Add cache and helper methods
- internal/storage/system/store.go: Add CountLedgersInBucket
- internal/storage/driver/driver.go: Detect single-ledger state
- internal/storage/ledger/resource_*.go: Apply conditional filtering
- internal/storage/ledger/{accounts,logs,transactions}.go: Apply conditional filtering
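
The conditional filtering listed above can be illustrated with a minimal sketch. The type and method names here (`store`, `singleLedgerCache`, `ledgerFilter`, `buildSelect`) are hypothetical stand-ins, not the PR's actual code, and plain string concatenation replaces whatever query builder the repository uses.

```go
package main

import (
	"fmt"
	"sync"
)

// singleLedgerCache is a hypothetical stand-in for the cached state:
// a boolean guarded by a read/write mutex.
type singleLedgerCache struct {
	mu      sync.RWMutex
	enabled bool
}

// store is a simplified stand-in for the ledger store.
type store struct {
	ledgerName string
	cache      singleLedgerCache
}

// isSingleLedger reports whether the ledger filter may be skipped.
func (s *store) isSingleLedger() bool {
	s.cache.mu.RLock()
	defer s.cache.mu.RUnlock()
	return s.cache.enabled
}

// ledgerFilter returns the WHERE fragment and its arguments, or an empty
// fragment when the bucket is known to hold a single ledger.
func (s *store) ledgerFilter(tableAlias string) (string, []any) {
	if s.isSingleLedger() {
		return "", nil
	}
	return fmt.Sprintf("%s.ledger = ?", tableAlias), []any{s.ledgerName}
}

// buildSelect assembles a query and appends the ledger filter only if needed.
func (s *store) buildSelect(table, alias string) (string, []any) {
	query := fmt.Sprintf("SELECT * FROM %s AS %s", table, alias)
	clause, args := s.ledgerFilter(alias)
	if clause != "" {
		query += " WHERE " + clause
	}
	return query, args
}

func main() {
	s := &store{ledgerName: "default"}

	// Multi-ledger (default) state: the filter is applied.
	q, args := s.buildSelect("transactions", "t")
	fmt.Println(q, args)

	// Single-ledger state: the filter is skipped.
	s.cache.mu.Lock()
	s.cache.enabled = true
	s.cache.mu.Unlock()
	q, args = s.buildSelect("transactions", "t")
	fmt.Println(q, args)
}
```

In this sketch the zero value of the cache keeps the filter on, so a store that never ran detection behaves like a multi-ledger store.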
@flemzord requested a review from a team as a code owner on October 29, 2025 at 16:55
coderabbitai bot commented Oct 29, 2025

Walkthrough

The pull request introduces a single-ledger optimization mechanism. Changes refactor ledger filtering across multiple query builders to conditionally apply filters based on cached ledger state, add infrastructure to track and update that state, and integrate trigger calls in driver initialization.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Single-Ledger Optimization Infrastructure | internal/storage/ledger/store.go, internal/storage/system/store.go | Introduces singleLedgerOptimization struct with cache state and internal helpers (isSingleLedger, applyLedgerFilter, getLedgerFilterSQL, UpdateSingleLedgerState) to manage conditional ledger filtering. Adds public CountLedgersInBucket method to system store to count ledgers by bucket. |
| Driver Integration | internal/storage/driver/driver.go | Adds non-fatal post-processing calls in CreateLedger and OpenLedger to refresh single-ledger cache state via ledger counting in the associated bucket. |
| Query Builders: Ledger-Scoped Filtering | internal/storage/ledger/accounts.go, internal/storage/ledger/logs.go, internal/storage/ledger/transactions.go | Refactors query construction in DeleteAccountMetadata, ReadLogWithIdempotencyKey, and transaction methods (updateTxWithRetrieve, RevertTransaction, UpdateTransactionMetadata, DeleteTransactionMetadata) to replace hard-coded ledger equality filters with conditional applyLedgerFilter or getLedgerFilterSQL calls. |
| Query Builders: Resource Aggregations | internal/storage/ledger/resource_accounts.go, internal/storage/ledger/resource_aggregated_balances.go, internal/storage/ledger/resource_logs.go, internal/storage/ledger/resource_transactions.go, internal/storage/ledger/resource_volumes.go | Systematically replaces explicit ledger filters across accounts, moves, accounts_metadata, and accounts_volumes queries with centralized applyLedgerFilter helper. Includes restructuring of innerMostQuery in transactions and address array handling in volumes. |

Sequence Diagram(s)

sequenceDiagram
    participant Driver as driver.go
    participant Store as ledger/store.go
    participant System as system/store.go
    participant DB as Database
    
    Driver->>Store: CreateLedger/OpenLedger
    Driver->>Store: UpdateSingleLedgerState(callback)
    Store->>System: CountLedgersInBucket(ctx, bucket)
    System->>DB: SELECT COUNT(*) FROM ledger WHERE bucket = ?
    DB-->>System: count
    System-->>Store: count or error (debug log)
    Store->>Store: Update singleLedgerCache.enabled
    
    Note over Store: Future queries check cache
    Driver->>Store: Query (e.g., ReadLog)
    Store->>Store: isSingleLedger() check
    alt Single ledger optimized
        Store->>DB: Query without ledger filter
    else Multi-ledger
        Store->>DB: Query with ledger filter (applyLedgerFilter)
    end
    DB-->>Store: result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Multiple heterogeneous refactorings: Query builders refactored across 9 files with distinct patterns (conditional filters, mutable query objects, helper function calls, subquery restructuring).
  • Centralized optimization logic: New caching and filtering helpers introduce state management that must be validated against query semantics across all call sites.
  • Critical files requiring attention:
    • internal/storage/ledger/resource_transactions.go — innerMostQuery restructuring and filtering logic changes require careful trace-through for correctness.
    • internal/storage/ledger/resource_volumes.go — Complex query with address array handling and multiple ledger filter applications.
    • internal/storage/ledger/store.go — Optimization state machine logic and filter helper implementations must be sound across all contexts.
    • internal/storage/ledger/transactions.go — Multiple fallback and update paths with conditional ledger filtering.

Poem

🐰 A cache springs forth, one ledger or more,
Queries now dance by a conditional door,
Where applyLedgerFilter hops through the night,
Filtering swift, optimization in sight—
One store to rule them all, light and tight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The pull request title "fix: query oprimization with where" is related to the main change in the changeset, which involves optimizing query performance by conditionally applying WHERE ledger = ? filters based on single-ledger optimization state. The title refers to a real aspect of the change (query optimization related to WHERE clauses) and does convey meaningful information to a reviewer. However, the title contains a typo ("oprimization" instead of "optimization") and could be more specific about the single-ledger optimization mechanism that is the core of this change. Despite these quality issues, the title communicates the essential nature of the modification. |
| Description Check | ✅ Passed | The pull request description is directly related to the changeset and provides clear, meaningful context about the modification. It explicitly describes the optimization strategy (conditionally skipping the WHERE ledger = ? clause for single-ledger buckets), explains the motivation (reducing unnecessary filtering), and quantifies the expected benefit (5-15% performance improvement). The description accurately reflects the actual changes made across the storage layer files and the new optimization mechanism introduced in store.go. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5b2623d and ba0a774.

📒 Files selected for processing (11)
  • internal/storage/driver/driver.go (2 hunks)
  • internal/storage/ledger/accounts.go (1 hunks)
  • internal/storage/ledger/logs.go (1 hunks)
  • internal/storage/ledger/resource_accounts.go (6 hunks)
  • internal/storage/ledger/resource_aggregated_balances.go (4 hunks)
  • internal/storage/ledger/resource_logs.go (1 hunks)
  • internal/storage/ledger/resource_transactions.go (3 hunks)
  • internal/storage/ledger/resource_volumes.go (4 hunks)
  • internal/storage/ledger/store.go (4 hunks)
  • internal/storage/ledger/transactions.go (4 hunks)
  • internal/storage/system/store.go (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-04-29T11:24:28.923Z
Learnt from: gfyrag
PR: formancehq/ledger#892
File: internal/controller/ledger/controller_default.go:196-196
Timestamp: 2025-04-29T11:24:28.923Z
Learning: In the ledger Import function, it's critical to maintain proper log ID tracking by updating lastLogID with the current log.ID after each processed log, rather than setting it to nil. This ensures the system can properly validate the ordering of logs and prevent duplicate or out-of-order processing, which is essential for maintaining data integrity in the ledger.

Applied to files:

  • internal/storage/ledger/logs.go
🧬 Code graph analysis (2)
internal/storage/system/store.go (1)
internal/ledger.go (1)
  • Ledger (18-26)
internal/storage/ledger/store.go (2)
internal/ledger.go (1)
  • Ledger (18-26)
internal/storage/system/store.go (1)
  • Store (20-32)
🔇 Additional comments (9)
internal/storage/system/store.go (1)

65-74: Count query looks solid.

Thanks for adding the bucket-level count with proper error wrapping; this will plug neatly into the optimization hook.

internal/storage/ledger/logs.go (1)

121-129: Centralized ledger scoping looks good.

applyLedgerFilter keeps the idempotency lookup aligned with the single-ledger optimization while preserving the existing limit/idempotency guard. No issues spotted.

internal/storage/ledger/accounts.go (1)

89-99: Update path still respects ledger isolation.

Deferring the ledger predicate through getLedgerFilterSQL keeps multi-ledger buckets safe while letting single-ledger setups bypass the extra WHERE clause. Looks solid.

internal/storage/ledger/resource_aggregated_balances.go (1)

25-104: Ledger filter helper applied consistently.

Every branch (PIT, metadata, partial address) now routes through applyLedgerFilter, so the dataset stays ledger-scoped without redundant predicates. Implementation looks correct.

internal/storage/ledger/resource_transactions.go (1)

33-149: Transactions dataset keeps proper ledger scoping.

The helper-based filtering covers the base query, metadata history, and the effective volume expansion, matching the optimization intent with no functional regressions observed.

internal/storage/ledger/resource_volumes.go (1)

37-125: Volume queries now share the optimized ledger filter.

Both real-time and history code paths pick up applyLedgerFilter, keeping joins in sync and avoiding redundant WHERE clauses. Looks good to ship.

internal/storage/ledger/resource_accounts.go (3)

60-108: Balance resolution refactoring is correct.

The balance resolution logic correctly applies ledger filtering via applyLedgerFilter in both the PIT (line 80, moves) and non-PIT (line 85, accounts_volumes) branches. The removal of the explicit ledger filter from line 69 is appropriate since filtering is now handled consistently through the centralized helper.


28-58: Ledger filter refactoring is correct and consistently applied.

The applyLedgerFilter implementation correctly handles both scenarios:

  • Single-ledger deployments: returns query unmodified (optimization)
  • Multi-ledger deployments: applies WHERE tableAlias.ledger = ? filter

The refactoring in resource_accounts.go follows the same pattern used consistently across 8 files (transactions, logs, volumes, aggregated_balances, etc.) with 24+ invocations, all using appropriate table aliases.


114-160: Expand method refactoring is consistent and functional tests already validate this code.

The ledger filtering is correctly applied to both the moves (line 133) and accounts_volumes (line 148) relations. Functional tests in accounts_test.go already cover the Expand method for both "volumes" and "effectiveVolumes" properties, including PIT and non-PIT scenarios.

Comment on lines +299 to +307
func (store *Store) UpdateSingleLedgerState(ctx context.Context, countFunc func(ctx context.Context, bucketName string) (int, error)) error {
	count, err := countFunc(ctx, store.ledger.Bucket)
	if err != nil {
		return fmt.Errorf("failed to count ledgers in bucket: %w", err)
	}

	store.singleLedgerCache.mu.Lock()
	defer store.singleLedgerCache.mu.Unlock()
	store.singleLedgerCache.enabled = (count == 1)
⚠️ Potential issue | 🔴 Critical

Prevent stale single-ledger cache from leaking writes across ledgers.

Once UpdateSingleLedgerState flips enabled to true, this store stops adding ledger = ?. If another ledger is later added to the same bucket, any long-lived store that already cached enabled=true never revalidates and will continue skipping the filter forever. At that point operations such as UpdateTransactionMetadata, DeleteTransactionMetadata, etc. start touching rows belonging to the newly added ledger—data corruption. Please add an invalidation strategy (e.g., bucket-scoped shared state, versioning, or periodic/conditional re-count) so every store reliably disables the optimization as soon as the bucket stops being single-ledger.
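
One of the suggested strategies, bucket-scoped shared state, could be sketched as follows. This is only an illustration under stated assumptions: `registry`, `bucketState`, `setLedgerCount`, and `needsLedgerFilter` are hypothetical names, and the state lives in process memory, so it does not by itself cover several processes sharing one database.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// bucketState is shared by every store attached to the same bucket, so a
// change in ledger count is visible to long-lived stores immediately.
type bucketState struct {
	ledgerCount atomic.Int64
}

func (b *bucketState) isSingleLedger() bool {
	return b.ledgerCount.Load() == 1
}

// registry hands out one bucketState per bucket name.
type registry struct {
	mu      sync.Mutex
	buckets map[string]*bucketState
}

func newRegistry() *registry {
	return &registry{buckets: make(map[string]*bucketState)}
}

// stateFor returns the shared state for a bucket, creating it on first use.
// A new state has count 0, which keeps the ledger filter enabled.
func (r *registry) stateFor(bucket string) *bucketState {
	r.mu.Lock()
	defer r.mu.Unlock()
	st, ok := r.buckets[bucket]
	if !ok {
		st = &bucketState{}
		r.buckets[bucket] = st
	}
	return st
}

// setLedgerCount records a fresh count, e.g. after creating or opening a ledger.
func (r *registry) setLedgerCount(bucket string, n int64) {
	r.stateFor(bucket).ledgerCount.Store(n)
}

// store is a simplified ledger store that consults the shared state
// instead of a private cached flag.
type store struct {
	ledger string
	state  *bucketState
}

func (s *store) needsLedgerFilter() bool {
	return !s.state.isSingleLedger()
}

func main() {
	reg := newRegistry()
	reg.setLedgerCount("bucket-a", 1)
	first := &store{ledger: "first", state: reg.stateFor("bucket-a")}
	fmt.Println("filter needed:", first.needsLedgerFilter())

	// A second ledger joins the bucket; the existing store sees it at once.
	reg.setLedgerCount("bucket-a", 2)
	fmt.Println("filter needed:", first.needsLedgerFilter())
}
```

Because each store holds a pointer to the shared state rather than its own copy of the flag, there is no per-store value left to go stale within the process.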
