fix: query oprimization with where #1092

flemzord · 2025-10-29T16:55:24Z

Optimize query performance for single-ledger buckets by conditionally skipping the `WHERE ledger = ?` clause when a bucket contains only one ledger. This avoids unnecessary filtering and can provide a 5-15% performance improvement in single-ledger deployments.

Implementation:
- Add singleLedgerOptimization cache to ledger Store
- Add CountLedgersInBucket to system store
- Detect single-ledger state on CreateLedger and OpenLedger
- Refactor all query builders to use conditional filtering

Changes:
- internal/storage/ledger/store.go: Add cache and helper methods
- internal/storage/system/store.go: Add CountLedgersInBucket
- internal/storage/driver/driver.go: Detect single-ledger state
- internal/storage/ledger/resource_*.go: Apply conditional filtering
- internal/storage/ledger/{accounts,logs,transactions}.go: Apply conditional filtering

coderabbitai · 2025-10-29T16:55:35Z

Walkthrough

The pull request introduces a single-ledger optimization mechanism. Changes refactor ledger filtering across multiple query builders to conditionally apply filters based on cached ledger state, add infrastructure to track and update that state, and integrate trigger calls in driver initialization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Single-Ledger Optimization Infrastructure: `internal/storage/ledger/store.go`, `internal/storage/system/store.go` | Introduces `singleLedgerOptimization` struct with cache state and internal helpers (`isSingleLedger`, `applyLedgerFilter`, `getLedgerFilterSQL`, `UpdateSingleLedgerState`) to manage conditional ledger filtering. Adds public `CountLedgersInBucket` method to system store to count ledgers by bucket. |
| Driver Integration: `internal/storage/driver/driver.go` | Adds non-fatal post-processing calls in `CreateLedger` and `OpenLedger` to refresh single-ledger cache state via ledger counting in the associated bucket. |
| Query Builders, Ledger-Scoped Filtering: `internal/storage/ledger/accounts.go`, `internal/storage/ledger/logs.go`, `internal/storage/ledger/transactions.go` | Refactors query construction in `DeleteAccountMetadata`, `ReadLogWithIdempotencyKey`, and transaction methods (`updateTxWithRetrieve`, `RevertTransaction`, `UpdateTransactionMetadata`, `DeleteTransactionMetadata`) to replace hard-coded ledger equality filters with conditional `applyLedgerFilter` or `getLedgerFilterSQL` calls. |
| Query Builders, Resource Aggregations: `internal/storage/ledger/resource_accounts.go`, `internal/storage/ledger/resource_aggregated_balances.go`, `internal/storage/ledger/resource_logs.go`, `internal/storage/ledger/resource_transactions.go`, `internal/storage/ledger/resource_volumes.go` | Systematically replaces explicit ledger filters across accounts, moves, accounts_metadata, and accounts_volumes queries with centralized `applyLedgerFilter` helper. Includes restructuring of `innerMostQuery` in transactions and address array handling in volumes. |

Sequence Diagram(s)

sequenceDiagram
    participant Driver as driver.go
    participant Store as ledger/store.go
    participant System as system/store.go
    participant DB as Database
    
    Driver->>Store: CreateLedger/OpenLedger
    Driver->>Store: UpdateSingleLedgerState(callback)
    Store->>System: CountLedgersInBucket(ctx, bucket)
    System->>DB: SELECT COUNT(*) FROM ledger WHERE bucket = ?
    DB-->>System: count
    System-->>Store: count or error (debug log)
    Store->>Store: Update singleLedgerCache.enabled
    
    Note over Store: Future queries check cache
    Driver->>Store: Query (e.g., ReadLog)
    Store->>Store: isSingleLedger() check
    alt Single ledger optimized
        Store->>DB: Query without ledger filter
    else Multi-ledger
        Store->>DB: Query with ledger filter (applyLedgerFilter)
    end
    DB-->>Store: result
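
A minimal Go sketch of the refresh step in the diagram, assuming a simplified `store` type that models only the cache. The callback shape follows `UpdateSingleLedgerState` as quoted in this PR; `openLedger`, `fixedCount`, and `errCountFailed` are hypothetical names introduced for illustration, and printing stands in for the debug log.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// countFunc has the callback shape that UpdateSingleLedgerState accepts in the PR.
type countFunc func(ctx context.Context, bucketName string) (int, error)

// errCountFailed is a hypothetical error used to simulate a failing count query.
var errCountFailed = errors.New("count failed")

// fixedCount returns a countFunc that always yields n and err (illustration only).
func fixedCount(n int, err error) countFunc {
	return func(context.Context, string) (int, error) { return n, err }
}

// store is a simplified stand-in for the ledger store; only the cache is modelled.
type store struct {
	bucket string
	mu     sync.RWMutex
	single bool // zero value is false, so the ledger filter stays on by default
}

// updateSingleLedgerState re-counts the bucket and caches whether it holds exactly one ledger.
func (s *store) updateSingleLedgerState(ctx context.Context, count countFunc) error {
	n, err := count(ctx, s.bucket)
	if err != nil {
		return fmt.Errorf("failed to count ledgers in bucket: %w", err)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.single = n == 1
	return nil
}

func (s *store) isSingleLedger() bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.single
}

// openLedger shows the non-fatal hook: a failed count is logged and the store
// is still returned, with the optimization left disabled.
func openLedger(ctx context.Context, bucket string, count countFunc) *store {
	s := &store{bucket: bucket}
	if err := s.updateSingleLedgerState(ctx, count); err != nil {
		fmt.Println("debug:", err) // stand-in for the debug log in the diagram
	}
	return s
}

func main() {
	ctx := context.Background()
	fmt.Println(openLedger(ctx, "b1", fixedCount(1, nil)).isSingleLedger()) // true
	fmt.Println(openLedger(ctx, "b1", fixedCount(2, nil)).isSingleLedger()) // false
	// Prints a debug line first, then false: the filter stays on when counting fails.
	fmt.Println(openLedger(ctx, "b1", fixedCount(0, errCountFailed)).isSingleLedger())
}
```

Defaulting the flag to false means a failed or skipped count can only cost performance, never ledger isolation.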

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Multiple heterogeneous refactorings: Query builders refactored across 9 files with distinct patterns (conditional filters, mutable query objects, helper function calls, subquery restructuring).
Centralized optimization logic: New caching and filtering helpers introduce state management that must be validated against query semantics across all call sites.
Critical files requiring attention:
- internal/storage/ledger/resource_transactions.go — innerMostQuery restructuring and filtering logic changes require careful trace-through for correctness.
- internal/storage/ledger/resource_volumes.go — Complex query with address array handling and multiple ledger filter applications.
- internal/storage/ledger/store.go — Optimization state machine logic and filter helper implementations must be sound across all contexts.
- internal/storage/ledger/transactions.go — Multiple fallback and update paths with conditional ledger filtering.

Poem

🐰 A cache springs forth, one ledger or more,
Queries now dance by a conditional door,
Where applyLedgerFilter hops through the night,
Filtering swift, optimization in sight—
One store to rule them all, light and tight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The pull request title "fix: query oprimization with where" is related to the main change in the changeset, which involves optimizing query performance by conditionally applying `WHERE ledger = ?` filters based on single-ledger optimization state. The title refers to a real aspect of the change (query optimization related to WHERE clauses) and does convey meaningful information to a reviewer. However, the title contains a typo ("oprimization" instead of "optimization") and could be more specific about the single-ledger optimization mechanism that is the core of this change. Despite these quality issues, the title communicates the essential nature of the modification. |
| Description Check | ✅ Passed | The pull request description is directly related to the changeset and provides clear, meaningful context about the modification. It explicitly describes the optimization strategy (conditionally skipping the `WHERE ledger = ?` clause for single-ledger buckets), explains the motivation (reducing unnecessary filtering), and quantifies the expected benefit (5-15% performance improvement). The description accurately reflects the actual changes made across the storage layer files and the new optimization mechanism introduced in store.go. |


coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5b2623d and ba0a774.

📒 Files selected for processing (11)

internal/storage/driver/driver.go (2 hunks)
internal/storage/ledger/accounts.go (1 hunks)
internal/storage/ledger/logs.go (1 hunks)
internal/storage/ledger/resource_accounts.go (6 hunks)
internal/storage/ledger/resource_aggregated_balances.go (4 hunks)
internal/storage/ledger/resource_logs.go (1 hunks)
internal/storage/ledger/resource_transactions.go (3 hunks)
internal/storage/ledger/resource_volumes.go (4 hunks)
internal/storage/ledger/store.go (4 hunks)
internal/storage/ledger/transactions.go (4 hunks)
internal/storage/system/store.go (2 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-04-29T11:24:28.923Z

Learnt from: gfyrag
PR: formancehq/ledger#892
File: internal/controller/ledger/controller_default.go:196-196
Timestamp: 2025-04-29T11:24:28.923Z
Learning: In the ledger Import function, it's critical to maintain proper log ID tracking by updating lastLogID with the current log.ID after each processed log, rather than setting it to nil. This ensures the system can properly validate the ordering of logs and prevent duplicate or out-of-order processing, which is essential for maintaining data integrity in the ledger.

Applied to files:

internal/storage/ledger/logs.go

🧬 Code graph analysis (2)

internal/storage/system/store.go (1)

internal/ledger.go (1)

Ledger (18-26)

internal/storage/ledger/store.go (2)

internal/ledger.go (1)

Ledger (18-26)

internal/storage/system/store.go (1)

Store (20-32)

🔇 Additional comments (9)

internal/storage/system/store.go (1)

65-74: Count query looks solid.

Thanks for adding the bucket-level count with proper error wrapping; this will plug neatly into the optimization hook.

internal/storage/ledger/logs.go (1)

121-129: Centralized ledger scoping looks good.

applyLedgerFilter keeps the idempotency lookup aligned with the single-ledger optimization while preserving the existing limit/idempotency guard. No issues spotted.

internal/storage/ledger/accounts.go (1)

89-99: Update path still respects ledger isolation.

Deferring the ledger predicate through getLedgerFilterSQL keeps multi-ledger buckets safe while letting single-ledger setups bypass the extra WHERE clause. Looks solid.

internal/storage/ledger/resource_aggregated_balances.go (1)

25-104: Ledger filter helper applied consistently.

Every branch (PIT, metadata, partial address) now routes through applyLedgerFilter, so the dataset stays ledger-scoped without redundant predicates. Implementation looks correct.

internal/storage/ledger/resource_transactions.go (1)

33-149: Transactions dataset keeps proper ledger scoping.

The helper-based filtering covers the base query, metadata history, and the effective volume expansion, matching the optimization intent with no functional regressions observed.

internal/storage/ledger/resource_volumes.go (1)

37-125: Volume queries now share the optimized ledger filter.

Both real-time and history code paths pick up applyLedgerFilter, keeping joins in sync and avoiding redundant WHERE clauses. Looks good to ship.

internal/storage/ledger/resource_accounts.go (3)

60-108: Balance resolution refactoring is correct.

The balance resolution logic correctly applies ledger filtering via applyLedgerFilter in both the PIT (line 80, moves) and non-PIT (line 85, accounts_volumes) branches. The removal of the explicit ledger filter from line 69 is appropriate since filtering is now handled consistently through the centralized helper.

28-58: Ledger filter refactoring is correct and consistently applied.

The applyLedgerFilter implementation correctly handles both scenarios:

Single-ledger deployments: returns query unmodified (optimization)

Multi-ledger deployments: applies WHERE tableAlias.ledger = ? filter

The refactoring in resource_accounts.go follows the same pattern used consistently across 8 files (transactions, logs, volumes, aggregated_balances, etc.) with 24+ invocations, all using appropriate table aliases.
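
The two scenarios above can be sketched in Go as follows. This is an illustration under stated assumptions, not the PR's code: `query` is a toy builder standing in for the real query builder, and the signature of `applyLedgerFilter` as well as the `store` fields are guesses made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// query is a toy stand-in for the real query builder; it only collects WHERE predicates.
type query struct {
	table  string
	wheres []string
	args   []any
}

// where returns a copy of q with one more predicate, leaving q untouched.
func (q query) where(pred string, args ...any) query {
	q.wheres = append(append([]string(nil), q.wheres...), pred)
	q.args = append(append([]any(nil), q.args...), args...)
	return q
}

// String renders the collected predicates as a SELECT statement.
func (q query) String() string {
	sql := "SELECT * FROM " + q.table
	if len(q.wheres) > 0 {
		sql += " WHERE " + strings.Join(q.wheres, " AND ")
	}
	return sql
}

// store models only what the filter needs (hypothetical fields).
type store struct {
	ledger       string
	singleLedger bool
}

// applyLedgerFilter returns the query unchanged for a single-ledger bucket and
// adds "<alias>.ledger = ?" otherwise. The signature is an assumption.
func (s store) applyLedgerFilter(q query, tableAlias string) query {
	if s.singleLedger {
		return q
	}
	return q.where(tableAlias+".ledger = ?", s.ledger)
}

func main() {
	base := query{table: "accounts"}.where("accounts.address = ?", "world")

	single := store{ledger: "l1", singleLedger: true}
	multi := store{ledger: "l1"}

	fmt.Println(single.applyLedgerFilter(base, "accounts"))
	// SELECT * FROM accounts WHERE accounts.address = ?
	fmt.Println(multi.applyLedgerFilter(base, "accounts"))
	// SELECT * FROM accounts WHERE accounts.address = ? AND accounts.ledger = ?
}
```

Routing every call site through one helper keeps the decision in a single place, which is also why a wrong cached flag affects every query at once.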

114-160: Expand method refactoring is consistent and functional tests already validate this code.

The ledger filtering is correctly applied to both the moves (line 133) and accounts_volumes (line 148) relations. Functional tests in accounts_test.go already cover the Expand method for both "volumes" and "effectiveVolumes" properties, including PIT and non-PIT scenarios.

coderabbitai · 2025-10-29T17:03:57Z

internal/storage/ledger/store.go

+func (store *Store) UpdateSingleLedgerState(ctx context.Context, countFunc func(ctx context.Context, bucketName string) (int, error)) error {
+	count, err := countFunc(ctx, store.ledger.Bucket)
+	if err != nil {
+		return fmt.Errorf("failed to count ledgers in bucket: %w", err)
+	}
+
+	store.singleLedgerCache.mu.Lock()
+	defer store.singleLedgerCache.mu.Unlock()
+	store.singleLedgerCache.enabled = (count == 1)


⚠️ Potential issue | 🔴 Critical

Prevent stale single-ledger cache from leaking writes across ledgers.

Once UpdateSingleLedgerState flips enabled to true, this store stops adding ledger = ?. If another ledger is later added to the same bucket, any long-lived store that already cached enabled=true never revalidates and will continue skipping the filter forever. At that point operations such as UpdateTransactionMetadata, DeleteTransactionMetadata, etc. start touching rows belonging to the newly added ledger—data corruption. Please add an invalidation strategy (e.g., bucket-scoped shared state, versioning, or periodic/conditional re-count) so every store reliably disables the optimization as soon as the bucket stops being single-ledger.
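
One of the suggested strategies, bucket-scoped shared state, could look roughly like the Go sketch below. All types and names here (`bucketState`, `registry`, `skipLedgerFilter`) are hypothetical, and the sketch only covers stores living in the same process; several processes sharing a bucket would still need a database-backed version or a re-count.

```go
package main

import (
	"fmt"
	"sync"
)

// bucketState is shared by every store opened on the same bucket (hypothetical type).
type bucketState struct {
	mu      sync.RWMutex
	ledgers int
}

// setLedgerCount records the latest ledger count for the bucket.
func (b *bucketState) setLedgerCount(n int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.ledgers = n
}

// isSingleLedger reports whether the bucket currently holds exactly one ledger.
func (b *bucketState) isSingleLedger() bool {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.ledgers == 1
}

// registry hands out exactly one bucketState per bucket name.
type registry struct {
	mu      sync.Mutex
	buckets map[string]*bucketState
}

func newRegistry() *registry {
	return &registry{buckets: make(map[string]*bucketState)}
}

// forBucket returns the shared state for a bucket, creating it on first use.
func (r *registry) forBucket(name string) *bucketState {
	r.mu.Lock()
	defer r.mu.Unlock()
	st, ok := r.buckets[name]
	if !ok {
		st = &bucketState{}
		r.buckets[name] = st
	}
	return st
}

// store keeps a pointer to the shared state instead of a private boolean.
type store struct {
	ledger string
	state  *bucketState
}

func (s *store) skipLedgerFilter() bool { return s.state.isSingleLedger() }

func main() {
	reg := newRegistry()

	first := &store{ledger: "ledger-a", state: reg.forBucket("bucket-1")}
	first.state.setLedgerCount(1)
	fmt.Println(first.skipLedgerFilter()) // true

	// A second ledger is created in the same bucket and the count is refreshed.
	second := &store{ledger: "ledger-b", state: reg.forBucket("bucket-1")}
	second.state.setLedgerCount(2)
	fmt.Println(first.skipLedgerFilter()) // false: the long-lived store sees the change
}
```

Because both stores read the same `bucketState`, refreshing the count when the second ledger is created disables the optimization for the first store as well.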

flemzord requested a review from a team as a code owner October 29, 2025 16:55

coderabbitai bot reviewed Oct 29, 2025
