-
Notifications
You must be signed in to change notification settings - Fork 130
fix: query oprimization with where #1092
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Optimize query performance for single-ledger buckets by conditionally
skipping the WHERE ledger = ? clause when a bucket contains only one
ledger. This reduces unnecessary filtering and can provide 5-15%
performance improvement in single-ledger deployments.
Implementation:
- Add singleLedgerOptimization cache to ledger Store
- Add CountLedgersInBucket to system store
- Detect single-ledger state on CreateLedger and OpenLedger
- Refactor all query builders to use conditional filtering
Changes:
- internal/storage/ledger/store.go: Add cache and helper methods
- internal/storage/system/store.go: Add CountLedgersInBucket
- internal/storage/driver/driver.go: Detect single-ledger state
- internal/storage/ledger/resource_*.go: Apply conditional filtering
- internal/storage/ledger/{accounts,logs,transactions}.go: Apply conditional filtering
WalkthroughThe pull request introduces a single-ledger optimization mechanism. Changes refactor ledger filtering across multiple query builders to conditionally apply filters based on cached ledger state, add infrastructure to track and update that state, and integrate trigger calls in driver initialization. Changes
Sequence Diagram(s)sequenceDiagram
participant Driver as driver.go
participant Store as ledger/store.go
participant System as system/store.go
participant DB as Database
Driver->>Store: CreateLedger/OpenLedger
Driver->>Store: UpdateSingleLedgerState(callback)
Store->>System: CountLedgersInBucket(ctx, bucket)
System->>DB: SELECT COUNT(*) FROM ledger WHERE bucket = ?
DB-->>System: count
System-->>Store: count or error (debug log)
Store->>Store: Update singleLedgerCache.enabled
Note over Store: Future queries check cache
Driver->>Store: Query (e.g., ReadLog)
Store->>Store: isSingleLedger() check
alt Single ledger optimized
Store->>DB: Query without ledger filter
else Multi-ledger
Store->>DB: Query with ledger filter (applyLedgerFilter)
end
DB-->>Store: result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (11)
internal/storage/driver/driver.go(2 hunks)internal/storage/ledger/accounts.go(1 hunks)internal/storage/ledger/logs.go(1 hunks)internal/storage/ledger/resource_accounts.go(6 hunks)internal/storage/ledger/resource_aggregated_balances.go(4 hunks)internal/storage/ledger/resource_logs.go(1 hunks)internal/storage/ledger/resource_transactions.go(3 hunks)internal/storage/ledger/resource_volumes.go(4 hunks)internal/storage/ledger/store.go(4 hunks)internal/storage/ledger/transactions.go(4 hunks)internal/storage/system/store.go(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-04-29T11:24:28.923Z
Learnt from: gfyrag
PR: formancehq/ledger#892
File: internal/controller/ledger/controller_default.go:196-196
Timestamp: 2025-04-29T11:24:28.923Z
Learning: In the ledger Import function, it's critical to maintain proper log ID tracking by updating lastLogID with the current log.ID after each processed log, rather than setting it to nil. This ensures the system can properly validate the ordering of logs and prevent duplicate or out-of-order processing, which is essential for maintaining data integrity in the ledger.
Applied to files:
internal/storage/ledger/logs.go
🧬 Code graph analysis (2)
internal/storage/system/store.go (1)
internal/ledger.go (1)
Ledger(18-26)
internal/storage/ledger/store.go (2)
internal/ledger.go (1)
Ledger(18-26)internal/storage/system/store.go (1)
Store(20-32)
🔇 Additional comments (9)
internal/storage/system/store.go (1)
65-74: Count query looks solid.Thanks for adding the bucket-level count with proper error wrapping; this will plug neatly into the optimization hook.
internal/storage/ledger/logs.go (1)
121-129: Centralized ledger scoping looks good.
applyLedgerFilterkeeps the idempotency lookup aligned with the single-ledger optimization while preserving the existing limit/idempotency guard. No issues spotted.internal/storage/ledger/accounts.go (1)
89-99: Update path still respects ledger isolation.Deferring the ledger predicate through
getLedgerFilterSQLkeeps multi-ledger buckets safe while letting single-ledger setups bypass the extra WHERE clause. Looks solid.internal/storage/ledger/resource_aggregated_balances.go (1)
25-104: Ledger filter helper applied consistently.Every branch (PIT, metadata, partial address) now routes through
applyLedgerFilter, so the dataset stays ledger-scoped without redundant predicates. Implementation looks correct.internal/storage/ledger/resource_transactions.go (1)
33-149: Transactions dataset keeps proper ledger scoping.The helper-based filtering covers the base query, metadata history, and the effective volume expansion, matching the optimization intent with no functional regressions observed.
internal/storage/ledger/resource_volumes.go (1)
37-125: Volume queries now share the optimized ledger filter.Both real-time and history code paths pick up
applyLedgerFilter, keeping joins in sync and avoiding redundant WHERE clauses. Looks good to ship.internal/storage/ledger/resource_accounts.go (3)
60-108: Balance resolution refactoring is correct.The balance resolution logic correctly applies ledger filtering via
applyLedgerFilterin both the PIT (line 80, moves) and non-PIT (line 85, accounts_volumes) branches. The removal of the explicit ledger filter from line 69 is appropriate since filtering is now handled consistently through the centralized helper.
28-58: Ledger filter refactoring is correct and consistently applied.The
applyLedgerFilterimplementation correctly handles both scenarios:
- Single-ledger deployments: returns query unmodified (optimization)
- Multi-ledger deployments: applies
WHERE tableAlias.ledger = ?filterThe refactoring in
resource_accounts.gofollows the same pattern used consistently across 8 files (transactions, logs, volumes, aggregated_balances, etc.) with 24+ invocations, all using appropriate table aliases.
114-160: Expand method refactoring is consistent and functional tests already validate this code.The ledger filtering is correctly applied to both the moves (line 133) and accounts_volumes (line 148) relations. Functional tests in accounts_test.go already cover the Expand method for both "volumes" and "effectiveVolumes" properties, including PIT and non-PIT scenarios.
| func (store *Store) UpdateSingleLedgerState(ctx context.Context, countFunc func(ctx context.Context, bucketName string) (int, error)) error { | ||
| count, err := countFunc(ctx, store.ledger.Bucket) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to count ledgers in bucket: %w", err) | ||
| } | ||
|
|
||
| store.singleLedgerCache.mu.Lock() | ||
| defer store.singleLedgerCache.mu.Unlock() | ||
| store.singleLedgerCache.enabled = (count == 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prevent stale single-ledger cache from leaking writes across ledgers.
Once UpdateSingleLedgerState flips enabled to true, this store stops adding ledger = ?. If another ledger is later added to the same bucket, any long-lived store that already cached enabled=true never revalidates and will continue skipping the filter forever. At that point operations such as UpdateTransactionMetadata, DeleteTransactionMetadata, etc. start touching rows belonging to the newly added ledger—data corruption. Please add an invalidation strategy (e.g., bucket-scoped shared state, versioning, or periodic/conditional re-count) so every store reliably disables the optimization as soon as the bucket stops being single-ledger.
Optimize query performance for single-ledger buckets by conditionally skipping the WHERE ledger = ? clause when a bucket contains only one ledger. This reduces unnecessary filtering and can provide 5-15% performance improvement in single-ledger deployments.