
Optimize subtree validation for high-throughput block processing #378

Draft
freemans13 wants to merge 18 commits into bsv-blockchain:main from freemans13:stu/subtree-optimizations

freemans13 (Collaborator) commented Jan 12, 2026

Summary

This PR introduces comprehensive performance optimizations to the subtree validation and transaction validation pipeline, focusing on:

  • Reducing I/O latency through parallelization (especially for NFS-backed storage)
  • Enabling parallel execution of validation and storage phases
  • Reducing CPU and memory allocations in hot paths
  • Decoupling CPU-bound validation from I/O-bound storage operations

These changes are critical for high-throughput block processing scenarios where blocks contain thousands of subtrees and millions of transactions.


Problem Statement

The current implementation has several performance bottlenecks:

  1. Sequential I/O operations: Subtree existence checks performed sequentially, causing cumulative latency on NFS
  2. Redundant storage calls: Multiple existence checks for the same subtree
  3. Tight coupling: Script validation (CPU) and UTXO operations (I/O) cannot overlap
  4. Excessive allocations: Validator options recreated in every loop iteration
  5. UTXO lookups for in-block parents: Level N validation requires UTXO store lookups even when parents from Level N-1 are still being processed

Key Changes

1. Parallelized Storage Existence Checks

Location: services/subtreevalidation/check_block_subtrees.go

Before: Sequential I/O calls

for _, subtreeHash := range block.Subtrees {
    exists, err := u.subtreeStore.Exists(ctx, subtreeHash[:], fileformat.FileTypeSubtree)
    // Each call blocks the next - cumulative latency
}

After: Parallel I/O with preserved ordering

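subtreeExistsFlags := make([]bool, len(block.Subtrees))

existsGroup, existsCtx := errgroup.WithContext(ctx)
util.SafeSetLimit(existsGroup, u.settings.SubtreeValidation.CheckBlockSubtreesConcurrency)

// Capturing idx and subtreeHash in the closure relies on Go 1.22+ per-iteration loop variables
for idx, subtreeHash := range block.Subtrees {
    existsGroup.Go(func() error {
        subtreeExists, err := u.subtreeStore.Exists(existsCtx, subtreeHash[:], fileformat.FileTypeSubtree)
        if err != nil {
            return err
        }
        subtreeExistsFlags[idx] = subtreeExists // Each goroutine writes only its own index, preserving order
        return nil
    })
}
if err := existsGroup.Wait(); err != nil {
    return err // Propagate the first store error instead of discarding it
}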

Impact: ~100x faster for 10,000 subtrees on NFS (100s → 1s)
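To make the ordering guarantee concrete, here is a minimal standard-library sketch of the same idea. It swaps errgroup for a `sync.WaitGroup` plus a semaphore channel, drops the context parameter for brevity, and, unlike errgroup, does not cancel outstanding checks after the first error. `checkAllExist` and `existsFunc` are hypothetical names, not part of this PR.

```go
package main

import "sync"

// existsFunc is a hypothetical stand-in for a blob-store existence check.
type existsFunc func(key string) (bool, error)

// checkAllExist runs exists for every key with at most limit calls in flight.
// Results are written by index, so flags[i] always corresponds to keys[i].
func checkAllExist(keys []string, limit int, exists existsFunc) ([]bool, error) {
	if limit < 1 {
		limit = 1
	}
	flags := make([]bool, len(keys)) // each goroutine writes only its own slot
	sem := make(chan struct{}, limit)

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for i, key := range keys {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit checks are already running
		go func(i int, key string) {
			defer wg.Done()
			defer func() { <-sem }()
			ok, err := exists(key)
			if err != nil {
				once.Do(func() { firstErr = err }) // keep only the first error
				return
			}
			flags[i] = ok
		}(i, key)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return flags, nil
}
```

Writing into a pre-sized slice by index needs no lock because no two goroutines touch the same element, which is what keeps the result aligned with the input order.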

2. Pre-Check File Type Existence

Location: services/subtreevalidation/check_block_subtrees.go

Checks both FileTypeSubtreeToCheck and FileTypeSubtreeData in parallel upfront:

type subtreeFileExistence struct {
    toCheckExists bool
    dataExists    bool
}
existenceMap := make(map[int]*subtreeFileExistence)
var existenceMu sync.Mutex // Guards existenceMap: Go maps are not safe for concurrent writes

// Parallel pre-check of both file types (one goroutine per subtree)
existenceGroup.Go(func() error {
    toCheckExists, err := u.subtreeStore.Exists(existenceCtx, subtreeHash[:], fileformat.FileTypeSubtreeToCheck)
    if err != nil {
        return err
    }
    dataExists, err := u.subtreeStore.Exists(existenceCtx, subtreeHash[:], fileformat.FileTypeSubtreeData)
    if err != nil {
        return err
    }
    existenceMu.Lock()
    existenceMap[idx] = &subtreeFileExistence{toCheckExists, dataExists}
    existenceMu.Unlock()
    return nil
})

Impact: Reduces I/O operations by ~50% per subtree
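One way to collect both flags per subtree without sharing a map between goroutines is an index-addressed slice. The sketch below is only an illustration under that assumption: `fileKind`, `kindToCheck`, `kindData`, and `preCheck` are hypothetical stand-ins, and concurrency is left unbounded to keep it short.

```go
package main

import "sync"

// fileKind and its constants are hypothetical stand-ins for the two file
// types that are probed up front.
type fileKind int

const (
	kindToCheck fileKind = iota
	kindData
)

type fileExistence struct {
	toCheckExists bool
	dataExists    bool
}

// preCheck probes both file kinds for every hash, one goroutine per hash.
// Results and errors are stored by index, so no lock is required.
func preCheck(hashes []string, exists func(hash string, kind fileKind) (bool, error)) ([]fileExistence, error) {
	results := make([]fileExistence, len(hashes))
	errs := make([]error, len(hashes))

	var wg sync.WaitGroup
	for i, h := range hashes {
		wg.Add(1)
		go func(i int, h string) {
			defer wg.Done()
			toCheck, err := exists(h, kindToCheck)
			if err != nil {
				errs[i] = err
				return
			}
			data, err := exists(h, kindData)
			if err != nil {
				errs[i] = err
				return
			}
			results[i] = fileExistence{toCheckExists: toCheck, dataExists: data}
		}(i, h)
	}
	wg.Wait()

	// Report the error of the lowest-indexed failing hash, if any.
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}
```

Later stages can then read `results[i]` instead of calling the store again for the same subtree.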

3. Validation Pipeline Decoupling

Location: services/validator/Validator.go, services/validator/options.go

Added new options to separate CPU-bound and I/O-bound operations:

type Options struct {
    // Skip UTXO spending during validation phase (CPU-only validation)
    SkipUtxoSpend bool
    
    // Skip script validation during storage phase (I/O-only storage)
    SkipValidation bool
    
    // Pre-fetched metadata for in-block parent transactions
    ParentMetadata map[chainhash.Hash]*ParentTxMetadata
}

Usage Pattern:

// Phase 1: CPU-intensive validation (parallel across Level N transactions)
validationOptions := validator.ProcessOptions(
    validator.WithSkipUtxoCreate(true),
    validator.WithSkipUtxoSpend(true),  // Skip I/O during validation
)

// Phase 2: I/O-intensive storage (parallel, but waits for Level N-1 to complete)
storageOptions := &validator.Options{
    SkipValidation: true,  // Skip CPU during storage
    SkipUtxoCreate: false,
    SkipUtxoSpend:  false,
}

Impact: Enables overlap of validation (Level N) with storage (Level N-1)
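The overlap can be pictured with a small scheduling sketch: validation of level N starts while the storage of level N-1 is still running, and storage of level N only begins once level N-1 is fully stored. `runPipeline` and its two callbacks are hypothetical and stand in for the CPU-bound and I/O-bound phases; this is not the PR's implementation.

```go
package main

import "sync"

// runPipeline validates each level while the previous level is being stored.
// validate and store are hypothetical callbacks for the two phases.
func runPipeline(levels int, validate, store func(level int)) {
	var storing sync.WaitGroup
	for level := 0; level < levels; level++ {
		// CPU-bound phase: runs concurrently with the store of level-1,
		// which was started at the end of the previous iteration.
		validate(level)

		// Level-1 must be fully stored before this level is stored.
		storing.Wait()

		storing.Add(1)
		go func(level int) {
			defer storing.Done()
			store(level) // I/O-bound phase
		}(level)
	}
	storing.Wait() // the last level's storage has nothing to overlap with
}
```

Because validation of a level can run before its parents are stored, the validation phase needs another source for parent data, which is the role of the pre-fetched `ParentMetadata` option.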

4. Parent Transaction Metadata Caching

Location: services/validator/Validator.go

Validator checks metadata map before UTXO store lookup:

// Check if parent metadata is provided (for in-block parents)
if validationOptions.ParentMetadata != nil {
    if parentMeta, found := validationOptions.ParentMetadata[parentTxHash]; found {
        // Use pre-fetched metadata - no UTXO store call needed
        for _, idx := range idxs {
            utxoHeights[idx] = parentMeta.BlockHeight
        }
        return nil
    }
}
// Fall back to UTXO store for external parents

Impact: Eliminates UTXO lookups for in-block parents, enables pipeline overlap
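The lookup order described here (pre-fetched map first, store second) can be sketched in isolation. `txHash`, `parentTxMetadata`, and `parentHeight` are simplified, hypothetical stand-ins; the fallback callback represents the UTXO store lookup.

```go
package main

// txHash and parentTxMetadata are simplified stand-ins for the real types.
type txHash [32]byte

type parentTxMetadata struct {
	BlockHeight uint32
}

// parentHeight returns the block height of a parent transaction, preferring
// pre-fetched in-block metadata and falling back to the slower lookup.
func parentHeight(meta map[txHash]*parentTxMetadata, parent txHash, fallback func(txHash) (uint32, error)) (uint32, error) {
	// Indexing a nil map is safe in Go and simply reports "not found",
	// so no separate nil check on the map is needed here.
	if m, ok := meta[parent]; ok && m != nil {
		return m.BlockHeight, nil
	}
	return fallback(parent)
}
```

Only parents that are not part of the current block reach the fallback, so the number of store round trips scales with external parents rather than with all inputs.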

5. Reduced Allocations in Hot Paths

Location: services/subtreevalidation/check_block_subtrees.go

Before: Options created in every loop iteration

for batchStart := 0; batchStart < totalSubtrees; batchStart += subtreesBatchSize {
    // Recreated every batch (100+ times for large blocks)
    validatorOptions := []validator.Option{
        validator.WithSkipPolicyChecks(true),
        validator.WithCreateConflicting(true),
        validator.WithIgnoreLocked(true),
    }
    processedOpts := validator.ProcessOptions(validatorOptions...)
}

for level := 0; level <= maxLevel; level++ {
    // Recreated every level (3-10 times per batch)
    processedValidationOptions := validator.ProcessOptions(validationOnlyOptions...)
    storageOptions := &validator.Options{...}
    validationWorkers := u.settings.SubtreeValidation.CheckBlockSubtreesConcurrency
    storageWorkers := u.settings.SubtreeValidation.SpendBatcherSize * 6
}

After: Created once, reused across iterations

// Created once before batch loop
batchValidatorOptions := []validator.Option{...}
batchProcessedOpts := validator.ProcessOptions(batchValidatorOptions...)

// Created once before level loop
baseValidationOptions := validator.ProcessOptions(validationOnlyOptions...)
storageOptions := &validator.Options{...}
validationWorkers := u.settings.SubtreeValidation.CheckBlockSubtreesConcurrency
storageWorkers := u.settings.SubtreeValidation.SpendBatcherSize * 6

for level := 0; level <= maxLevel; level++ {
    // Only update ParentMetadata per level; level 0 has no in-block parents
    if level > 0 {
        baseValidationOptions.ParentMetadata = buildParentMetadata(txsPerLevel[level-1], blockHeight)
    }
}

Impact:

  • Batch loop: 100+ allocations → 1 allocation
  • Level loop: 3-10 allocations → 1 allocation per type
  • Reduces GC pressure and CPU overhead in hot paths
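The hoisting pattern can be shown with a generic functional-options sketch. Everything here (`options`, `option`, `processOptions`, `perLevelOptions`) is hypothetical and only mirrors the shape of the validator options; it does not reproduce the project's API.

```go
package main

// options and option are hypothetical stand-ins for a functional-options API.
type options struct {
	skipPolicyChecks bool
	ignoreLocked     bool
	parentHeights    map[string]uint32
}

type option func(*options)

func withSkipPolicyChecks(v bool) option { return func(o *options) { o.skipPolicyChecks = v } }
func withIgnoreLocked(v bool) option     { return func(o *options) { o.ignoreLocked = v } }

// processOptions allocates a new options value and applies each option to it.
func processOptions(opts ...option) *options {
	o := &options{}
	for _, apply := range opts {
		apply(o)
	}
	return o
}

// perLevelOptions builds the options once and swaps only the per-level field.
// visit receives the same *options pointer on every level.
func perLevelOptions(levels []map[string]uint32, visit func(level int, o *options)) {
	base := processOptions(withSkipPolicyChecks(true), withIgnoreLocked(true))
	for level := range levels {
		if level > 0 {
			base.parentHeights = levels[level-1] // parents live in the previous level
		}
		visit(level, base)
	}
}
```

One design caveat: reusing a single mutable struct is only safe if no goroutine from an earlier level still reads it when the per-level field is reassigned; otherwise a shallow copy per level is the safer trade-off.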

6. Renamed Options for Clarity

Location: services/validator/options.go

  • SkipUtxoCreation → SkipUtxoCreate (consistency with SkipUtxoSpend)
  • SkipScriptValidation → SkipValidation (broader scope, includes signatures)

Performance Impact

Parallelization Gains

Example: 10,000 subtrees on NFS (10ms latency per call, concurrency=100)

  • Before: 10,000 sequential calls = ~100 seconds
  • After: 10,000 parallel calls = ~1 second
  • Improvement: ~100x faster
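The estimate follows from a simple wave model: with a fixed per-call latency, the calls complete in ceil(n / concurrency) waves. The helper below is a hypothetical back-of-the-envelope calculator that ignores scheduling overhead and latency variance.

```go
package main

// estimateMillis approximates the wall-clock time, in milliseconds, for n
// calls of perCallMs each when at most concurrency calls run at once.
func estimateMillis(n, concurrency, perCallMs int) int {
	if n <= 0 {
		return 0
	}
	if concurrency < 1 {
		concurrency = 1
	}
	waves := (n + concurrency - 1) / concurrency // ceiling division
	return waves * perCallMs
}
```

With the figures above, `estimateMillis(10000, 1, 10)` gives 100000 ms and `estimateMillis(10000, 100, 10)` gives 1000 ms.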

Pipeline Overlap

Example: 3 dependency levels, each taking 10s validation + 5s storage

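  • Before: Sequential execution = (10+5) × 3 = 45 seconds
  • After: Overlapped execution = 10 + 10 + 10 + 5 = 35 seconds (each level's 5s storage runs during the next level's 10s validation; only the final level's storage is not hidden)
  • Improvement: ~1.3x faster, i.e. ~22% less wall-clock time (increases with more levels)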
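The 35-second figure follows from a two-stage pipeline model: the first validation and the last storage cannot be hidden, and every step in between costs the slower of the two phases. The helpers below are a hypothetical sketch of that model, assuming every level takes the same time.

```go
package main

// sequentialSeconds is the total time when each level validates, then stores.
func sequentialSeconds(levels, validate, store int) int {
	if levels <= 0 {
		return 0
	}
	return levels * (validate + store)
}

// overlappedSeconds is the total time when the storage of level N-1 runs
// concurrently with the validation of level N.
func overlappedSeconds(levels, validate, store int) int {
	if levels <= 0 {
		return 0
	}
	slower := validate
	if store > slower {
		slower = store
	}
	// first validation + (levels-1) overlapped steps + final storage
	return validate + (levels-1)*slower + store
}
```

For 3 levels at 10s validation and 5s storage this gives 45 versus 35 seconds; for 10 levels it gives 150 versus 105 seconds, which is why the benefit grows with the number of levels.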

Allocation Reduction

  • Batch processing: 100+ option allocations → 1
  • Level processing: 3-10 allocations per type → 1 per type
  • Reduces GC overhead and CPU usage in hot paths

Backward Compatibility

Fully backward compatible:

  • New validation options default to existing behavior
  • ParentMetadata is optional (nil-checked before use)
  • Option renames maintain same functionality
  • All existing code paths work unchanged

Testing Status

  • ✅ All existing tests pass
  • ✅ Build passes successfully
  • ✅ Linter passes with no issues
  • ✅ Updated test signatures to match new method signatures

Files Changed

Core Changes

  • services/subtreevalidation/check_block_subtrees.go - Parallelization, pre-checks, allocation reduction
  • services/validator/Validator.go - Pipeline decoupling, metadata caching
  • services/validator/options.go - New options for validation phases
  • services/validator/Client.go - Updated method signatures

Supporting Changes

  • services/propagation/Server.go - Option rename
  • services/legacy/netsync/handle_block.go - Option rename
  • docs/references/kafkaMessageFormat.md - Documentation updates
  • util/kafka/kafka_message/kafka_messages.proto - Protocol updates

Test Updates

  • services/validator/*_test.go - Updated for new signatures
  • services/subtreevalidation/check_block_subtrees_test.go - Updated tests

🤖 Generated with Claude Code

This commit introduces significant performance optimizations to the subtree validation pipeline, focusing on reducing I/O latency (especially on NFS) and enabling parallel validation phases.

## Key Optimizations

### 1. Parallelized Storage Existence Checks
- Replaced sequential subtree existence checks with parallel errgroup execution
- Critical for NFS-backed storage where each check incurs network latency
- Preserves original subtree order using boolean flag arrays (required for transaction dependencies)

### 2. Pre-Checking File Type Existence
- Added upfront parallel checks for both FileTypeSubtreeToCheck and FileTypeSubtreeData
- Eliminates redundant storage calls during validation pipeline
- Reduces total I/O operations by ~50% per subtree

### 3. Validator Pipeline Decoupling
- Added SkipUtxoStoreSpending option for CPU-only validation mode
- Added SkipScriptValidation option for I/O-only storage mode
- Enables parallel execution of validation (CPU-bound) and storage (I/O-bound) phases

### 4. Parent Transaction Metadata Caching
- New ParentMetadata option in validator allows pre-fetching parent tx metadata
- Enables validation of Level N transactions while Level N-1 is still storing
- Eliminates UTXO store lookups for in-block parent transactions

### 5. Coinbase Transaction Nil-Safety Fix
- Changed Block.go to always initialize coinbase tx pointer
- Prevents nil pointer dereferences during validation of blocks with empty coinbase

## Performance Impact

These optimizations significantly improve block validation throughput by:
- Reducing I/O latency on NFS-backed blob stores
- Enabling concurrent execution of previously sequential operations
- Eliminating redundant storage operations
- Allowing validation and storage phases to overlap

Particularly impactful for blocks with large subtree counts where storage latency dominates processing time.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>
@freemans13 freemans13 self-assigned this Jan 12, 2026
github-actions bot (Contributor) commented Jan 12, 2026

🤖 Claude Code Review

Status: Complete

Current Review:
Two issues found requiring attention:

  1. CRITICAL: Early return bug on line 806 - Uses return nil instead of continue when a transaction is already validated. This causes the entire validation loop to exit prematurely, preventing subsequent transactions in that level from being validated.

  2. Duplicate error check on line 912 - Line 907 checks errors.Is(storeErr, errors.ErrTxExists) and returns early, making the same check on line 912 redundant dead code. Line 912 should only check errors.ErrTxConflicting.
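The difference between the two statements in the first finding can be illustrated with a minimal loop. This sketch is not taken from the PR: `validatePending` and its parameters are hypothetical, and it assumes the skip happens inside a loop over the transactions of one level.

```go
package main

// validatePending validates every transaction that is not already validated.
func validatePending(txs []string, alreadyValidated map[string]bool, validate func(tx string) error) error {
	for _, tx := range txs {
		if alreadyValidated[tx] {
			// Skip this transaction only. A `return nil` here would end the
			// whole loop and silently leave the remaining txs unvalidated.
			continue
		}
		if err := validate(tx); err != nil {
			return err
		}
	}
	return nil
}
```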

Minor observations:

  • Hardcoded storage worker multiplier (line 871) - The 6x multiplier is empirically chosen but could benefit from being configurable.

History:

  • ✅ Fixed: Storage phase now properly passes validator options
  • ✅ Resolved: .orig file removed from PR
  • ✅ Fixed: Error variable scope corrected

newStorageGroup.Go(func() error {
    // Store: Direct UTXO operations (Spend + Create)
    // Spend parent UTXOs
    _, spendErr := u.utxoStore.Spend(sCtx, tx, blockHeight, utxo.IgnoreFlags{
github-actions bot (Contributor) commented Jan 12, 2026

✅ Fixed - storage phase now properly passes validator options (lines 883-892) including CreateConflicting, IgnoreConflicting, and IgnoreLocked.

newStorageGroup, sCtx := errgroup.WithContext(ctx)
// I/O-bound: Use higher multiplier for network latency tolerance
// SpendBatcherSize controls batch size; multiply by 6 for I/O concurrency
storageWorkers := u.settings.SubtreeValidation.SpendBatcherSize * 6

The 6x multiplier for storage workers appears to be an empirically chosen value. Consider making this configurable in settings to allow operators to tune based on their specific Aerospike cluster and network characteristics.

For example: settings.SubtreeValidation.StorageWorkerMultiplier (default 6) could provide flexibility for different deployment scenarios.
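A configurable multiplier with the current value as its default might look like the sketch below. The function and constant names are hypothetical; only the default of 6 and the `SpendBatcherSize` factor come from the code under review.

```go
package main

// defaultStorageWorkerMultiplier mirrors the value currently hardcoded.
const defaultStorageWorkerMultiplier = 6

// storageWorkerCount derives the storage concurrency from the batch size.
// A non-positive multiplier means "not configured" and falls back to the default.
func storageWorkerCount(spendBatcherSize, multiplier int) int {
	if multiplier <= 0 {
		multiplier = defaultStorageWorkerMultiplier
	}
	return spendBatcherSize * multiplier
}
```

Treating zero as "unset" keeps existing configurations working without changes, at the cost of not being able to configure a multiplier of zero.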

@@ -0,0 +1,980 @@
/*
github-actions bot (Contributor) commented Jan 12, 2026

✅ Resolved - The .orig file has been removed from the PR.

_, storeErr := u.validatorClient.ValidateWithOptions(sCtx, tx, blockHeight, storageOptions)
if storeErr != nil {
    // TX_EXISTS is not an error - transaction was already validated
    if errors.Is(err, errors.ErrTxExists) {
github-actions bot (Contributor) commented Jan 13, 2026

✅ Fixed - Line 907 correctly checks errors.Is(storeErr, errors.ErrTxExists) now.

        return nil
    }

    if errors.Is(storeErr, errors.ErrTxExists) || errors.Is(storeErr, errors.ErrTxConflicting) {
github-actions bot (Contributor) commented Jan 13, 2026

Bug still present: Line 907 checks errors.Is(storeErr, errors.ErrTxExists) and returns. Line 912 checks it again (redundant dead code). Line 912 should only check ErrTxConflicting, not ErrTxExists.
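One way to avoid the redundant branch is to classify the error exactly once, so each sentinel is tested in a single place. The sketch uses the standard library `errors` package and locally defined sentinel values; `errTxExists`, `errTxConflicting`, `errStoreUnavailable`, and `classifyStoreErr` are hypothetical stand-ins for the project's own error values and handling.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the project's error values.
var (
	errTxExists         = errors.New("tx exists")
	errTxConflicting    = errors.New("tx conflicting")
	errStoreUnavailable = errors.New("store unavailable")
)

type storeOutcome int

const (
	outcomeStored storeOutcome = iota
	outcomeAlreadyStored
	outcomeConflicting
	outcomeFailed
)

// classifyStoreErr tests each sentinel exactly once, in priority order.
func classifyStoreErr(err error) storeOutcome {
	switch {
	case err == nil:
		return outcomeStored
	case errors.Is(err, errTxExists):
		return outcomeAlreadyStored // not a failure: the tx was stored earlier
	case errors.Is(err, errTxConflicting):
		return outcomeConflicting
	default:
		return outcomeFailed
	}
}

// wrapStoreErr adds context while keeping the sentinel visible to errors.Is.
func wrapStoreErr(err error) error {
	return fmt.Errorf("store tx: %w", err)
}
```

Because `errors.Is` unwraps `%w` chains, the classification still works when the store wraps its errors with extra context.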


@freemans13 freemans13 marked this pull request as draft January 23, 2026 17:16