This guide covers performance optimization strategies for Nebula, including caching, connection pooling, load balancing, and resource tuning.
- Caching Strategies
- Connection Pooling
- Load Balancing
- Memory Optimization
- CPU Optimization
- Network Optimization
- Database Performance
- Monitoring and Profiling
- Benchmarking
Use the built-in LRU cache for frequently accessed data:
use nebula_production::{CacheConfig, InMemoryCache, EvictionPolicy};
let config = CacheConfig::new()
.with_max_size(10_000) // Maximum entries
.with_ttl_seconds(3600) // 1 hour default TTL
.with_eviction_policy(EvictionPolicy::Lru);
let cache = InMemoryCache::new(config);
// Store data
cache.set("user:123", user_data, Some(1800))?; // 30 minute TTL
// Retrieve data
if let Some(user) = cache.get("user:123")? {
// Use cached data
}

| Parameter | Description | Recommended Value |
|---|---|---|
| `max_size` | Maximum cache entries | 10,000 - 100,000 |
| `ttl_seconds` | Default time-to-live | 300 - 3600 |
| `eviction_policy` | LRU, LFU, or TTL | LRU for most cases |
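To make the interaction between the maximum size and the LRU policy concrete, here is a minimal standard-library sketch of least-recently-used eviction. `TinyLru` and its methods are hypothetical names used only for illustration; they are not part of `nebula_production`, and the real `InMemoryCache` may work differently.

```rust
use std::collections::HashMap;

/// Illustrative LRU cache (hypothetical type, not the Nebula implementation).
/// Eviction scans every entry, which is O(n) but keeps the sketch short.
pub struct TinyLru {
    max_size: usize,
    tick: u64,
    entries: HashMap<String, (String, u64)>, // key -> (value, last-used tick)
}

impl TinyLru {
    pub fn new(max_size: usize) -> Self {
        Self { max_size, tick: 0, entries: HashMap::new() }
    }

    /// Returns the value and marks the entry as most recently used.
    pub fn get(&mut self, key: &str) -> Option<&str> {
        self.tick += 1;
        let entry = self.entries.get_mut(key)?;
        entry.1 = self.tick;
        Some(entry.0.as_str())
    }

    /// Inserts a value, evicting the least recently used entry when full.
    pub fn set(&mut self, key: &str, value: &str) {
        self.tick += 1;
        if !self.entries.contains_key(key) && self.entries.len() >= self.max_size {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, used))| *used)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
        self.entries
            .insert(key.to_string(), (value.to_string(), self.tick));
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a capacity of two, reading `a` before inserting a third key means `b` is the entry that gets evicted. A production cache would typically track recency with a linked list or similar structure to avoid the linear scan.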
- Time-based (TTL): Automatically expire entries after a set time
- Write-through: Update cache when data is written
- Write-behind: Async cache updates for better write performance
- Invalidation events: Explicitly invalidate on data changes
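The TTL and explicit-invalidation strategies above can be sketched with the standard library alone. `TtlCache` is a hypothetical type, not the Nebula API; the caller passes the current time explicitly so that expiry is easy to reason about and test. Write-behind is not shown because it needs a background task.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Illustrative TTL cache (hypothetical; not the Nebula implementation).
pub struct TtlCache {
    default_ttl: Duration,
    entries: HashMap<String, (String, Instant)>, // key -> (value, expires_at)
}

impl TtlCache {
    pub fn new(default_ttl: Duration) -> Self {
        Self { default_ttl, entries: HashMap::new() }
    }

    /// Write-through style: call this in the same code path that writes
    /// the backing store, so the cache never lags behind a write.
    pub fn set(&mut self, key: &str, value: &str, ttl: Option<Duration>, now: Instant) {
        let expires_at = now + ttl.unwrap_or(self.default_ttl);
        self.entries
            .insert(key.to_string(), (value.to_string(), expires_at));
    }

    /// Returns the value if present and unexpired; expired entries are
    /// removed lazily on access.
    pub fn get(&mut self, key: &str, now: Instant) -> Option<String> {
        let expired = match self.entries.get(key) {
            Some((_, expires_at)) => now >= *expires_at,
            None => return None,
        };
        if expired {
            self.entries.remove(key);
            return None;
        }
        self.entries.get(key).map(|(value, _)| value.clone())
    }

    /// Explicit invalidation, for example in response to a data-change event.
    /// Returns true if an entry was removed.
    pub fn invalidate(&mut self, key: &str) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

An entry stored with a 30 minute TTL is served up to, but not including, the 30 minute mark, and `invalidate` removes an entry immediately regardless of its remaining TTL.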
For distributed caching across multiple instances:
# Enable Redis caching
NEBULA_CACHE_ENABLED=true
NEBULA_REDIS_URL=redis://localhost:6379
NEBULA_CACHE_TTL_SECS=3600

Pre-populate cache for predictable load patterns:
async fn warm_cache(cache: &InMemoryCache) -> Result<(), CacheError> {
// Load frequently accessed data
let popular_users = fetch_popular_users().await?;
for user in popular_users {
cache.set(format!("user:{}", user.id), user, Some(3600))?;
}
Ok(())
}

Configure connection pooling for optimal database performance:
use nebula_production::{PoolConfig, ConnectionPool};
let config = PoolConfig::new()
.with_min_connections(5) // Minimum idle connections
.with_max_connections(20) // Maximum connections
.with_timeout_seconds(30) // Connection timeout
.with_idle_timeout_seconds(300) // Close idle after 5 min
.with_health_check_interval(60); // Health check every 60s
let pool = ConnectionPool::new(config, factory).await?;

| Workload | Min Connections | Max Connections |
|---|---|---|
| Low traffic | 2 | 10 |
| Medium traffic | 5 | 25 |
| High traffic | 10 | 50 |
| Batch processing | 1 | 100 |
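The table gives starting points. One way to derive a workload-specific number is Little's law: average queries in flight equal arrival rate times average query duration. The sketch below applies that with a headroom factor. The function name and all parameters are assumptions for illustration, and the inputs should come from your own measurements.

```rust
/// Estimates a pool size from Little's law, then clamps it to the
/// configured bounds. Hypothetical helper, not part of Nebula.
///
/// Panics if `min_connections > max_connections`.
pub fn estimate_pool_size(
    queries_per_sec: u64,
    avg_query_ms: u64,
    headroom_percent: u64,
    min_connections: u64,
    max_connections: u64,
) -> u64 {
    // queries/s * ms gives in-flight queries scaled by 1000;
    // (100 + headroom) scales by a further 100.
    let scaled = queries_per_sec * avg_query_ms * (100 + headroom_percent);
    let divisor = 1000 * 100;
    let needed = (scaled + divisor - 1) / divisor; // ceiling division
    needed.clamp(min_connections, max_connections)
}
```

For example, 400 queries per second at 25 ms each keeps about 10 connections busy; with 50% headroom the estimate is 15, which fits the medium-traffic row.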
Reuse HTTP connections for external API calls:
use nebula_production::{PoolConfig, ConnectionFactory};
struct HttpClientFactory;
impl ConnectionFactory for HttpClientFactory {
type Connection = reqwest::Client;
type Error = reqwest::Error;
async fn create(&self) -> Result<Self::Connection, Self::Error> {
reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(30))
.build()
}
async fn is_valid(&self, conn: &Self::Connection) -> bool {
true // reqwest clients are always valid
}
}

use nebula_production::{LoadBalancingStrategy, PerformanceConfig};
let config = PerformanceConfig::new()
.with_load_balancing(LoadBalancingStrategy::LeastConnections);

| Strategy | Use Case |
|---|---|
| RoundRobin | Uniform request distribution |
| LeastConnections | Variable request duration |
| Weighted | Heterogeneous server capacity |
| Random | Simple distribution with some variance |
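To show how two of these strategies differ in practice, here is a small standard-library sketch of round-robin and least-connections selection. `Backend`, `RoundRobin`, and `pick_least_connections` are hypothetical names for illustration and do not reflect how Nebula implements `LoadBalancingStrategy`.

```rust
/// Hypothetical backend record used only for this illustration.
pub struct Backend {
    pub name: &'static str,
    pub active_connections: usize,
}

/// Round-robin: hands out indexes 0, 1, ..., n-1 and then wraps around.
pub struct RoundRobin {
    next: usize,
}

impl RoundRobin {
    pub fn new() -> Self {
        Self { next: 0 }
    }

    pub fn pick(&mut self, backends: &[Backend]) -> Option<usize> {
        if backends.is_empty() {
            return None;
        }
        let index = self.next % backends.len();
        self.next = (index + 1) % backends.len();
        Some(index)
    }
}

/// Least-connections: picks the backend with the fewest active
/// connections; the first backend wins ties.
pub fn pick_least_connections(backends: &[Backend]) -> Option<usize> {
    backends
        .iter()
        .enumerate()
        .min_by_key(|(_, backend)| backend.active_connections)
        .map(|(index, _)| index)
}
```

Round-robin ignores load entirely, which is fine when requests cost roughly the same. Least-connections reacts to slow requests piling up on one backend, which is why it suits variable request durations.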
Scale horizontally by adding more instances:
# Kubernetes HPA configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nebula-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nebula
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Use efficient data structures for large datasets:
use std::borrow::Cow;
use std::sync::Arc;

// Use Box for large types to reduce stack usage
let large_data = Box::new(expensive_struct);
// Use Arc for shared ownership without cloning the underlying data
let shared_data = Arc::new(data);
// Use Cow for copy-on-write semantics
let data: Cow<str> = Cow::Borrowed("initial");

Set memory limits to prevent OOM:
# Kubernetes memory limits
resources:
  limits:
    memory: "512Mi"
  requests:
    memory: "256Mi"

Rust has no GC, but optimize memory allocation:
- Pre-allocate vectors: `Vec::with_capacity(n)`
- Use iterators: Avoid intermediate collections
- Reuse buffers: Use `clear()` instead of reallocating
- Avoid unnecessary cloning: Use references and `Copy` types
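The allocation tips above can be illustrated with two small standard-library functions. The function names and the `user:<id>` key format are assumptions chosen for this example.

```rust
use std::fmt::Write;

/// Formats each id as "user:<id>" into one reused buffer and returns the
/// total number of bytes written. `clear()` keeps the buffer's capacity,
/// so after the first iterations no further allocation is needed.
pub fn total_key_bytes(ids: &[u32], buf: &mut String) -> usize {
    let mut total = 0;
    for id in ids {
        buf.clear();
        write!(buf, "user:{id}").expect("writing to a String cannot fail");
        total += buf.len();
    }
    total
}

/// Squares the even values. The output vector is pre-allocated once and
/// filled straight from an iterator chain, with no intermediate collection.
pub fn even_squares(values: &[u64]) -> Vec<u64> {
    let mut out = Vec::with_capacity(values.len());
    out.extend(values.iter().copied().filter(|v| v % 2 == 0).map(|v| v * v));
    out
}
```

Pre-allocating for `values.len()` can over-reserve when few values pass the filter; that trades a little memory for avoiding repeated growth, which is usually the right call in hot paths.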
Optimize Tokio runtime settings:
use tokio::runtime::Builder;
let runtime = Builder::new_multi_thread()
.worker_threads(num_cpus::get()) // One thread per CPU core
.max_blocking_threads(512) // For blocking operations
.thread_stack_size(2 * 1024 * 1024) // 2MB stack
.enable_all()
.build()?;

Use parallel processing for CPU-intensive tasks:
use rayon::prelude::*;
let results: Vec<_> = data.par_iter()
.map(|item| process_item(item))
.collect();

Enable release optimizations:
[profile.release]
lto = true # Link-time optimization
codegen-units = 1 # Single codegen unit for better optimization
opt-level = 3 # Maximum optimization
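strip = true # Strip debug symbols

Enable connection reuse. HTTP keep-alive is on by default in hyper, which Axum builds on, so the server needs no extra setting; the layer below adds request tracing so that per-request latency can be observed:
// Axum/Tower configuration
use axum::Router;
use tower_http::trace::{DefaultMakeSpan, DefaultOnResponse};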
let app = Router::new()
// ... routes
.layer(
tower_http::trace::TraceLayer::new_for_http()
.make_span_with(DefaultMakeSpan::default())
.on_response(DefaultOnResponse::new()),
);

Enable response compression:
use tower_http::compression::CompressionLayer;
let app = Router::new()
// ... routes
.layer(CompressionLayer::new()
.gzip(true)
.deflate(true)
.br(true));

Set appropriate timeouts:
use std::time::Duration;
let client = reqwest::Client::builder()
.connect_timeout(Duration::from_secs(5))
.timeout(Duration::from_secs(30))
.pool_idle_timeout(Duration::from_secs(90))
.pool_max_idle_per_host(10)
.build()?;

- Use indexes: Add indexes for frequently queried columns
- Select only needed columns: Avoid `SELECT *`
- Use prepared statements: Prevent SQL injection and improve performance
- Batch operations: Group multiple operations together
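As an illustration of combining prepared statements with batching, the sketch below builds a single parameterized multi-row `INSERT`. It assumes PostgreSQL-style `$n` placeholders; the function name, table, and columns are hypothetical, and identifiers must come from trusted code because they are interpolated directly.

```rust
/// Builds one parameterized multi-row INSERT statement, e.g.
/// `INSERT INTO t (a, b) VALUES ($1, $2), ($3, $4)`.
/// Returns None when there is nothing to insert.
///
/// Only the values are parameterized; `table` and `columns` are
/// interpolated as-is and must never come from user input.
pub fn batch_insert_sql(table: &str, columns: &[&str], rows: usize) -> Option<String> {
    if rows == 0 || columns.is_empty() {
        return None;
    }
    let mut sql = format!("INSERT INTO {} ({}) VALUES ", table, columns.join(", "));
    for row in 0..rows {
        if row > 0 {
            sql.push_str(", ");
        }
        // Placeholders are numbered consecutively across all rows.
        let placeholders: Vec<String> = (0..columns.len())
            .map(|col| format!("${}", row * columns.len() + col + 1))
            .collect();
        sql.push('(');
        sql.push_str(&placeholders.join(", "));
        sql.push(')');
    }
    Some(sql)
}
```

Large inputs can be split with `slice::chunks` so that each statement stays under the database's parameter limit, with one statement prepared per chunk size.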
# Environment variables
NEBULA_POOL_MIN_CONNECTIONS=5
NEBULA_POOL_MAX_CONNECTIONS=20
NEBULA_POOL_TIMEOUT_SECS=30
NEBULA_POOL_IDLE_TIMEOUT_SECS=300

For read-heavy workloads, use read replicas:
// Route reads to replicas, writes to primary
let db = if is_read_operation {
replica_pool.get().await?
} else {
primary_pool.get().await?
};

Enable Prometheus metrics:
NEBULA_METRICS_ENABLED=true

Key metrics to monitor:

| Metric | Description | Alert Threshold |
|---|---|---|
| `http_requests_total` | Total HTTP requests | - |
| `http_request_duration_seconds` | Request latency | p99 > 1s |
| `active_connections` | Current connections | > 80% of max |
| `cache_hits_total` | Cache hit count | - |
| `cache_misses_total` | Cache miss count | Miss rate > 20% |
| `pool_available_connections` | Available pool connections | < 20% of max |
| `errors_total` | Error count | Error rate > 1% |
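The ratio-based thresholds in the table are derived from pairs of counters. The sketch below shows one way to evaluate them. `Snapshot` and `firing_alerts` are hypothetical names, the counter values are assumed to be deltas over an evaluation window rather than lifetime totals, and `pool_max_connections` is assumed to come from the pool configuration. In practice these rules would usually live in the alerting system rather than in application code.

```rust
/// Counter deltas over one evaluation window (hypothetical struct).
pub struct Snapshot {
    pub http_requests_total: u64,
    pub errors_total: u64,
    pub cache_hits_total: u64,
    pub cache_misses_total: u64,
    pub pool_available_connections: u64,
    pub pool_max_connections: u64, // assumption: taken from pool config
}

/// Returns a label for every threshold that is exceeded.
/// Integer cross-multiplication avoids floating-point comparisons.
pub fn firing_alerts(s: &Snapshot) -> Vec<&'static str> {
    let mut alerts = Vec::new();

    let lookups = s.cache_hits_total + s.cache_misses_total;
    if lookups > 0 && s.cache_misses_total * 100 > lookups * 20 {
        alerts.push("cache miss rate > 20%");
    }
    if s.pool_max_connections > 0
        && s.pool_available_connections * 100 < s.pool_max_connections * 20
    {
        alerts.push("pool available < 20% of max");
    }
    if s.http_requests_total > 0 && s.errors_total * 100 > s.http_requests_total {
        alerts.push("error rate > 1%");
    }
    alerts
}
```

The comparisons are strict, so a window that sits exactly on a threshold (for example a 20% miss rate) does not fire.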
Enable tracing for performance analysis:
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
OTEL_SERVICE_NAME=nebula
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1 # Sample 10% of requests

Use profiling tools to identify bottlenecks:
# CPU profiling with perf
perf record -F 99 -p <pid> -- sleep 30
perf report
# Memory profiling with jemalloc
MALLOC_CONF="prof:true,prof_active:true,lg_prof_sample:19" ./nebula-production
# Flamegraph generation
cargo flamegraph --bin nebula-production

Use tools like wrk or vegeta for load testing:
# Install wrk
apt-get install wrk
# Run benchmark
wrk -t12 -c400 -d30s http://localhost:8080/api/endpoint

# Example benchmark configuration
target: "http://localhost:8080"
concurrency: 100
duration: 60s
ramp_up: 10s

Establish performance baselines:
| Metric | Target |
|---|---|
| p50 latency | < 50ms |
| p95 latency | < 200ms |
| p99 latency | < 500ms |
| Error rate | < 0.1% |
| Throughput | > 1000 req/s |
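To check recorded latencies against the percentile targets above, a nearest-rank percentile is enough for a first pass. This is a minimal sketch; the function names are hypothetical, samples are assumed to be in milliseconds, and load-testing tools such as wrk report their own percentiles, which may use a different interpolation method.

```rust
/// Nearest-rank percentile of latency samples (milliseconds).
/// Returns None for empty input or when `p` is outside 1..=100.
pub fn percentile(samples: &[u64], p: u64) -> Option<u64> {
    if samples.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let n = sorted.len() as u64;
    // rank = ceil(p / 100 * n), which always lies in 1..=n here.
    let rank = (p * n + 99) / 100;
    Some(sorted[(rank - 1) as usize])
}

/// Checks the p50, p95, and p99 targets from the baseline table.
pub fn meets_latency_targets(samples_ms: &[u64]) -> bool {
    matches!(
        (
            percentile(samples_ms, 50),
            percentile(samples_ms, 95),
            percentile(samples_ms, 99),
        ),
        (Some(p50), Some(p95), Some(p99)) if p50 < 50 && p95 < 200 && p99 < 500
    )
}
```

Sorting a copy keeps the caller's sample order intact. For long-running services a histogram is usually preferable to retaining every sample.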
Before production deployment:
- Caching configured with appropriate TTL
- Connection pool sized correctly
- Load balancing strategy selected
- Compression enabled
- Timeouts configured
- Metrics collection enabled
- Tracing configured
- Release build optimizations enabled
- Load testing completed
- Performance baselines established
Causes of high latency:
- Slow database queries
- Network latency
- Insufficient resources
- Lock contention
Solutions:
- Add database indexes
- Enable caching
- Scale horizontally
- Optimize critical paths
Causes of high memory usage:
- Memory leaks
- Large cache size
- Inefficient data structures
- Insufficient garbage collection (not applicable to Rust)
Solutions:
- Profile memory usage
- Reduce cache size
- Use efficient data structures
- Set memory limits
Causes of high CPU usage:
- Inefficient algorithms
- Lack of parallelism
- Excessive logging
- Hot loops
Solutions:
- Profile CPU usage
- Use parallel processing
- Reduce log verbosity
- Optimize hot paths
Causes of connection pool exhaustion:
- Pool too small
- Connection leaks
- Long-running queries
- High concurrency
Solutions:
- Increase pool size
- Fix connection leaks
- Optimize queries
- Implement queuing
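Two of these solutions, queuing and leak prevention, can be sketched together with a bounded permit pool. Callers wait in `acquire` until a permit is free, and an RAII guard returns the permit when dropped, so a permit cannot leak on an early return. `Permits` and `Permit` are hypothetical types built on the standard library and are not the Nebula `ConnectionPool`; in async code, an async semaphore such as `tokio::sync::Semaphore` plays the same role.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Bounded set of permits; callers block in `acquire` until one is free.
pub struct Permits {
    free: Mutex<usize>,
    released: Condvar,
}

/// RAII guard: the permit goes back to the pool when this is dropped.
pub struct Permit {
    pool: Arc<Permits>,
}

impl Permits {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { free: Mutex::new(max), released: Condvar::new() })
    }

    /// Blocks (queues) the calling thread until a permit is available.
    pub fn acquire(self: &Arc<Self>) -> Permit {
        let mut free = self.free.lock().unwrap();
        while *free == 0 {
            free = self.released.wait(free).unwrap();
        }
        *free -= 1;
        Permit { pool: Arc::clone(self) }
    }

    /// Non-blocking variant: returns None when the pool is exhausted.
    pub fn try_acquire(self: &Arc<Self>) -> Option<Permit> {
        let mut free = self.free.lock().unwrap();
        if *free == 0 {
            return None;
        }
        *free -= 1;
        Some(Permit { pool: Arc::clone(self) })
    }

    pub fn available(&self) -> usize {
        *self.free.lock().unwrap()
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        *self.pool.free.lock().unwrap() += 1;
        self.pool.released.notify_one();
    }
}
```

Using `try_acquire` lets a handler shed load with an immediate error instead of queuing, which can be the better choice when callers have their own timeouts.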