⚡ Bolt: Optimize telemetry pipeline and fix state cache regression by heidi-dang · Pull Request #326 · heidi-dang/heidi-engine

heidi-dang · 2026-05-13T10:38:25Z

💡 What: Optimized the telemetry pipeline by implementing a thread-safe cache for pricing configuration and improving event flushing I/O. Also fixed a critical NameError in the state retrieval logic.

🎯 Why: High-frequency operations like token tracking and event flushing were performing redundant disk I/O and JSON parsing, impacting overall pipeline efficiency. Additionally, a regression in the state cache logic was causing application crashes.

📊 Impact:

Resolves a critical crash in get_state.
~2.5x speedup for load_pricing_config lookups (measured 2.4s vs 6.0s for 100k iterations).
Reduced overhead in the telemetry event bus.
Compliance with modern Python (3.12+) datetime standards.

🔬 Measurement:

Verified with pytest tests/test_telemetry_cache.py.
Benchmarked cache impact with a dedicated script.
Linted with ruff check.

PR created automatically by Jules for task 17046132322638879759 started by @heidi-dang

This commit implements several performance improvements and a critical bug fix in the telemetry module: 1. 🐞 **Fix critical NameError**: Resolved a regression in `get_state` where an undefined `target_run_id` caused crashes on cache misses. 2. ⚡ **Pricing Cache**: Implemented a thread-safe module-level cache for `load_pricing_config` with a 5.0s TTL. This eliminates redundant disk I/O and JSON parsing during high-frequency token tracking, yielding a ~2.5x speedup in benchmarks. 3. ⚡ **Flush Optimization**: Optimized `flush_events` by using `f.writelines()` with a generator expression, reducing Python-to-C overhead during disk writes. 4. 🛠️ **Modernized Timestamps**: Replaced deprecated `datetime.utcnow()` with `datetime.now(timezone.utc)` for Python 3.12+ compatibility. Verified with unit tests (`pytest`) and micro-benchmarks. No breaking changes introduced.

google-labs-jules · 2026-05-13T10:38:27Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

gemini-code-assist

Code Review

This pull request introduces several performance optimizations for the telemetry engine, including a thread-safe module-level cache for pricing configurations and the use of writelines for more efficient event logging. It also updates timestamp handling to use timezone-aware datetime objects. A potential race condition in get_run_id() was identified, which should be addressed to ensure consistency across high-frequency calls.

gemini-code-assist · 2026-05-13T10:42:34Z

+
+        # Check for pricing config file
+        pricing_file = (
+            Path(PRICING_CONFIG_PATH) if PRICING_CONFIG_PATH else get_run_dir() / "pricing.json"


The call to get_run_dir() eventually invokes get_run_id(), which has a race condition when initializing the global RUN_ID (lines 440-446). While load_pricing_config is protected by _pricing_lock, other high-frequency functions like emit_event call get_run_id() without this lock. This could lead to multiple threads generating different run IDs if they hit the initialization path simultaneously. Given the focus on thread-safety in this PR, get_run_id() should be updated to use a lock for its initialization logic.

gemini-code-assist Bot reviewed May 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

⚡ Bolt: Optimize telemetry pipeline and fix state cache regression#326

⚡ Bolt: Optimize telemetry pipeline and fix state cache regression#326
heidi-dang wants to merge 1 commit into
feat/bootstrap-scaffoldfrom
bolt-telemetry-optimization-v2-17046132322638879759

heidi-dang commented May 13, 2026

Uh oh!

google-labs-jules Bot commented May 13, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

heidi-dang commented May 13, 2026

Uh oh!

google-labs-jules Bot commented May 13, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant