feat(scheduler): split store_cache_main_prep into sub-phase timers #1243

Merged
jundot merged 1 commit into jundot:main from ivaniguarans:feat/store-cache-phase-timer-granularity
May 14, 2026

@ivaniguarans
Contributor

Summary

store_cache_main_prep wraps three distinct operations as a single timing measurement — boundary override lookup, KV array collection, and async_eval dispatch. When diagnosing store-cache overhead across model architectures (standard transformer vs GatedDeltaNet/ArraysCache vs MoE), the aggregate doesn't reveal which sub-step dominates. On ArraysCache models, for example, collect and dispatch are both near-zero (empty pre_eval_arrays), but the single timer can't distinguish that from a model where boundary merge dominates.

Changes

Split into three sub-phase timers following the boundary_capture_sync / boundary_capture_extract / boundary_snapshot_save pattern already in use:

  • store_cache_main_boundary: boundary override lookup + optional merge with full cache
  • store_cache_main_collect: _collect_arrays_from_extracted_cache traversal
  • store_cache_main_dispatch: mx.async_eval enqueue

The mx.stream(generation_stream) context remains the outer wrapper. No behavioral change — only diagnostic granularity.
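
For readers unfamiliar with the pattern, here is a minimal sketch of how one aggregate timer can be split into three sub-phase timers. It is not the scheduler's actual code: `PhaseStats`, `phase`, and `prepare_store_cache` are hypothetical names, the cache is a plain nested list, and the dispatch step only counts arrays where the real path would enqueue `mx.async_eval` inside the `mx.stream(generation_stream)` context.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseStats:
    """Hypothetical accumulator: total seconds and call count per phase name."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    @contextmanager
    def phase(self, name):
        # Time the enclosed block and record it even if the block raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.total[name] += time.perf_counter() - start
            self.count[name] += 1


def prepare_store_cache(stats, extracted_cache, boundary_override=None):
    """Illustrative stand-in for the prep path, timed as three sub-phases."""
    with stats.phase("store_cache_main_boundary"):
        # Assumption: an override, when present, replaces the extracted cache.
        cache = boundary_override if boundary_override is not None else extracted_cache

    with stats.phase("store_cache_main_collect"):
        # Flatten per-layer arrays into one list.
        arrays = [array for layer in cache for array in layer]

    with stats.phase("store_cache_main_dispatch"):
        # The real code would enqueue mx.async_eval(*arrays) here.
        dispatched = len(arrays)

    return dispatched


stats = PhaseStats()
dispatched = prepare_store_cache(stats, [[1, 2], [3]])
```

Because each sub-step is recorded under its own name, a near-zero collect and dispatch (as with empty pre_eval_arrays) stays distinguishable from a run where the boundary step dominates.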

Test plan

Replace the aggregate store_cache_main_prep timer with three
sub-phase timers (boundary / collect / dispatch) mirroring the
boundary_capture_* granularity already in use. Helps isolate which
store-cache prep sub-step dominates across model architectures.
@jundot
Owner

jundot commented May 14, 2026

Thanks for this. The split matches the existing boundary_capture_* timers and is directly useful for the paged SSD store-cache perf work. Confirmed no other references to store_cache_main_prep and the phase-stats log iterates generically, so the new names surface on their own. Merging for the next release.
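
A rough illustration of why a generically iterating log picks up renamed phases without further changes. The repository's actual log format is not shown in this thread, so `format_phase_stats` and the sample timings below are assumptions made purely for illustration.

```python
def format_phase_stats(totals):
    """Render one line per phase, slowest first, without naming any phase."""
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [f"{name}: {seconds * 1000:.2f} ms" for name, seconds in ordered]


# Made-up timings in seconds; any key present in the dict is reported.
totals = {
    "boundary_capture_sync": 0.002,
    "store_cache_main_boundary": 0.004,
    "store_cache_main_collect": 0.0,
    "store_cache_main_dispatch": 0.0005,
}
lines = format_phase_stats(totals)
```

Since the formatter only walks the dictionary, adding or renaming a timer changes the output without touching the logging code.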

@jundot jundot merged commit 7fab13b into jundot:main May 14, 2026
@ivaniguarans ivaniguarans deleted the feat/store-cache-phase-timer-granularity branch May 19, 2026 10:35