Summary
In v0.3.9rc1 with cache.enabled: true and cache.hot_cache_only: true, the runtime correctly honors the hot-cache-only mode (no SSD writes, no queue full warnings, ~/.omlx/cache stays empty) — but the scheduler's cache-init log still emits the same paged SSD cache enabled: cache_dir=… line as a full-SSD-cache configuration. The log is misleading; operators verifying that PR #864 took effect can't tell from the log alone.
Environment
- oMLX:
0.3.9rc1 (DMG build, macOS 26 Tahoe)
- Hardware: Apple M2 Pro, 16 GB unified
- Settings (
~/.omlx/settings.json):
"cache": {
"enabled": true,
"hot_cache_only": true,
"ssd_cache_dir": "/Users/.../.omlx/cache",
"ssd_cache_max_size": "46GB",
"hot_cache_max_size": "2GB",
"initial_cache_blocks": 256
}
Observed log lines on first model load after server restart
omlx.cache.paged_cache - INFO - PagedCacheManager initialized: block_size=256, initial_blocks=256, max_blocks=100000, max_tokens=25600000
omlx.cache.paged_ssd_cache - INFO - PagedSSDCacheManager initialized: dir=/Users/.../.omlx/cache, max_size=46.00 GB, hot_cache=2.00 GB, existing_files=0, disk_free=60.00 GB, cache_used=0 B
omlx.cache.paged_cache - INFO - paged SSD cache manager connected to PagedCacheManager
omlx.cache.prefix_cache - INFO - PagedSSDCacheManager connected to BlockAwarePrefixCache
omlx.scheduler - INFO - paged SSD cache enabled: cache_dir=/Users/.../.omlx/cache, max_size=46.00 GB, block_size=256 tokens
omlx.scheduler - INFO - paged SSD cache enabled: /Users/.../.omlx/cache, block_size=256, max_blocks=100000
The two paged SSD cache enabled lines from omlx.scheduler are identical whether hot_cache_only is true or false.
Runtime behavior is actually correct
To confirm the fix from PR #864 is doing its job at runtime:
- After 2-pass benchmark (same 2K-token prefix, different user message): pass-2 was 4.4× faster than pass-1 → hot cache hits are working.
du -sh ~/.omlx/cache stays at 0 B across the session — no SSD writes.
grep "queue full" ~/.omlx/logs/server.log after the test session: zero matches.
So the runtime correctly skips the SSD layer; only the init log text is stale.
Suggested fix
Branch the scheduler log on cache.hot_cache_only:
if hot_cache_only:
log.info("hot-cache-only mode enabled: hot_cache=%s, block_size=%d tokens",
hot_cache_size_human, block_size)
else:
log.info("paged SSD cache enabled: cache_dir=%s, max_size=%s, block_size=%d tokens",
cache_dir, max_size_human, block_size)
Same treatment for the second duplicate line a few lines below in the same scheduler init path.
Why this matters
Happy to send a small PR if useful.
Proudly Made in Nebraska. Go Big Red! 🌽 https://xkcd.com/2347/
Summary
In
v0.3.9rc1withcache.enabled: trueandcache.hot_cache_only: true, the runtime correctly honors the hot-cache-only mode (no SSD writes, noqueue fullwarnings,~/.omlx/cachestays empty) — but the scheduler's cache-init log still emits the samepaged SSD cache enabled: cache_dir=…line as a full-SSD-cache configuration. The log is misleading; operators verifying that PR #864 took effect can't tell from the log alone.Environment
0.3.9rc1(DMG build, macOS 26 Tahoe)~/.omlx/settings.json):Observed log lines on first model load after server restart
The two
paged SSD cache enabledlines fromomlx.schedulerare identical whetherhot_cache_onlyis true or false.Runtime behavior is actually correct
To confirm the fix from PR #864 is doing its job at runtime:
du -sh ~/.omlx/cachestays at0 Bacross the session — no SSD writes.grep "queue full" ~/.omlx/logs/server.logafter the test session: zero matches.So the runtime correctly skips the SSD layer; only the init log text is stale.
Suggested fix
Branch the scheduler log on
cache.hot_cache_only:Same treatment for the second duplicate line a few lines below in the same scheduler init path.
Why this matters
hot_cache_only: truewas silently ignored, see [Feature Request] Enable 'Hot Cache' only (Disable 'Cold Cache') #605 / PR fix(cache): hot_cache_only - fix cache init bypass and Metal memory leak #864) check this log to confirm the flag is now honored. The stale text reads as "still broken" even when behavior is correct.Happy to send a small PR if useful.
Proudly Made in Nebraska. Go Big Red! 🌽 https://xkcd.com/2347/