Context
SPEC-03 describes caching baseline evaluations per commit hash as eval_baseline Cortex nodes. This avoids re-running baselines when the commit hasn't changed.
Current Behavior
EvaluationRunner.capture_baseline() runs all evaluation commands fresh every time. Baselines are held in memory only and not persisted to Cortex.
Expected Behavior
Before capturing a baseline, check Cortex for an existing eval_baseline node matching the current commit hash. If found, reuse it. After capturing, persist to Cortex for future reuse.
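The check-then-persist flow described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual runner code. The Cortex client API is not shown in this issue, so `InMemoryCortex`, `find_node`, `store_node`, and the `run_evaluations` callable are hypothetical names, and the free function `capture_baseline` stands in for `EvaluationRunner.capture_baseline()`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class InMemoryCortex:
    """Hypothetical stand-in for Cortex; the real client API may differ."""

    nodes: Dict[Tuple[str, str], dict] = field(default_factory=dict)

    def find_node(self, node_type: str, commit_hash: str) -> Optional[dict]:
        # Look up a node by its type and the commit hash it was captured for.
        return self.nodes.get((node_type, commit_hash))

    def store_node(self, node_type: str, commit_hash: str, payload: dict) -> None:
        # Persist the payload so later runs on the same commit can reuse it.
        self.nodes[(node_type, commit_hash)] = payload


def capture_baseline(
    cortex: InMemoryCortex,
    commit_hash: str,
    run_evaluations: Callable[[], dict],
) -> dict:
    """Return the baseline for commit_hash, running evaluations only on a cache miss."""
    cached = cortex.find_node("eval_baseline", commit_hash)
    if cached is not None:
        return cached  # Cache hit: skip the test/lint/build commands.

    baseline = run_evaluations()  # Cache miss: run everything fresh.
    cortex.store_node("eval_baseline", commit_hash, baseline)
    return baseline
```

With this shape, a second call for the same commit hash returns the stored node without invoking `run_evaluations` again, while a different commit hash triggers a fresh run.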
Impact
- Redundant test/lint/build runs on unchanged code
- Slower experiment cycles
- Not a correctness issue; only run time is affected
References
atlas-specs/03-EVALUATION.md — eval_baseline caching strategy
atlas/evaluation/runner.py — capture_baseline() always runs fresh