Goal: run the first recurring Research Radar trial week before selecting v0.8.
This is an operational recurring research process, not repo automation.
Scope:
- no new task family;
- no automation scripts in the public repo;
- no public scheduled jobs;
- no scraping or collectors in the repo;
- no private eval material;
- no customer data;
- no raw feeds committed to the public repo;
- no automatic GitHub issues unless explicitly approved.
Cadence:
- weekday morning Daily Research Radar brief;
- weekly synthesis checkpoint after the daily briefs;
- monthly watchlist pruning later, only if the trial is useful.
Daily brief rules:
- use v0.7 Research Radar watchlist, source map, repos, people, and query sets;
- focus on benchmark mechanics, not generic AI news;
- output brief only, not repo changes;
- at most 5 recommended actions;
- answer the question: Should Agent Bench Lab change direction today?
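The daily brief rules above can be sketched as a small record plus a check. This is an illustrative sketch only, meant for local or private use in line with the scope (not something to commit to the public repo). The names `DailyBrief`, `brief_problems`, and all field names are hypothetical and do not come from the v0.7 materials.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_ACTIONS = 5  # from the rule limiting recommended actions to 5


@dataclass
class DailyBrief:
    """Hypothetical shape of one daily brief; field names are assumptions."""
    date: str
    # Answer to "Should Agent Bench Lab change direction today?"
    # None means the question was left unanswered.
    change_direction: Optional[bool]
    recommended_actions: list[str] = field(default_factory=list)
    # Files the brief touched in the repo; the rules require this to stay empty.
    repo_changes: list[str] = field(default_factory=list)


def brief_problems(brief: DailyBrief) -> list[str]:
    """Return the daily-brief rules that this brief violates (empty if none)."""
    problems = []
    if len(brief.recommended_actions) > MAX_ACTIONS:
        problems.append(f"more than {MAX_ACTIONS} recommended actions")
    if brief.repo_changes:
        problems.append("brief made repo changes")
    if brief.change_direction is None:
        problems.append("direction question not answered")
    return problems
```

An empty list from `brief_problems` means the brief satisfies the rules that can be checked mechanically. Whether it focuses on benchmark mechanics rather than generic AI news still needs a human read.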
Weekly synthesis rules:
- review daily briefs;
- decide keep / modify / defer / stop;
- propose issues or decision-log entries;
- do not choose v0.8 from inertia.
Success criteria:
- 3 daily briefs completed;
- 1 weekly synthesis completed;
- weekly synthesis explicitly recommends keep / modify / defer / stop for each v0.8 candidate;
- no repo writes from daily briefs;
- no auto-created issues without approval;
- no private eval material or customer data.
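The success criteria above could be checked at the end of the trial week with something like the following. It is a minimal sketch under stated assumptions: `TrialWeek`, `unmet_criteria`, and the field names are hypothetical, "3 daily briefs completed" is read as "at least 3", and the counts are assumed to be tallied by hand rather than collected by any script in the repo.

```python
from dataclasses import dataclass, field

VERDICTS = {"keep", "modify", "defer", "stop"}
MIN_DAILY_BRIEFS = 3  # assumption: the criterion means "at least 3"


@dataclass
class TrialWeek:
    """Hypothetical hand-filled summary of one trial week."""
    daily_briefs_completed: int
    weekly_synthesis_completed: bool
    candidates: list[str]  # v0.8 candidates under review
    # Candidate name -> verdict recorded in the weekly synthesis.
    candidate_verdicts: dict[str, str] = field(default_factory=dict)
    repo_writes_from_briefs: int = 0
    unapproved_auto_issues: int = 0
    private_or_customer_material_used: bool = False


def unmet_criteria(week: TrialWeek) -> list[str]:
    """Return the success criteria the week did not meet (empty if all met)."""
    unmet = []
    if week.daily_briefs_completed < MIN_DAILY_BRIEFS:
        unmet.append("fewer than 3 daily briefs completed")
    if not week.weekly_synthesis_completed:
        unmet.append("weekly synthesis not completed")
    for name in week.candidates:
        if week.candidate_verdicts.get(name) not in VERDICTS:
            unmet.append(f"no keep/modify/defer/stop verdict for: {name}")
    if week.repo_writes_from_briefs > 0:
        unmet.append("daily briefs wrote to the repo")
    if week.unapproved_auto_issues > 0:
        unmet.append("issues were auto-created without approval")
    if week.private_or_customer_material_used:
        unmet.append("private eval material or customer data was used")
    return unmet
```

The function returns a list rather than a single pass/fail flag so the weekly synthesis can name exactly which criteria were missed.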
Possible v0.8 candidates:
- SEC-01 decision-grade / security overlay;
- Private Eval Bundle manifest validator;
- Report schema runtime v1;
- lightweight Research Radar automation;
- browser/replay standard;
- MCP-local extension;
- nothing yet.
Decision rule:
Do not pick v0.8 until weekly synthesis is complete.