feat: milestones v1.1–v1.5 — adapters, observability, testing, streaming #4
Open
RichardHightower wants to merge 48 commits into main
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…bing, and LazyLock caching
- CliStatus/CliDiscovery structs for 5 CLIs (claude, opencode, gemini, codex, copilot)
- probe_binary uses which + --help (sync std::process::Command, not tokio)
- probe_auth with per-CLI strategies (subcommand, env var, or both)
- LazyLock DISCOVERY static prints pre-flight summary on first access
- Version extraction from --help/--version output
Co-Authored-By: Claude Opus 4.6 <[email protected]>
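
The discovery commit above describes a synchronous probe, version extraction, and a once-per-process cache. A minimal sketch of that shape follows, using only the standard library (Rust 1.80 or later for `LazyLock`). The `CliStatus` fields and the token-based version heuristic are assumptions made for illustration, and the probe here simply tries to spawn `<name> --version` instead of calling `which` first; the PR's actual code may differ.

```rust
use std::process::Command;
use std::sync::LazyLock;

/// Hypothetical status record; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct CliStatus {
    pub name: &'static str,
    pub available: bool,
    pub version: Option<String>,
}

/// Return the first whitespace-separated token that looks like a dotted
/// number (for example "1.2.3" or "v0.9,"), stripped of a leading 'v' and
/// of trailing punctuation.
pub fn extract_version(output: &str) -> Option<String> {
    output
        .split_whitespace()
        .map(|tok| {
            tok.trim_start_matches('v')
                .trim_end_matches(|c: char| !c.is_ascii_digit())
        })
        .find(|tok| {
            tok.contains('.')
                && tok
                    .split('.')
                    .all(|part| !part.is_empty() && part.chars().all(|c| c.is_ascii_digit()))
        })
        .map(str::to_string)
}

/// Synchronous probe: a spawn error is treated as "binary not available".
pub fn probe_binary(name: &'static str) -> CliStatus {
    match Command::new(name).arg("--version").output() {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            CliStatus { name, available: true, version: extract_version(&text) }
        }
        Err(_) => CliStatus { name, available: false, version: None },
    }
}

/// Probes run once, on first access, and the result is reused afterwards.
pub static DISCOVERY: LazyLock<Vec<CliStatus>> = LazyLock::new(|| {
    ["claude", "opencode", "gemini", "codex", "copilot"]
        .into_iter()
        .map(probe_binary)
        .collect()
});
```

Reading `DISCOVERY` from any test triggers the probes at most once per process, which is the property the commit relies on.
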
…caching
- cli_capabilities.toml with 5 CLI configs (hooks, auto-approve, prompt delivery)
- CliCapability struct with Deserialize + convenience methods
- load_capabilities() uses include_str! for compile-time embedding
- CAPABILITIES LazyLock for once-per-process loading
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- SUMMARY.md with execution metrics and self-check
- STATE.md updated with position, decisions, and session info
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Fake HOME + XDG_CONFIG_HOME prevents CLIs from reading real user config
- Git-initialized workspace with user identity for CLI compatibility
- apply_env helpers for std::process and tokio::process Commands
- No std::env::set_var -- all overrides via Command::env() only
Co-Authored-By: Claude Opus 4.6 <[email protected]>
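
The workspace-isolation idea above (per-child overrides through `Command::env()`, never `std::env::set_var`) can be sketched as follows. The helper signature is an assumption for illustration; only the `std::process::Command` variant is shown, and the `.config` subdirectory is a conventional choice, not something the commit states.

```rust
use std::path::Path;
use std::process::Command;

/// Point HOME and XDG_CONFIG_HOME at a throwaway directory for one child
/// process. The parent's environment is left untouched, so tests running
/// in parallel cannot interfere with each other.
pub fn apply_env<'a>(cmd: &'a mut Command, fake_home: &Path) -> &'a mut Command {
    cmd.env("HOME", fake_home)
        .env("XDG_CONFIG_HOME", fake_home.join(".config"))
}
```

Because the overrides live on the `Command` value, they can be inspected with `Command::get_envs()` before the child is ever spawned.
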
… wire all Phase 29 modules
- Add SKIP_LOG global counter and record_skip() to cli_discovery.rs
- Create test_discovery.rs with 11 tests: discovery, capabilities, workspace, skip macros, summary
- Wire cli_discovery, cli_capabilities, cli_workspace, test_discovery into e2e.rs
- Fix pre-existing test_extract_version_none bug (test string contained "version")
- require_cli!, require_cli_auth!, require_capability! macros with skip recording
Co-Authored-By: Claude Opus 4.6 <[email protected]>
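
A skip macro with recording, as named in the commit above, might look roughly like this. It is a sketch under assumptions: `SKIP_LOG` is modelled as a mutex-guarded list of `(cli, reason)` pairs instead of a counter, `cli_available` is a hypothetical PATH lookup standing in for the discovery cache, and `example_test_body` exists only to show the early return.

```rust
use std::sync::Mutex;

/// Process-wide record of skipped tests as (cli, reason) pairs.
pub static SKIP_LOG: Mutex<Vec<(String, String)>> = Mutex::new(Vec::new());

pub fn record_skip(cli: &str, reason: &str) {
    SKIP_LOG
        .lock()
        .unwrap()
        .push((cli.to_string(), reason.to_string()));
}

/// Hypothetical availability check: is a file with this name on PATH?
pub fn cli_available(cli: &str) -> bool {
    std::env::var_os("PATH")
        .map(|paths| std::env::split_paths(&paths).any(|dir| dir.join(cli).is_file()))
        .unwrap_or(false)
}

/// Return early from the enclosing test, recording why, when the CLI is missing.
macro_rules! require_cli {
    ($cli:expr) => {
        if !cli_available($cli) {
            record_skip($cli, "binary not found");
            return;
        }
    };
}

/// Illustrative test body: `ran` is set only when the CLI is present.
pub fn example_test_body(cli: &str, ran: &mut bool) {
    require_cli!(cli);
    *ran = true;
}
```

The macro expands to a plain `return`, so it only fits functions that return `()`, which is the usual shape of a `#[test]` function.
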
- 29-02-SUMMARY.md with metrics, decisions, deviation documentation
- STATE.md updated: Phase 29 complete (2/2 plans), ready for Phase 30
Co-Authored-By: Claude Opus 4.6 <[email protected]>
5/5 requirements done (DISC-01..04, FAIL-08). 17/17 must-haves verified. 462 tests passing (386 unit + 76 e2e), zero regressions.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Enable research step in GSD workflow config and remove completed cch→rulez renaming todo that is no longer relevant.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…uction
- RealCliHarness wraps TestHarness with real GenericCliAdapter from CliAdapterConfig builtins
- Supports all 5 CLIs: claude, opencode, gemini, codex, copilot
- Provides build_real_registry() for constructing AdapterRegistry with real adapter
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…st binary
- Add real_cli_harness to Phase 29 infrastructure section
- Add test_smoke placeholder module for Phase 30 smoke tests
- Both modules compile as part of the e2e test binary
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Created 30-01-SUMMARY.md with execution results
- Updated STATE.md position to Phase 30 plan 1/2
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Three async helper functions: smoke_echo, smoke_file_creation, smoke_model_flag
- Module-local skip macros: require_cli!, require_cli_auth!, require_capability!
- Structural-only assertions (never assert on AI output content)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- SMOK-01: 5 echo round-trip tests (claude, opencode, gemini, codex, copilot)
- SMOK-02: 5 file creation tests verifying marker file written to disk
- SMOK-03: 5 model flag passthrough tests verifying history entry records model
- zzz_smoke_skip_summary test prints accumulated skip table
- All 15 tests are #[ignore] and gated by require_cli_auth! macro
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Phase 30 complete: RealCliHarness infrastructure + 15 smoke tests (5 CLIs x 3 requirements: echo, file creation, model flag). All 8/8 must-haves verified. 77 e2e tests pass, 15 ignored (real CLI).
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- 5 FAIL-05 tests: missing binary produces Crashed state for all 5 CLI adapters
- 5 FAIL-06 tests: auth failure produces Failed state for all 5 CLI adapters
- Shared helpers: adapter_config_for(), fail05_missing_binary(), fail06_auth_failure()
- Module registered in e2e.rs
Co-Authored-By: Claude Opus 4.6 <[email protected]>
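
The distinction between the two failure modes above can be illustrated with a small classifier. This is a sketch only: the `RunState` enum and the rule "spawn error means Crashed, non-zero exit means Failed" are assumptions chosen to match the commit's wording, not the adapter's actual logic.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};

/// Hypothetical terminal states, named after the ones in the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum RunState {
    Completed,
    Failed,
    Crashed,
}

/// Assumed mapping: the process never started -> Crashed; it started but
/// exited unsuccessfully (e.g. an auth error) -> Failed.
pub fn classify(result: io::Result<ExitStatus>) -> RunState {
    match result {
        Ok(status) if status.success() => RunState::Completed,
        Ok(_) => RunState::Failed,
        Err(_) => RunState::Crashed,
    }
}

/// Run a program to completion with all stdio detached and classify the result.
pub fn run_once(program: &str, args: &[&str]) -> RunState {
    classify(
        Command::new(program)
            .args(args)
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status(),
    )
}
```

A missing binary surfaces as an `io::Error` from `status()`, so `run_once` on a nonexistent program name yields `RunState::Crashed` without any platform-specific handling.
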
…dapters
- 5 FAIL-07 tests: SIGTERM-resistant script with 2s timeout produces Timeout state
- Proves SIGTERM->SIGKILL escalation: elapsed >= 3s (timeout + grace period)
- Uses trap '' TERM with busy-wait pattern to resist SIGTERM
- All 15 tests pass together, full suite has no regressions
Co-Authored-By: Claude Opus 4.6 <[email protected]>
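
The escalation described above (wait for the timeout, send SIGTERM, wait a grace period, then SIGKILL) can be sketched with the standard library on Unix. Everything here is an assumption for illustration: the function names, the polling interval, and sending SIGTERM by shelling out to `kill -TERM`, since `std` has no signal-sending API. The real adapter runs on tokio and may do this differently.

```rust
use std::process::{Child, Command, Stdio};
use std::thread::sleep;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Exited,
    Timeout,
}

/// Poll until the child exits or the deadline passes; `true` means it exited.
fn wait_until(child: &mut Child, deadline: Instant) -> std::io::Result<bool> {
    loop {
        if child.try_wait()?.is_some() {
            return Ok(true);
        }
        if Instant::now() >= deadline {
            return Ok(false);
        }
        sleep(Duration::from_millis(10));
    }
}

/// Unix-only sketch: SIGTERM after `timeout`, SIGKILL after a further `grace`.
pub fn run_with_timeout(
    mut child: Child,
    timeout: Duration,
    grace: Duration,
) -> std::io::Result<Outcome> {
    if wait_until(&mut child, Instant::now() + timeout)? {
        return Ok(Outcome::Exited);
    }
    // Polite request first; errors from the `kill` utility are ignored.
    let _ = Command::new("kill")
        .arg("-TERM")
        .arg(child.id().to_string())
        .status();
    if !wait_until(&mut child, Instant::now() + grace)? {
        child.kill()?; // SIGKILL on Unix
        child.wait()?; // reap the process
    }
    Ok(Outcome::Timeout)
}

/// Spawn a shell that ignores SIGTERM and busy-waits, as in the FAIL-07 tests.
pub fn spawn_sigterm_resistant() -> std::io::Result<Child> {
    Command::new("sh")
        .arg("-c")
        .arg("trap '' TERM; while :; do :; done")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
}
```

With a SIGTERM-resistant child, the total elapsed time is at least `timeout + grace`, which is the same lower bound the FAIL-07 tests assert on (2s timeout, elapsed >= 3s).
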
- SUMMARY.md with 15 per-adapter failure mode tests across FAIL-05/06/07
- STATE.md updated: Phase 31 complete, metrics recorded
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add AGCRON_SKIP::{cli}::{reason} stdout marker in record_skip() for report parsing
- Add quick-junit = "0.5" to [dependencies] for JUnit XML generation
- Add [[bin]] target for test-report binary
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…nit output
- Parse cargo test JSON line-by-line for test results
- Detect skips via AGCRON_SKIP:: stdout marker (not just event status)
- Generate test-results.json with CLI x scenario matrix (REPT-01)
- Generate colored terminal table with per-CLI tallies (REPT-02)
- Generate test-results.xml JUnit XML with one suite per CLI (REPT-03)
- Dynamic discovery of CLIs and scenarios from test output
- 15 unit tests covering parsing, skip detection, and output generation
Co-Authored-By: Claude Opus 4.6 <[email protected]>
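
Skip detection through the `AGCRON_SKIP::{cli}::{reason}` marker could be sketched as below. The function names and the per-CLI tally are illustrative assumptions; the JSON parsing and JUnit generation from the commit are out of scope here because they need external crates.

```rust
use std::collections::BTreeMap;

const MARKER: &str = "AGCRON_SKIP::";

/// Find a skip marker anywhere in a line of captured stdout and split it
/// into (cli, reason). Returns None when the line carries no marker.
pub fn parse_skip_marker(line: &str) -> Option<(&str, &str)> {
    let start = line.find(MARKER)?;
    let rest = &line[start + MARKER.len()..];
    let (cli, reason) = rest.split_once("::")?;
    if cli.is_empty() {
        return None;
    }
    Some((cli, reason.trim_end()))
}

/// Count skips per CLI across a whole captured stdout buffer.
pub fn tally_skips(stdout: &str) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for (cli, _reason) in stdout.lines().filter_map(parse_skip_marker) {
        *counts.entry(cli.to_string()).or_insert(0) += 1;
    }
    counts
}
```

Scanning stdout for a marker, instead of trusting the test event status alone, is what lets a test that returned early still be reported as skipped.
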
- GitHub Actions workflow with 3AM UTC cron schedule and manual dispatch
- Per-CLI API key secrets (Anthropic, OpenAI, Gemini, GitHub)
- Nightly Rust toolchain for --format json test output
- JUnit report via mikepenz/action-junit-report with test-results.xml
- Matrix summary written to GITHUB_STEP_SUMMARY and stdout
- Artifact upload on failure (JSON, XML, logs, 14-day retention)
- Fork guard prevents runs on non-SpillwaveSolutions repos
- Copilot browser OAuth skip documented in header comments
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Summary documenting workflow creation and verification
- STATE.md updated: Phase 32 complete, v1.5 milestone complete
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…verified
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Closes critical audit gap: Phase 31's 15 failure tests lacked #[ignore], excluding them from CI pipeline's --ignored run. Adds Phase 33 to restore FAIL-05/06/07 and CIPL-04 coverage in nightly CI.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Annotated 5 FAIL-05 (missing binary) tests with #[ignore]
- Annotated 5 FAIL-06 (auth failure) tests with #[ignore]
- Annotated 5 FAIL-07 (timeout SIGKILL) tests with #[ignore]
- Tests now visible to CI pipeline's cargo test -- --ignored run
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Created 33-01-SUMMARY.md with execution results
- Updated STATE.md with phase 33 position and decisions
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- REQUIREMENTS.md: check all v1.5 requirement boxes (were unchecked despite Done in traceability)
- ROADMAP.md: update v1.5 progress row (was TBD/In progress, now 8 plans/Complete/2026-03-05)
- STATE.md: fix velocity metrics (v1.5 not v1.4, 8 plans, ~2.5min avg not ~12min)
- PROJECT.md: v1.5 now shown as completed, not current/starting
- MILESTONES.md: add v1.5 section
- v1.5-MILESTONE-AUDIT.md: status passed (was gaps_found), GAP-1 closed by Phase 33
- Remove empty phases/18-e2e-test-infrastructure directory
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Archive ROADMAP.md → milestones/v1.5-ROADMAP.md
- Archive REQUIREMENTS.md → milestones/v1.5-REQUIREMENTS.md
- Move MILESTONE-AUDIT.md → milestones/v1.5-MILESTONE-AUDIT.md
- Delete REQUIREMENTS.md (fresh for next milestone)
- Collapse ROADMAP.md v1.5 into <details> block
- Evolve PROJECT.md: v1.5 requirements validated, key decisions added
- Update STATE.md: milestone complete, ready for next
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary
…logs show/logs tail), live monitoring (watchdashboard), alerting & webhook notifications, follow-log path alignment
130 files changed, ~23k lines added across Rust source, E2E tests, and planning docs. All 269 tests pass (189 unit + 76 E2E + 4 doc-tests).
Test plan
cargo test: 269 passed, 0 failed
🤖 Generated with Claude Code