
docs: add Install Route Test Plan and evaluation kit (tests/install-routes/) #61

@aacarter1

Summary

Add an Install Route Test Plan documentation page plus a small reusable
evaluation kit so that contributors and adopters can compare CodeGuard
delivery models (rule files, Agent Skills, and MCP) in a structured,
repeatable way.

This is intentionally a follow-up to #60, which added the install-path
documentation and the CoSAI personas page. Splitting it out keeps the
docs PR focused on the install routes themselves, and lets this PR ship
the test plan alongside the fixtures and prompts it depends on.

Motivation

The repository already validates rule integrity and generated bundles
(src/validate_unified_rules.py, src/convert_to_ide_formats.py,
.github/workflows/validate-rules.yml). Those checks do not measure:

  • whether a given AI client actually activates CodeGuard guidance
    consistently,
  • whether skills are invoked reliably enough in a target workflow,
  • whether MCP latency or availability is acceptable in practice,
  • whether context efficiency is materially better for a given repo,
  • which delivery mechanism is easiest for a team to operate.

Today there is no consistent, repeatable way for an evaluator to answer
those questions side-by-side across rule files, Agent Skills, and
MCP. This issue tracks closing that gap.

Proposed deliverables

Documentation

  • docs/install-route-test-plan.md (new)

    • Defines what is under test (delivery model, not rule content).
    • Lists evaluation dimensions: activation correctness, scoping
      accuracy, repeatability, latency, context efficiency, installation
      effort, update workflow, offline behavior, auditability, governance
      fit.
    • Provides a 9-test manual suite covering installation verification,
      file-scope activation, security-sensitive generation, false-positive
      control, repeated-run consistency, latency, offline/degraded
      behavior, update workflow, and team reproducibility.
    • Provides a measurement template, decision rubric, lightweight
      automation guidance, and a recommended first-pass workflow.
    • Annotates the audience list (individual / team / organization) and
      the delivery-models table with the responsible CoSAI personas, and
      includes a persona-aligned ownership summary in the decision rubric.
  • Re-add cross-links removed in #60 once this page exists:

    • docs/install-paths.md: Install Route Test Plan link in the
      Bottom line section.
    • docs/getting-started.md: Install Route Test Plan and
      tests/install-routes/ references in the Testing the Integration
      section.
    • docs/personas.md: cross-link to the persona-aligned ownership
      summary, plus an entry in Further Reading.
    • mkdocs.yml: Install Route Test Plan: install-route-test-plan.md
      nav entry between Choosing an Install Path and CoSAI Personas.
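As a concrete sketch, the proposed mkdocs.yml nav entry might sit like the fragment below (the neighboring entries are assumed from the existing nav and may differ in the actual file):

```yaml
nav:
  - Choosing an Install Path: install-paths.md
  - Install Route Test Plan: install-route-test-plan.md
  - CoSAI Personas: personas.md
```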

Evaluation kit (tests/install-routes/)

  • README.md explaining how to use the kit alongside the test plan.
  • prompts/:
    • security-generation.md — password hashing, session handling,
      parameterized data access prompts.
    • security-review.md — review prompts that pair with the language
      fixtures.
    • scope-control.md — code-vs-Markdown comparison prompts.
    • false-positive-control.md — non-security prompts to detect
      over-activation.
  • fixtures/:
    • python/password_storage_review.py — intentionally weak password
      storage example for review tests.
    • javascript/session_login_review.js — intentionally weak login flow
      with concatenated SQL and unsafe cookies for review tests.
    • typescript/input_validation_review.ts — intentionally weak API
      handler with concatenated SQL and missing authorization for review
      tests.
    • markdown/changelog.md — benign changelog used as the
      "should NOT trigger security guidance" control file.
  • results/first-pass-template.md — copy-per-route template capturing
    test metadata, per-test results, scored summary, and a preliminary
    recommendation.
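The exact fixture contents are up to the implementer, but as an illustrative sketch (the function name and the specific weakness here are assumptions, not the shipped fixture), python/password_storage_review.py could pair the acceptance-criteria header comment with an obviously weak hashing scheme:

```python
# SECURITY EVALUATION FIXTURE: INTENTIONALLY INSECURE.
# This file is evaluation bait for the Install Route Test Plan.
# The weaknesses below are deliberate; do not "fix" them, and do not
# copy this pattern into production code.
import hashlib


def store_password(username: str, password: str) -> dict:
    # Weak on purpose: unsalted, fast MD5 instead of a slow, salted
    # KDF such as bcrypt, scrypt, or Argon2.
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return {"user": username, "password_hash": digest}
```

A review prompt pointed at a file like this should elicit CodeGuard's password-storage guidance; the header comment keeps static scanners and future readers from filing the violations as bugs.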

Acceptance criteria

  • docs/install-route-test-plan.md is added and renders cleanly in
    mkdocs serve.
  • tests/install-routes/ directory is added with the README, four
    prompt files, four fixtures, and the results template.
  • Cross-links removed in #60 are restored across install-paths.md,
    getting-started.md, personas.md, and mkdocs.yml.
  • The three intentionally insecure fixtures carry a clear header
    comment explaining that they are evaluation bait, so static
    scanners and future readers do not treat the violations as bugs to
    fix.
  • First-pass workflow documented in
    docs/install-route-test-plan.md works end-to-end against at
    least one client/tool using the bundled prompt corpus and
    fixtures.
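To make the copy-per-route idea concrete, the results template could open along these lines (headings and field names are illustrative assumptions, not the shipped template):

```markdown
# Install Route First Pass: <route: rule files | Agent Skills | MCP>

- Client/tool under test:
- CodeGuard version / bundle:
- Date / evaluator:

## Per-test results

| Test | Result (pass / fail / partial) | Notes |
|------|--------------------------------|-------|
| 1. Installation verification   |      |       |

## Scored summary and preliminary recommendation
```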

Out of scope

  • Building a programmatic test harness or CI integration for the
    evaluation kit (the first pass is intentionally human-run; automation
    hooks are noted as a future enhancement in the test plan).
  • Shipping a Project CodeGuard MCP server. The MCP route is evaluated
    against an evaluator's own MCP deployment.
  • Re-validating CodeGuard rule content. This issue evaluates delivery
    mechanisms, not rule correctness.
