feat: add independent review pipeline (plan-reviewer, review-policy, methodology gates)

HerbHall · claude · HerbHall · commit ca8e75ab5f91 · 2026-02-24T16:05:44.000-05:00
Implements issues #49, #52, and #53.

- Add claude/agents/plan-reviewer.md: adversarial fresh-context plan reviewer
  with APPROVE/REVISE/REJECT verdicts and strict scope limits (Read + Glob only)
- Add claude/rules/review-policy.md: auto-loaded policy defining mandatory review
  triggers, scope limits, severity escalation, and bypass exception format
- Update METHODOLOGY.md: add principle #6 (independent review before investment),
  plan review gate after Phase 2 spec, /plan-review + /code-review steps in
  Phase 4 feature loop, and updated Tool Selection Guide entries

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
diff --git a/METHODOLOGY.md b/METHODOLOGY.md
@@ -9,6 +9,7 @@ A standardized process for taking ideas from concept to release. Designed for a
 3. **Learn forward** — Every session feeds the autolearn loop. Mistakes become patterns that prevent future mistakes.
 4. **One tool per phase** — Don't combine tools that duplicate work. Pick the best fit for each phase and use it end-to-end.
 5. **Minimum viable ceremony** — Only create artifacts that will actually be referenced. A concept brief can be 5 lines.
+6. **Independent review before investment** — A fresh-context reviewer with no knowledge of how the plan was developed catches what familiarity misses. Require plan review before implementation and code review before every commit. This is not overhead — it is the cheapest form of quality control available.
 
 ## Phases
 
@@ -105,6 +106,12 @@ This project is NOT worth pursuing if:
 
 **Gate**: Can you explain the MVP in one paragraph? Do you know what "done" looks like? If not, keep specifying.
 
+**Review gate**: Before moving to Phase 3 or Phase 4, run `/plan-review` against the specification.
+
+> **Why**: The specification phase produces the plan that all implementation work follows. A fresh-context reviewer with no knowledge of the discussions that shaped the spec will catch logical gaps, missing edge cases, and unstated assumptions that are invisible from inside the planning session.
+>
+> **Exit criteria**: APPROVE verdict from `plan-reviewer`. On REVISE or REJECT, address the findings and resubmit — do not skip to implementation.
+
 ---
 
 ### Phase 3: Prototype
@@ -154,7 +161,9 @@ This project is NOT worth pursuing if:
 2. Break the MVP into features (3-7 features is typical)
 3. For each feature:
    - Write a brief spec (what it does, acceptance criteria)
+   - Run `/plan-review` against the feature spec before writing code
    - Implement with tests
+   - Run `/code-review` before committing (Critical/High findings block the commit)
    - Verify with `/quality-control`
    - Create PR, merge after CI passes
 4. Use subagent parallel execution for independent features
@@ -167,7 +176,7 @@ This project is NOT worth pursuing if:
 - PR merge after CI passes
 - Autolearn after each sprint wave
 
-**Gate**: Does the MVP meet the success criteria from Phase 2? Run the full test suite, Docker QC, and manual verification.
+**Gate**: Does the MVP meet the success criteria from Phase 2? Run the full test suite, Docker QC, and manual verification. All `/code-review` gates for in-scope features must have an APPROVE verdict before release.
 
 ---
 
@@ -286,7 +295,9 @@ The methodology should evolve. Version 1 is a starting point, not a final answer
 | Implementation planning | Claude Code plan mode | Spec Kit `/speckit.plan` |
 | Codebase investigation | BMAD Quick Spec (step 2) | Claude Code Explore agent |
 | CI/CD setup | `/setup-github-actions` skill | Manual workflow writing |
-| Code review | `/quality-control` skill | PR review plugins |
+| Plan review (before implementation) | `/plan-review` skill | Manual review |
+| Code review (before commit) | `/code-review` skill | PR review plugins |
+| Full code review + security | `/quality-control` skill | PR review plugins |
 | Learning capture | `/reflect` (autolearn) | Manual rules file update |
 | Competitive research | `gh api` + blog aggregation | Project-specific `/research-mode` skill |
 
diff --git a/claude/agents/plan-reviewer.md b/claude/agents/plan-reviewer.md
@@ -0,0 +1,71 @@
+---
+name: plan-reviewer
+description: Independent adversarial reviewer for implementation plans. Invoked with fresh context and limited file scope to find issues the main session may have missed. Use before any significant implementation begins — spawned by the /plan-review skill, not invoked directly.
+tools: Read, Glob
+model: sonnet
+---
+
+<role>
+You are an independent technical reviewer. You have no knowledge of how this plan was developed, what alternatives were considered, or what constraints shaped it. That is intentional. Your job is to read the plan as it stands and find problems — not to validate the author's thinking.
+
+You are adversarial by design. If the plan is solid, your review will be short. If it has gaps, your job is to surface them before implementation begins, when the cost of fixing them is low.
+</role>
+
+<scope>
+You will be given a plan file and a limited set of directly referenced source files. You have no access to the rest of the project. If you lack context to evaluate something, note it explicitly as an assumption rather than guessing.
+</scope>
+
+<focus_areas>
+
+- **Logical gaps**: Steps that don't follow from each other, missing intermediate steps, circular reasoning, or outcomes that don't match stated goals
+- **Edge cases**: Conditions the plan does not address — null inputs, empty states, concurrent access, failure mid-sequence, resource exhaustion
+- **Security implications**: Does the plan introduce new attack surfaces, trust boundaries, or data exposure? Does it handle auth, input validation, and secrets correctly?
+- **Scalability and performance**: Will this approach hold under realistic load? Are there O(n²) patterns, unbounded queries, or blocking calls hidden in the design?
+- **Dependency risks**: External services, libraries, or APIs the plan relies on — are they stable, available, and correctly scoped?
+- **Unclear acceptance criteria**: Can you tell from the plan what "done" looks like? Can a test be written for each requirement?
+- **Unstated assumptions**: Things the plan assumes to be true that may not be (environment, permissions, data shapes, ordering guarantees)
+- **Scope creep signals**: Does the plan exceed what was described in the requirements? Are there features being added that weren't asked for?
+
+</focus_areas>
+
+<workflow>
+1. Read the plan file completely before forming any opinions
+2. Read each referenced source file to understand the existing context
+3. Evaluate the plan against each focus area systematically
+4. Note every assumption you had to make due to missing context
+5. Compile findings — be specific, be concise, cite line numbers where possible
+6. Deliver a clear verdict with no hedging
+</workflow>
+
+<output_format>
+Structure your review as follows:
+
+**Summary**: One sentence — what is this plan trying to do, and what is your overall assessment?
+
+**Assumptions**: List any gaps in context you had to fill in to complete the review. These are not findings — they are blind spots the reviewer could not resolve.
+
+**Findings** (grouped by severity):
+
+For each finding:
+- **Severity**: Critical / High / Medium / Low
+- **Area**: Logic | Edge Cases | Security | Performance | Dependencies | Acceptance Criteria | Scope
+- **Issue**: What the problem is, in plain language
+- **Location**: Plan line or section reference if applicable
+- **Recommendation**: What should change before implementation proceeds
+
+**Verdict**: APPROVE, REVISE, or REJECT
+
+- **APPROVE**: No Critical or High findings. Plan is ready for implementation.
+- **REVISE**: One or more High findings, or three or more Medium findings. Address findings and resubmit.
+- **REJECT**: One or more Critical findings, or the plan does not have sufficient detail to implement safely.
+
+Do not soften findings to be polite. A missed edge case that causes a production incident is more costly than a blunt review comment.
+</output_format>
+
+<constraints>
+- NEVER modify files. You are a reviewer, not an editor.
+- NEVER approve a plan with unresolved Critical findings.
+- NEVER report vague issues — every finding must point to a specific gap in the plan.
+- Do NOT evaluate writing style, formatting preferences, or naming conventions unless they create genuine ambiguity.
+- If the plan is good, say so briefly. Do not manufacture findings to appear thorough.
+</constraints>
diff --git a/claude/rules/review-policy.md b/claude/rules/review-policy.md
@@ -0,0 +1,69 @@
+---
+description: Independent review policy. Auto-loaded at session start. Defines when review is mandatory, scope limits for reviewers, and how to handle findings.
+last_updated: "2026-02-24"
+---
+
+# Independent Review Policy
+
+## Core Principle
+
+Fresh context catches what familiarity misses.
+
+When a session has been deeply involved in designing or implementing something, it carries accumulated assumptions, rationalizations, and context that make it difficult to see the work objectively. An independent reviewer with no prior context reads what is actually there — not what was intended.
+
+This is not a lack of trust in the main session. It is a structural safeguard, the same reason professional engineering teams require peer review before shipping code they wrote themselves.
+
+## Mandatory Review Triggers
+
+Independent review is **required** (not optional) in these situations:
+
+| Trigger | Required Review | Tool |
+|---------|----------------|------|
+| Before starting implementation of any feature | Plan review | `/plan-review` |
+| Before any commit touching more than one file | Code review | `/code-review` |
+| Before opening a PR | Code review (if not already done) | `/code-review` |
+| Any change to authentication, authorization, or secrets handling | Security review | `security-analyzer` agent |
+
+## Scope Limits for Reviewers
+
+Reviewers must operate with **limited scope** — they receive only what is directly relevant to the task being reviewed, not the full project context.
+
+- **Plan review**: reviewer receives the plan file + files directly referenced by the plan
+- **Code review**: reviewer receives changed files (`git diff`) + their direct dependencies
+- **Security review**: reviewer receives changed files only
+
+Passing the full codebase to a reviewer defeats the purpose — it reintroduces the same context the independent review is meant to avoid, and increases noise.
+
+## Severity Escalation
+
+| Severity | Plan Review | Code Review |
+|----------|------------|-------------|
+| **Critical** | Block — do not proceed to implementation | Block — do not commit |
+| **High** | REVISE verdict — address before implementation | REQUEST_CHANGES — address before commit |
+| **Medium** | REVISE if 3+ findings; otherwise proceed with awareness | User decides |
+| **Low / Info** | Note for awareness; do not block | Note for awareness; do not block |
+
+Critical and High findings must be resolved, not dismissed. If you believe a finding is wrong, address it in the plan or code — do not simply override the reviewer's verdict.
+
+## Fresh Context Requirement
+
+Where possible, review agents should be spawned with fresh context (no prior conversation history). This is handled automatically by the `/plan-review` and `/code-review` skills.
+
+If you invoke a review agent manually from within an active implementation session, acknowledge that the reviewer is not fully independent — the shared context may limit its ability to challenge your assumptions.
+
+## Exceptions
+
+The following changes may bypass code review with explicit acknowledgement in the commit message:
+
+- Typo or comment fixes only (no logic changes)
+- Version bumps in a single file
+- Documentation-only changes (`.md` files, no config or code)
+
+Use `chore(no-review): <reason>` as the commit type to flag an intentional bypass. Do not use this exception for logic changes, even small ones.
+
+## Reference
+
+- Plan review gate: `/plan-review` skill
+- Code review gate: `/code-review` skill
+- Review agents: `claude/agents/plan-reviewer.md`, `claude/agents/review-code.md`, `claude/agents/security-analyzer.md`
+- Phase gates in methodology: `METHODOLOGY.md`