custom-stack examples PR 1: compliance-release manifest + static contract #203

Merged
garagon merged 3 commits into main from cse-1-stack-manifest-contract
Apr 26, 2026

garagon commented Apr 26, 2026

Summary

First PR of the Custom Stack Examples v1 round. The framework round (PRs #196-#202) proved a single skill works end-to-end. This round proves a stack — multiple custom skills composed into a domain workflow that gates /ship.

PR 1 lands the structural surface for the first stack (compliance-release: license-audit + privacy-check + release-readiness composer) plus the static contract that locks every stack to the same shape. The skill helpers in this PR are placeholders that emit valid JSON; PR 2 replaces them with real behavior, and PR 3 adds the runtime end-to-end harness. PR 4 updates the public README + EXTENDING.md once the harness proves the framework story.

Layout

examples/custom-stack-template/
  README.md                     # overview, links per-stack readmes
  compliance-release/
    README.md                   # 6 required H2 sections, 4 required tokens
    stack.json                  # kind=custom_stack_example schema
    skills/
      license-audit/
        SKILL.md
        agents/openai.yaml
        bin/audit.sh            # placeholder; PR 2 wires real
        bin/smoke.sh
      privacy-check/...
      release-readiness/...

Static contract — ci/check-custom-stack-examples.sh

Validates every stack folder it finds. Current run on compliance-release/: 49 checks pass.

  • Manifest schema: kind == "custom_stack_example", schema_version == "1", manifest name + every skills[].name match the phase regex, no duplicate skill names.
  • Skill folder shape: SKILL.md + agents/openai.yaml + bin/smoke.sh + at least one work-helper besides smoke.sh. Directory basename matches the manifest's skills[].name. SKILL.md frontmatter name: and concurrency: match the manifest. agents/openai.yaml has display_name + short_description + default_prompt. bash -n passes on every bin/*.sh.
  • phase_graph integrity:
    • every node is in core ∪ build ∪ declared skills (no orphans);
    • every depends_on[] target references a name that appears in the graph (no dangling deps);
    • phase_graph[].name are unique (no duplicates);
    • every declared skill appears in the graph (no shipped-but-unscheduled skill);
    • the graph is acyclic — Kahn's algorithm reduction (no cycles, including self-loops).
  • Composer-precedes-ship rule: when a stack ships a release-readiness skill, ship must depend on it (not directly on the review/security/qa trio).
  • README structure: the six required H2 sections (Who this stack is for, What it adds, Install in a sandbox, Run the workflow, Expected evidence, Reset) and four required tokens (bin/create-skill.sh, bin/check-custom-skill.sh, conductor/bin/sprint.sh, release-readiness).
  • No committed runtime artifacts: .nanostack/, node_modules/, .env*, credential JSON basenames already blocked by the bash guard (G-035), logs.
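
The phase_graph integrity rules above can be sketched in a few lines. This is a minimal Python illustration, not the checker itself (the real checks live in ci/check-custom-stack-examples.sh and reuse a jq filter). The function name `check_phase_graph` and the `KNOWN_PHASES` set are hypothetical, since the source does not list the exact core and build phase names the checker accepts.

```python
from collections import deque

# Hypothetical stand-in for "core ∪ build"; the real checker's list is not
# shown in this PR, so treat these names as an assumption.
KNOWN_PHASES = {"think", "plan", "build", "review", "qa", "security", "ship"}


def check_phase_graph(graph, skills):
    """Return a list of contract violations for a phase_graph.

    graph  -- list of {"name": str, "depends_on": [str, ...]} nodes
    skills -- list of declared custom skill names
    """
    errors = []
    names = [node["name"] for node in graph]
    name_set = set(names)

    # No orphans: every node is a known phase or a declared skill.
    for name in sorted(name_set):
        if name not in KNOWN_PHASES and name not in skills:
            errors.append(f"orphan node: {name}")

    # No duplicates: phase_graph[].name must be unique.
    for name in sorted(name_set):
        if names.count(name) > 1:
            errors.append(f"duplicate node: {name}")

    # No dangling deps: every depends_on[] target appears in the graph.
    for node in graph:
        for dep in node.get("depends_on", []):
            if dep not in name_set:
                errors.append(f"dangling dep: {node['name']} -> {dep}")

    # No shipped-but-unscheduled skill.
    for skill in skills:
        if skill not in name_set:
            errors.append(f"skill missing from graph: {skill}")

    # Acyclic: Kahn's algorithm. Repeatedly remove nodes whose deps are all
    # removed; anything left over sits on a cycle (self-loops included).
    indegree = {name: 0 for name in name_set}
    dependents = {name: [] for name in name_set}
    for node in graph:
        for dep in set(node.get("depends_on", [])):
            if dep in name_set:
                indegree[node["name"]] += 1
                dependents[dep].append(node["name"])
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    removed = 0
    while ready:
        current = ready.popleft()
        removed += 1
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if removed != len(name_set):
        errors.append("cycle detected")
    return errors
```

Feeding the function a sabotaged graph (a bogus dependency, a skill dropped from the graph, or a node that depends on itself) should return a non-empty list, which mirrors the sabotage tests listed in the test plan.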

The lint job custom-stack-examples-contract runs the checker on every PR.
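
As an illustration of the README structure rule, here is a minimal Python sketch. It assumes the six sections must appear as exact `## ` headings and the four tokens as plain substrings. The function name `check_readme` is hypothetical, and the real checker is a shell script whose matching rules may differ.

```python
import re

REQUIRED_H2 = [
    "Who this stack is for",
    "What it adds",
    "Install in a sandbox",
    "Run the workflow",
    "Expected evidence",
    "Reset",
]
REQUIRED_TOKENS = [
    "bin/create-skill.sh",
    "bin/check-custom-skill.sh",
    "conductor/bin/sprint.sh",
    "release-readiness",
]


def check_readme(text):
    """Return a list of missing H2 sections and missing tokens."""
    # Collect the text of every level-2 heading; "### ..." does not match
    # because the pattern requires a space right after the two hashes.
    headings = {
        m.group(1).strip()
        for m in re.finditer(r"^## +(.+?)\s*$", text, re.MULTILINE)
    }
    problems = [f"missing section: {h}" for h in REQUIRED_H2 if h not in headings]
    problems += [f"missing token: {t}" for t in REQUIRED_TOKENS if t not in text]
    return problems
```

An empty list means the README satisfies both halves of the rule; each missing heading or token yields one entry.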

Stack README install hygiene

Every install / wire / validate / run / reset block sources $NANOSTACK_ROOT/bin/lib/store-path.sh and uses $NANOSTACK_STORE, following the same priority order as bin/create-skill.sh (explicit env var > git repo root > $HOME/.nanostack/). Whether or not NANOSTACK_STORE is exported, and whatever the cwd (inside or outside git), the install never splits across two stores.
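
The priority order can be sketched as follows. This is a Python illustration of the documented order only; bin/lib/store-path.sh is a shell script, and the names `resolve_store` and `git_toplevel` are hypothetical. It also assumes an explicit NANOSTACK_STORE is used as-is, while the git and $HOME fallbacks get `.nanostack` appended.

```python
import os
import subprocess


def git_toplevel(cwd=None):
    """Return the enclosing git repo root, or None when not inside a repo."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None


def resolve_store(env, find_git_root=git_toplevel):
    """Resolve the store path: explicit env var > git repo root > $HOME."""
    explicit = env.get("NANOSTACK_STORE")
    if explicit:
        return explicit  # assumption: taken verbatim, no suffix added
    root = find_git_root()
    if root:
        return os.path.join(root, ".nanostack")
    return os.path.join(env["HOME"], ".nanostack")
```

Passing the git lookup in as a parameter keeps the three branches easy to exercise without a real repository, for example `resolve_store({"HOME": "/home/u"}, lambda: None)`.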

The --from paths in bin/create-skill.sh invocations are anchored at $NANOSTACK_ROOT/examples/custom-stack-template/compliance-release/skills/<name> so the install runs from any sandbox project, not only from the Nanostack repo root.

Test plan

  • tests/run.sh: 83/83
  • ci/e2e-user-flows.sh: 100/100
  • ci/e2e-custom-stack-flows.sh: 30/30
  • ci/check-custom-stack-examples.sh: 49/49 (new)
  • ci/e2e-think-flows.sh: 32/32
  • ci/e2e-think-archetypes.sh: 25/25
  • ci/e2e-onboarding-flows.sh: 34/34
  • ci/e2e-delivery-matrix.sh: 17/17
  • ci/check-examples.sh: 32/32
  • YAML parses
  • Each skill's placeholder smoke passes
  • Em-dash sweep on every public-copy file (top-level *.md and examples/**/README.md)
  • Sabotage tests for the new graph contract: dangling depends_on, duplicate phase_graph names, skill omitted from graph, 2-node cycle, self-loop, 3-node cycle — all caught

Public claim discipline

The compliance-release README.md explicitly documents that the install commands run once PR 3 ships, so the public claim never runs ahead of the harness. The framework spec's "do not reposition the README hero before runtime E2E lands" rule is honored.

Out of scope (PRs 2-4 of the round)

  • PR 2: real skill behavior (license-audit/bin/audit.sh classifies dependency licenses; privacy-check/bin/check.sh scans for collection signals + missing privacy notes; release-readiness/bin/summarize.sh composes the upstream artifacts into a status).
  • PR 3: ci/e2e-custom-stack-examples.sh — 15-cell runtime harness with ≥35 assertions, including subdir + no-git scaffold paths.
  • PR 4: README + README.es + EXTENDING.md "build your own workflow stack" wording, only after the runtime harness proves it.

garagon added 3 commits April 26, 2026 16:00
…ract

First PR of the Custom Stack Examples v1 round. The framework round
(PRs #196-#202) proved a single skill works end-to-end. This round
proves a stack — multiple custom skills composed into a domain
workflow that gates /ship.

PR 1 lands the structural surface for the first stack
(compliance-release: license-audit + privacy-check +
release-readiness composer) plus the static contract that locks
every stack to the same shape:

  examples/custom-stack-template/
    README.md                     # overview, links per-stack readmes
    compliance-release/
      README.md                   # 6 H2 sections, 4 required tokens
      stack.json                  # kind=custom_stack_example schema
      skills/
        license-audit/
          SKILL.md
          agents/openai.yaml
          bin/audit.sh            # placeholder (PR 2 wires real)
          bin/smoke.sh
        privacy-check/...
        release-readiness/...

PR 2 of this round wires the real skill behavior. PR 3 adds the
runtime end-to-end harness. PR 4 updates the public README + EXTENDING
docs once the harness proves the framework story.

ci/check-custom-stack-examples.sh validates every stack folder it
finds:
  - manifest schema (kind=custom_stack_example, schema_version=1,
    name + skill names match phase regex, no duplicate skill names)
  - skill folder shape (SKILL.md + agents/openai.yaml + bin/smoke.sh
    + at least one work-helper besides smoke; basename matches
    manifest name; frontmatter name + concurrency match manifest;
    openai.yaml has display_name + short_description + default_prompt;
    bash -n on every bin/*.sh)
  - phase_graph membership (every node is core ∪ build ∪ declared
    skill, no orphan names)
  - ship depends on release-readiness when the stack ships a
    release-readiness skill (composer-must-precede-ship rule)
  - README has the 6 required H2 sections and 4 required tokens
    (bin/create-skill.sh, bin/check-custom-skill.sh,
    conductor/bin/sprint.sh, release-readiness)
  - no committed runtime artifacts (.nanostack/, node_modules/,
    .env*, credential JSON basenames blocked by the guard, logs)

Lint job custom-stack-examples-contract runs the checker on every
PR. Current contract: 45 checks pass on the compliance-release
stack.

The skill helpers in this PR are placeholders that emit a JSON
object satisfying the smoke check. PR 2 replaces the bin/audit.sh,
bin/check.sh, and bin/summarize.sh bodies with real classification
+ scanning + composition logic.

The compliance-release/README.md explicitly documents that the
install commands run once PR 3 ships, so the public claim never
runs ahead of the harness.
…ed install

Codex's PR 1 review caught three real issues.

P1: CI red on em-dashes in public copy. The em-dash lint scans
top-level *.md and examples/**/README.md; both new READMEs had
em-dashes left over from Markdown habit. Replaced with commas,
colons, semicolons, and parentheses. The reference spec doc keeps
its em-dashes because reference/ is not in the lint scope.
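
The lint scope described above (top-level *.md plus examples/**/README.md, with reference/ excluded) can be sketched in Python. The function name `find_em_dashes` is hypothetical, and the repo's actual lint is not shown in this PR.

```python
from pathlib import Path

EM_DASH = "\u2014"


def find_em_dashes(root):
    """Return (relative_path, line_number) for each in-scope line with an em dash."""
    root = Path(root)
    # Scope: top-level Markdown files plus every README.md under examples/.
    in_scope = sorted(set(root.glob("*.md")) | set(root.glob("examples/**/README.md")))
    hits = []
    for path in in_scope:
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if EM_DASH in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

A file under reference/ never appears in the result because neither glob pattern reaches it.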

P2.1: phase_graph contract was structural-only. The checker
validated phase_graph[].name membership but not depends_on[]
targets, duplicate names, or skill-graph membership. A stack could
ship with a misspelled depends_on or with a declared skill missing
from the graph and the failure would only surface at PR 3 runtime.
Added three checks (10a, 10b, 10c):
  - Every depends_on[] target must reference a name that appears in
    phase_graph[]. Sabotaged with a bogus dep -> caught.
  - phase_graph[].name must be unique. Sabotaged with a duplicate
    'think' -> caught.
  - Every skills[].name must appear in phase_graph[]. Sabotaged by
    removing license-audit from the graph -> caught (and the new
    dangling-deps check fires too because release-readiness still
    declared the dep).

P2.2: install commands were not self-contained. The README told
users to run `bin/create-skill.sh --from examples/custom-stack-
template/...` but those paths only resolve from the Nanostack
checkout, not from the user's sandbox. Rewrote every install +
validate + run + reset block to use NANOSTACK_ROOT (with a
$HOME/.claude/skills/nanostack default) and a STORE expression that
mirrors lib/store-path.sh (`$(git rev-parse --show-toplevel ||
$HOME)/.nanostack`). The path is now explicit and runs from any
cwd, with or without git, without requiring the user to be inside
the Nanostack repo.

Contract grows from 45 to 48 checks. All other suites unchanged
(83 unit, 100 user-flows e2e, 30 custom-stack e2e, 25 think
archetypes, 32 think, 34 onboarding, 17 delivery, 32 examples).

Codex's PR 1 re-review caught two more issues; both real.

P2.1: README's store resolution diverged from create-skill.sh.
  bin/lib/store-path.sh gives an explicit NANOSTACK_STORE env var
  the highest priority, then git repo root, then $HOME/.nanostack.
  The README used `$(git rev-parse --show-toplevel || $HOME)`,
  which silently skipped the env-var case. A user or harness with
  NANOSTACK_STORE exported would have create-skill.sh write to the
  exported store while the README's wire-the-graph and validate
  blocks operated on a different store. Fix: every snippet now
  sources `$NANOSTACK_ROOT/bin/lib/store-path.sh` and uses
  $NANOSTACK_STORE, the same path the scaffolder wrote to.

P2.2: Static contract accepted cyclic phase_graphs. The 10a/10b/10c
  checks closed dangling deps, duplicates, and omitted skills, but a
  cycle (e.g. license-audit -> privacy-check -> license-audit) still
  passed. Conductor's runtime validator would reject the cycle, so a
  stack could ship past CI here yet fail at PR 3 runtime E2E.
  Added check 10d: reuses the same Kahn's-algorithm jq filter from
  bin/lib/phases.sh _nano_phase_graph_is_valid. Sabotage tests
  confirmed catches for 2-node cycle, self-loop, and 3-node cycle.

Contract grows from 48 to 49 checks. All other suites unchanged.
garagon merged commit c86fd67 into main Apr 26, 2026
52 checks passed
garagon deleted the cse-1-stack-manifest-contract branch April 26, 2026 19:33
garagon added a commit that referenced this pull request Apr 26, 2026
#205)

Third PR of the Custom Stack Examples v1 round. PR 1 #203 landed the
manifest + 49-check static contract; PR 2 #204 wired the real skill
behavior. This PR adds ci/e2e-custom-stack-examples.sh, the runtime
contract that proves the compliance-release stack composes
end-to-end on a real /tmp project.

15 cells, 51 assertions, no network:

  [1]  fixture project (Node app, README, .env.example, src with
       email + name fields, MIT lodash under node_modules)
  [2]  install all three skills via bin/create-skill.sh --from with
       the documented --concurrency + --depends-on flags; assert
       skills land in $NANOSTACK_STORE/skills and config registers
       all three phases
  [3]  bin/check-custom-skill.sh validates each scaffolded skill
  [4]  save fake review/qa/security artifacts via save-artifact.sh
       (real .integrity hashes)
  [5]  run license-audit/bin/audit.sh + save artifact; assert MIT
       lodash classifies as permissive
  [6]  run privacy-check/bin/check.sh + save artifact; assert email
       AND name signals are detected, missing privacy_note flagged
  [7]  bin/resolve.sh release-readiness; assert phase_kind=custom
       and all five declared upstream keys resolve to a path
  [8]  run release-readiness/bin/summarize.sh; assert rollup is WARN
       (privacy-check is WARN, others OK), all five checks present
  [9]  bin/sprint-journal.sh emits ## /<phase> sections for all
       three custom phases
  [10] bin/analytics.sh --json counts the three under sprints.custom
       and reports custom_total >= 3
  [11] bin/discard-sprint.sh --dry-run lists all three custom files
  [12] conductor sprint.sh start with the stack.json phase_graph;
       assert sprint has 10 nodes (think+plan+build+review+qa+
       security+3 custom+ship)
  [13] conductor sprint.sh batch; assert build precedes
       license-audit and privacy-check, both schedule as type=read,
       release-readiness follows after both, ship follows after
       release-readiness
  [14] scaffold from a git subdirectory; assert skills land in repo
       root .nanostack/skills (not subdir/.nanostack)
  [15] scaffold without git (fake HOME); assert skills land in
       $HOME/.nanostack/skills (not cwd)
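
The ordering asserted in cell 13 can be pictured as grouping the graph into waves, where a node runs only after all of its dependencies. The Python sketch below is an illustration, not how conductor's sprint.sh batch is implemented. The edges in `EXAMPLE_GRAPH` are assumptions apart from those stated in cell 13, and the real stack.json may wire think, plan, review, qa, and security differently.

```python
# Hypothetical 10-node graph; only the build -> license-audit/privacy-check
# -> release-readiness -> ship ordering comes from the harness description.
EXAMPLE_GRAPH = [
    {"name": "think", "depends_on": []},
    {"name": "plan", "depends_on": ["think"]},
    {"name": "build", "depends_on": ["plan"]},
    {"name": "review", "depends_on": ["build"]},
    {"name": "qa", "depends_on": ["build"]},
    {"name": "security", "depends_on": ["build"]},
    {"name": "license-audit", "depends_on": ["build"]},
    {"name": "privacy-check", "depends_on": ["build"]},
    {"name": "release-readiness",
     "depends_on": ["review", "qa", "security", "license-audit", "privacy-check"]},
    {"name": "ship", "depends_on": ["release-readiness"]},
]


def batches(graph):
    """Group nodes into waves; a node joins a wave once all its deps ran earlier."""
    deps = {node["name"]: set(node.get("depends_on", [])) for node in graph}
    done, waves = set(), []
    while len(done) < len(deps):
        wave = sorted(n for n, d in deps.items() if n not in done and d <= done)
        if not wave:
            raise ValueError("cycle or dangling dependency")
        waves.append(wave)
        done.update(wave)
    return waves
```

Under these assumed edges, license-audit and privacy-check land in the wave right after build, release-readiness gets a wave of its own after both, and ship comes last.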

The new e2e-custom-stack-examples GitHub Actions job runs the
harness on workflow_dispatch, alongside the existing
e2e-custom-stack and the rest of the e2e jobs.

Bug fix surfaced by the harness: privacy-check's source-tree
scanner used `head -1` on per-line token extraction, so a single
line that collected both `email` and `name` (the e2e fixture's
src/signup.js) only reported the first match. The smoke case 2
asserted `any` rather than `all`, so the bug was invisible until
the runtime harness asserted both signals explicitly.

Fix: scan_personal_data and scan_telemetry now emit one signal per
unique matching token in each line. The PR 2 smoke cases for
name-only and ga continue to pass; the runtime e2e assertion that
both email and name surface from the same line now passes too.
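
The difference between the buggy and fixed scanners can be shown with a small Python sketch. The real scan_personal_data is a shell function; the regex, the function names, and the sample line below are hypothetical stand-ins.

```python
import re

# Hypothetical token list; the source names only email and name as
# personal-data signals.
PERSONAL_TOKENS = re.compile(r"\b(email|name)\b")


def first_match_per_line(lines):
    """Buggy shape: like `head -1`, keeps only the first token on each line."""
    signals = []
    for lineno, line in enumerate(lines, start=1):
        match = PERSONAL_TOKENS.search(line)
        if match:
            signals.append((lineno, match.group(1)))
    return signals


def unique_matches_per_line(lines):
    """Fixed shape: one signal per unique token on each line, first-seen order."""
    signals = []
    for lineno, line in enumerate(lines, start=1):
        seen = []
        for token in PERSONAL_TOKENS.findall(line):
            if token not in seen:
                seen.append(token)
        signals.extend((lineno, token) for token in seen)
    return signals
```

On a line such as `const { email, name } = req.body;` the first function reports only email, while the second reports both email and name. An `any` assertion passes for either function, which is why only an `all` assertion exposes the bug.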
garagon added a commit that referenced this pull request Apr 26, 2026
* custom-stack-examples PR 4: public copy positions stack story

Final PR of the Custom Stack Examples v1 round. PR 1 #203 landed
the manifest + static contract. PR 2 #204 wired the real skill
behavior. PR 3 #205 added the 51-assertion runtime harness. With
the harness green, the framework spec's "do not reposition the
README hero before runtime E2E lands" rule unblocks. This PR
updates the public copy to position the stack story without
overclaim.

README.md "Build on nanostack" splits into Single skill + Workflow
stack subsections. The single-skill block stays as-is (proven by
ci/e2e-custom-stack-flows.sh). The new workflow-stack block points
at examples/custom-stack-template/compliance-release/ as the
reference shape, names the three skills (license-audit +
privacy-check + release-readiness composer), claims the
compliance-release example proves save / resolve / journal /
analytics / discard / conductor compose, and names the harness
that proves it (ci/e2e-custom-stack-examples.sh, 15 cells, 51
assertions).

README.es.md gets the same shape under "Construí tu propio
workflow stack" so Spanish parity holds. Same tokens, same claims,
no English-only claims.

EXTENDING.md "Quickest way to start" expands to a two-row table:
single-skill via examples/custom-skill-template/audit-licenses, vs.
workflow stack via examples/custom-stack-template/compliance-release.
Each row has its own quickstart block. Both link to their proving
harness.

The compliance-release/README.md status banner switches from "PR
1, install commands run once PR 3 ships" to "end-to-end working,
49 contract + 51 runtime assertions". The custom-stack-template
overview README replaces its PR-by-PR breakdown with a "what's
covered" summary that names every harness behind the claim.

Lint adds custom-stack-examples-public-copy:
  - README.md + README.es.md each mention compliance-release,
    phase_graph, workflow stack, ci/e2e-custom-stack-examples.sh,
    and the example path.
  - EXTENDING.md links both starting points (single-skill template
    and the stack template), the spec doc, and the runtime harness.
  - Public copy does NOT contain disallowed phrases:
    marketplace, plugin ecosystem, GDPR ready, SOC2 ready,
    compliance certified, "works in every agent identically".
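
The README half of this lint can be sketched in Python as below. The function name `lint_public_copy` is hypothetical, "the example path" is assumed to be examples/custom-stack-template/compliance-release, and case-insensitive matching of disallowed phrases is also an assumption, since the source does not say how the real lint compares them.

```python
REQUIRED_README_TOKENS = [
    "compliance-release",
    "phase_graph",
    "workflow stack",
    "ci/e2e-custom-stack-examples.sh",
    "examples/custom-stack-template/compliance-release",  # assumed example path
]
DISALLOWED_PHRASES = [
    "marketplace",
    "plugin ecosystem",
    "GDPR ready",
    "SOC2 ready",
    "compliance certified",
    "works in every agent identically",
]


def lint_public_copy(text):
    """Return missing required tokens and any disallowed phrases found."""
    problems = [f"missing token: {t}" for t in REQUIRED_README_TOKENS if t not in text]
    lowered = text.lower()
    problems += [
        f"disallowed phrase: {p}"
        for p in DISALLOWED_PHRASES
        if p.lower() in lowered  # assumption: matching ignores case
    ]
    return problems
```

Running it over README.md and README.es.md separately would enforce the "each mention" wording, since both files must pass on their own.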

After this lands, Custom Stack Examples v1 is complete and the
round closes.

* docs: drop em dash + scope claim to opt-in E2E workflow

Codex's PR 4 review caught two real items.

P1 CI red: examples/custom-stack-template/README.md line 47 used
an em dash before "all proven by ci/e2e-custom-stack-examples.sh".
The "No em-dashes in public copy" lint scans top-level *.md plus
every examples/**/README.md, and this line tripped it. Replaced
with a period.

P2 overclaim: three sites said the runtime harness runs "on every
workflow run" or implied auto-on-PR coverage. The harness lives in
.github/workflows/e2e.yml as a workflow_dispatch job, not part of
the on-PR lint matrix, so the claim does not match the actual
config. Softened to "in the opt-in E2E workflow":

- README.md "Build on nanostack" workflow-stack subsection.
- README.es.md "Construí tu propio workflow stack" subsection (same
  shape, same scope language).
- examples/custom-stack-template/compliance-release/README.md
  status banner: now distinguishes the static contract (runs on
  every PR) from the runtime harness (runs in the opt-in E2E
  workflow).
- examples/custom-stack-template/README.md "what's covered" list.

The custom-stack-examples-public-copy lock added in PR 4 still
passes (it requires the harness path token to appear, not a
specific frequency claim). Em-dash sweep clean across every file
in scope.