Add Mermaid syntax validation for generated diagrams (linter / pre-commit hook) #435

@tractorjuice

Problem

ArcKit commands that generate Mermaid diagrams (notably /arckit:data-model, /arckit:diagram, /arckit:dfd, and any artefact embedding a Mermaid block) can emit syntactically invalid Mermaid that the renderer silently fails on — the artefact is written, the docs site indexes it, and the broken diagram is only noticed when a human opens the file.

There is no automated validation in the generation pipeline, the secret-file-scanner hook, or the docs/pages build that catches Mermaid parser errors.

Concrete example (encountered today)

/arckit:data-model for a SaaS project produced an erDiagram containing:

CELL_ASSIGNMENT {
    uuid assignment_id PK
    uuid tenant_id FK UK "One assignment per tenant"
    ...
}
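Mermaid's ER grammar does not accept two space-separated key constraints (PK / FK / UK) on one attribute; as far as I know, newer Mermaid versions take a comma-separated list (FK, UK) and older ones only a single key. The FK UK pair causes the block to fail to render. The artefact was still committed and pushed; the error was only noticed when the user previewed the markdown.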

Other classes of mistake the model is prone to in ER diagrams:

  • Two key tokens on one attribute (as above)
  • Reserved-ish words used as attribute names (e.g. command, key, type)
  • Unbalanced quotes inside attribute comments
  • Cardinality typos (e.g. writing ||--|o where ||--o| is intended)
  • int/integer/number inconsistency (Mermaid is permissive but downstream consumers may not be)
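
As an illustration of the first class above, a cheap text-level check could flag double-key attributes before any renderer is involved. This is only a heuristic sketch, not Mermaid's parser: the function name `find_double_key_attributes` is hypothetical, it assumes one attribute per line inside `ENTITY { ... }` blocks, and it treats a comma-separated key list as acceptable (an assumption about newer Mermaid versions; tighten it if the target renderer accepts only a single key).

```python
import re

KEY_TOKENS = {"PK", "FK", "UK"}


def find_double_key_attributes(er_source):
    """Return (line_number, line) pairs for attributes whose key tokens
    are separated by whitespace only (e.g. `FK UK`)."""
    findings = []
    in_entity = False
    for number, raw in enumerate(er_source.splitlines(), start=1):
        line = raw.strip()
        if line.endswith("{"):
            in_entity = True
            continue
        if line == "}":
            in_entity = False
            continue
        if not in_entity or not line:
            continue
        # Drop the quoted comment so key-like words inside it are ignored.
        without_comment = re.sub(r'"[^"]*"', "", line)
        # Assumed layout: <type> <name> [keys...]
        keys = without_comment.split()[2:]
        for first, second in zip(keys, keys[1:]):
            if first in KEY_TOKENS and second.rstrip(",") in KEY_TOKENS:
                findings.append((number, line))
                break
    return findings


SAMPLE = """erDiagram
    CELL_ASSIGNMENT {
        uuid assignment_id PK
        uuid tenant_id FK UK "One assignment per tenant"
    }
"""

for number, line in find_double_key_attributes(SAMPLE):
    print(f"line {number}: suspicious key list: {line}")
```

A check like this only catches one known pattern; it complements, rather than replaces, running the real Mermaid parser.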

Suggested approaches (any one of these would help)

  1. Post-generation validation in commands: each command that emits Mermaid extracts the fenced block and runs npx -y @mermaid-js/mermaid-cli -i <tmpfile> -o /dev/null --quiet. Non-zero exit → command refuses to write or surfaces the parser error and asks for a fix-up pass.

  2. PreToolUse:Write hook (mirroring the existing secret-file-scanner pattern): a mermaid-block-scanner.mjs hook extracts every ```mermaid block from the file being written and validates each via mmdc. Same blocking-via-stdout-JSON contract as the secret scanner.

  3. CI / pre-commit hook: a script that walks projects/**/*.md, extracts Mermaid blocks, and validates them. Less immediate feedback but catches every artefact regardless of how it was authored.

  4. Template-level guard rails: add explicit cautionary notes in templates/data-model-template.md and the mermaid-syntax skill (e.g. only one of PK/FK/UK per attribute) so the generator is steered away from common mistakes.

(1) or (2) are highest-value; (4) is cheap and complementary.
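
A rough sketch of the extract-and-validate step shared by approaches (1) to (3), in Python. All function names here (`extract_mermaid_blocks`, `validate_block`, `check_markdown`) are hypothetical and not part of ArcKit. It assumes `npx` is on PATH and that mmdc exits non-zero on a parse error. It writes to a temporary .svg rather than /dev/null because, as far as I can tell, mmdc infers the output format from the file extension; that is worth verifying against the mermaid-cli version in use. Only top-level (unindented) fences are matched.

```python
import re
import subprocess
import tempfile
from pathlib import Path
from typing import Optional

FENCE = "`" * 3  # built this way so the sketch can sit inside a fenced block
MERMAID_BLOCK = re.compile(
    r"^" + re.escape(FENCE) + r"mermaid[ \t]*\n(.*?)^" + re.escape(FENCE) + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_mermaid_blocks(markdown):
    """Return the body of every top-level mermaid fenced block."""
    return [match.group(1) for match in MERMAID_BLOCK.finditer(markdown)]


def validate_block(source, timeout=120) -> Optional[str]:
    """Return None if mmdc renders the block, else the captured error text."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "diagram.mmd"
        out = Path(tmp) / "diagram.svg"  # assumption: mmdc needs a known extension
        src.write_text(source, encoding="utf-8")
        result = subprocess.run(
            ["npx", "-y", "@mermaid-js/mermaid-cli",
             "-i", str(src), "-o", str(out), "--quiet"],
            capture_output=True, text=True, timeout=timeout,
        )
    if result.returncode == 0:
        return None
    return result.stderr.strip() or result.stdout.strip() or "mmdc exited non-zero"


def check_markdown(markdown, validate=validate_block):
    """Return (block_index, error) for each Mermaid block that fails."""
    failures = []
    for index, block in enumerate(extract_mermaid_blocks(markdown), start=1):
        error = validate(block)
        if error is not None:
            failures.append((index, error))
    return failures


def check_tree(root="projects"):
    """Walk root for *.md files, report failures, return the failure count."""
    failed = 0
    for path in sorted(Path(root).rglob("*.md")):
        for index, error in check_markdown(path.read_text(encoding="utf-8")):
            print(f"{path}: mermaid block {index}: {error}")
            failed += 1
    return failed
```

A pre-commit or CI wrapper could call `check_tree()` and exit non-zero when it returns a positive count; a command or Write hook could call `check_markdown()` on the content about to be written and surface the returned errors for a fix-up pass.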

Repro

# In any ArcKit project with a REQ document
/arckit:data-model
# Inspect generated ARC-*-DATA-*.md for Mermaid `erDiagram` blocks with `FK UK` or similar double-key attributes

Environment

  • ArcKit plugin v4.14.0
  • Claude Code (Opus 4.7, 1M context)
  • Encountered in tractorjuice/arckit-test-project-v48-arckit-as-a-service on commit a188d67 (the fix commit)

Related

  • Issue #179, "issue in mermaid flow" (closed): different symptom but adjacent area
  • The mermaid-syntax skill exists but is currently only consulted at generation time; there's no validation gate after generation
