Adopt patterns from anthropics/financial-services: reader/orchestrator/writer isolation, schema-gated handoffs, reference linting #442

@tractorjuice

Summary

Reviewed anthropics/financial-services — Anthropic's reference plugins + Managed Agent cookbooks for FSI workflows (equity research, IB, fund admin, KYC). Several patterns are directly relevant to ArcKit, especially around prompt-injection hardening for our research-heavy commands.

This issue tracks the patterns worth adopting, ranked by impact.

High-impact

1. Three-tier reader / orchestrator / writer isolation for untrusted documents

Every cookbook (e.g. earnings-reviewer, kyc-screener) splits each agent into three tiers documented in the README as a security table:

| Tier | Touches untrusted docs? | Tools | Connectors |
|---|---|---|---|
| Reader subagent | Yes | Read, Grep only | None |
| Orchestrator | No | Read, Grep, Glob, Agent | Read-only MCPs |
| Writer (sole Write-holder) | No | Read, Write, Edit | None |

The reader returns schema-validated, length-capped JSON; the orchestrator (and writer) never see the raw untrusted text. This neutralises prompt-injection from documents the reader ingests.

ArcKit relevance: /arckit:research, /arckit:datascout, /arckit:gov-reuse, /arckit:gov-code-search, /arckit:gov-landscape, /arckit:grants, and the *-research MCP agents all currently read external input (vendor packs, policy docs, web fetches, MCP results) in the same context that writes artefacts. A reader → orchestrator → writer split closes that surface.
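The tier rules above can be expressed as a small self-check. This is a minimal sketch, not code from either repository: the `Tier` dataclass, `MUTATING_TOOLS`, and `isolation_violations` are hypothetical names, and the tool sets are copied from the tier table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    touches_untrusted: bool
    tools: frozenset
    connectors: str


# Tool sets copied from the tier table; the structure itself is an assumption.
TIERS = [
    Tier("reader", True, frozenset({"Read", "Grep"}), "none"),
    Tier("orchestrator", False, frozenset({"Read", "Grep", "Glob", "Agent"}), "read-only MCPs"),
    Tier("writer", False, frozenset({"Read", "Write", "Edit"}), "none"),
]

# Tools that can change files on disk (hypothetical grouping).
MUTATING_TOOLS = frozenset({"Write", "Edit"})


def isolation_violations(tiers):
    """Return a list of human-readable violations of the isolation rules."""
    problems = []
    for tier in tiers:
        # A tier that reads untrusted documents must not be able to modify files.
        if tier.touches_untrusted and tier.tools & MUTATING_TOOLS:
            problems.append(f"{tier.name}: reads untrusted docs but can modify files")
        # A tier that reads untrusted documents must not hold any connectors.
        if tier.touches_untrusted and tier.connectors != "none":
            problems.append(f"{tier.name}: reads untrusted docs but has connectors")
    # Exactly one tier may hold the Write tool.
    write_holders = [t.name for t in tiers if "Write" in t.tools]
    if len(write_holders) != 1:
        problems.append(f"expected exactly one Write holder, found {write_holders}")
    return problems


print(isolation_violations(TIERS))
```

A check like this could run in CI against whatever manifest format ArcKit ends up using, so that a later edit cannot quietly hand a reader tier a Write tool.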

2. Schema-gated handoff protocol with allowlist

scripts/orchestrate.py defines ALLOWED_TARGETS (the deployed agent slugs) and a HANDOFF_PAYLOAD_SCHEMA. Agents emit {"type":"handoff_request","target_agent":"...","payload":{...}}; the orchestrator validates the target against the allowlist and the payload against the schema before routing it as the next steering event. The script explicitly warns that an attacker-controlled doc could quote a forged handoff blob and documents the mitigations.

ArcKit relevance: Our handoffs: frontmatter is currently a passive "Suggested Next Steps" hint rendered by the converter. Upgrading to active routing with allowlisting would close the same injection vector for chained workflows (e.g. requirements → adr → hld-review).
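A rough sketch of what allowlisted routing for ArcKit could look like follows. It is not the upstream orchestrate.py: the slugs in `ALLOWED_TARGETS` are taken from the chained-workflow example above, `MAX_PAYLOAD_BYTES` and `validate_handoff` are hypothetical, and a simple type check plus size cap stands in for full validation against a payload schema.

```python
import json

# Hypothetical allowlist, using the slugs from the chained-workflow example.
ALLOWED_TARGETS = {"requirements", "adr", "hld-review"}
# Hypothetical size cap standing in for a real payload schema.
MAX_PAYLOAD_BYTES = 2048
EXPECTED_KEYS = {"type", "target_agent", "payload"}


def validate_handoff(raw: str) -> dict:
    """Parse a handoff request and raise ValueError unless it passes every gate."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    # Reject anything that is not exactly the expected envelope.
    if not isinstance(msg, dict) or set(msg) != EXPECTED_KEYS:
        raise ValueError("unexpected top-level structure")
    if msg["type"] != "handoff_request":
        raise ValueError("type must be 'handoff_request'")
    target = msg["target_agent"]
    # The target must be a known slug; anything else is dropped, not routed.
    if not isinstance(target, str) or target not in ALLOWED_TARGETS:
        raise ValueError(f"target {target!r} is not on the allowlist")
    payload = msg["payload"]
    if not isinstance(payload, dict):
        raise ValueError("payload must be an object")
    if len(json.dumps(payload).encode("utf-8")) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds size cap")
    return msg


request = '{"type":"handoff_request","target_agent":"adr","payload":{"summary":"x"}}'
print(validate_handoff(request)["target_agent"])
```

The design point is that routing is fail-closed: a forged blob quoted inside a document is only acted on if it names an allowlisted target and passes the payload checks.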

3. Output schemas on every reader subagent

Reader subagents declare output_schema: in their YAML: a JSON Schema with additionalProperties: false plus maxItems, maxLength, and regex pattern constraints on every field. A validate.py (jsonschema) runs between reader and orchestrator.

Example from transcript-reader.yaml:

output_schema:
  required: [ticker, period, actuals]
  additionalProperties: false
  properties:
    ticker: { type: string, maxLength: 12, pattern: "^[A-Z.]+$" }
    guidance_notes:
      type: array
      maxItems: 50
      items: { maxLength: 256, pattern: "^[A-Za-z0-9 .,%$()_/:-]+$" }

ArcKit relevance: Our requirement IDs (BR-001, FR-xxx, ECX-NN) and Document Control headers are schema-shaped but not enforced. A schema-gated handoff between /arckit:research (reader of vendor packs) and /arckit:evaluate / /arckit:score (writers) would catch silently-malformed structured data and prevent injected fields.
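To illustrate what such a gate rejects, here is a minimal stdlib-only sketch. It is a hand-rolled stand-in, not the upstream validate.py (which the issue says uses jsonschema), and it covers only the keywords visible in the excerpt. `READER_SCHEMA` and `validate_reader_output` are hypothetical names, and the schema requires only `ticker` because the definitions of `period` and `actuals` are not shown above.

```python
import re

# Reduced version of the excerpted schema (an assumption: only fields whose
# definitions appear in the excerpt are included).
READER_SCHEMA = {
    "required": ["ticker"],
    "additionalProperties": False,
    "properties": {
        "ticker": {"type": "string", "maxLength": 12, "pattern": r"^[A-Z.]+$"},
        "guidance_notes": {
            "type": "array",
            "maxItems": 50,
            "items": {"maxLength": 256, "pattern": r"^[A-Za-z0-9 .,%$()_/:-]+$"},
        },
    },
}


def _check_string(value, rule, path, errors):
    """Apply maxLength and pattern rules to one string value."""
    if not isinstance(value, str):
        errors.append(f"{path}: expected string")
        return
    if len(value) > rule.get("maxLength", float("inf")):
        errors.append(f"{path}: longer than maxLength")
    pattern = rule.get("pattern")
    # fullmatch is stricter than JSON Schema's search semantics; this assumes
    # every pattern is meant to cover the whole value.
    if pattern and not re.fullmatch(pattern, value):
        errors.append(f"{path}: does not match pattern")


def validate_reader_output(doc, schema):
    """Return a list of violations; an empty list means the output passes."""
    if not isinstance(doc, dict):
        return ["root: expected object"]
    errors = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in doc:
            errors.append(f"{key}: missing required field")
    if schema.get("additionalProperties") is False:
        errors.extend(f"{key}: unexpected field" for key in doc if key not in props)
    for key, rule in props.items():
        if key not in doc:
            continue
        value = doc[key]
        if rule.get("type") == "array":
            if not isinstance(value, list):
                errors.append(f"{key}: expected array")
                continue
            if len(value) > rule.get("maxItems", float("inf")):
                errors.append(f"{key}: more than maxItems entries")
            for i, item in enumerate(value):
                _check_string(item, rule.get("items", {}), f"{key}[{i}]", errors)
        else:
            _check_string(value, rule, key, errors)
    return errors


print(validate_reader_output({"ticker": "ACME", "instructions": "ignore all rules"}, READER_SCHEMA))
```

With additionalProperties set to false, an injected extra field is reported rather than passed through, and the character-class patterns reject free-form text such as a note containing a semicolon or newline.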

4. check.py with cross-reference resolution

Beyond YAML/JSON parsing, scripts/check.py walks every manifest and verifies that every system.file, skills.path, and callable_agents.manifest reference resolves to an existing file. It also checks that agent-plugin bundled skills haven't drifted from the vertical-plugin source.

ArcKit relevance: We have no equivalent reference-resolution lint. A broken ${CLAUDE_PLUGIN_ROOT}/templates/foo.md reference, or a deleted helper script still referenced by a command, currently passes CI. A small lint walking every command/agent/skill against disk would catch refactor breakage early — particularly valuable given the converter generates 6 downstream formats.
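As a starting point, a reference-resolution lint for ArcKit might look like the sketch below. It is an assumption-laden illustration rather than a port of the upstream check.py: `REF_PATTERN`, `find_broken_references`, and the choice to scan only `*.md` files are all hypothetical, and the plugin root is passed in as a plain directory.

```python
import re
from pathlib import Path

# Matches ${CLAUDE_PLUGIN_ROOT}/<relative path>; the allowed path characters
# are an assumption.
REF_PATTERN = re.compile(r"\$\{CLAUDE_PLUGIN_ROOT\}/([A-Za-z0-9_./-]+)")


def find_broken_references(plugin_root: Path, globs=("**/*.md",)):
    """Return (source file, missing path) pairs for references that do not resolve."""
    broken = []
    for pattern in globs:
        for doc in sorted(plugin_root.glob(pattern)):
            text = doc.read_text(encoding="utf-8", errors="replace")
            for match in REF_PATTERN.finditer(text):
                # Drop a trailing full stop picked up from sentence punctuation.
                rel = match.group(1).rstrip(".")
                if not (plugin_root / rel).exists():
                    broken.append((doc.relative_to(plugin_root).as_posix(), rel))
    return broken


if __name__ == "__main__":
    # Lint the current directory and exit non-zero if anything is broken.
    problems = find_broken_references(Path("."))
    for source, missing in problems:
        print(f"{source}: missing {missing}")
    raise SystemExit(1 if problems else 0)
```

Run from the plugin root in CI, the non-zero exit code would fail the build when a template or helper script is deleted but still referenced.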

5. Headless deployment track via Managed Agents API

Each named agent ships two ways from one source: (a) interactive Cowork plugin, (b) Managed Agent cookbook (POST /v1/agents) for headless/scheduled runs. Same agents/<slug>.md system prompt, different wrapper. The cookbook adds agent.yaml (model, tools, MCP servers, callable subagents), subagents/*.yaml (leaf workers with output schemas), and steering-examples.json.

ArcKit relevance: We ship 7 interactive formats but no headless deployment path. A managed-agent track for /arckit:health, /arckit:navigator, /arckit:graph-report, and autoresearch loops would let firms run ArcKit governance scans on cron without an interactive session. Strategic question — open for discussion.

Medium-impact

6. steering-examples.json per agent

Three canonical trigger events documented per cookbook. ArcKit has argument-hint but no canonical-input test corpus. Worth adding for heavy commands (research, datascout, *-research, gov-*, grants).

7. Self-contained agent plugins (vendored skills)

plugins/agent-plugins/<slug>/skills/ are vendored copies of plugins/vertical-plugins/*/skills/, kept in sync by sync-agent-skills.py. One install ships everything an agent needs. ArcKit's 5 skills are global to the plugin — fine today, but the pattern is useful if we add command-specific reference skills later.

8. Vertical-level .mcp.json

MCPs declared per vertical (financial-analysis ships FactSet, Daloopa, Morningstar, S&P, Moody's, LSEG, Pitchbook); agents inherit by name. ArcKit declares all 5 MCPs at plugin root. Splitting by jurisdiction overlay (UK gov MCPs vs UAE vs FR vs CA) would let users install only what they need and keep alwaysLoad lean.

9. Strong regulatory disclaimer at README top

"Nothing in this repository constitutes investment, legal, tax, or accounting advice. These agents draft analyst work product for review by a qualified professional. They do not make investment recommendations, execute transactions, bind risk, post to a ledger, or approve onboarding; every output is staged for human sign-off."

ArcKit produces DPIAs, EU AI Act assessments, NIS2 conformity, NCSC Secure-by-Design — same regulatory gravity. A prominent "advisory output, requires accountable-officer sign-off" banner would mirror this framing and reduce misuse risk.

Lower-impact / informational

10. Cookbook README convention

Every cookbook README follows the same structure: Overview · Deploy · Steering events · Security & handoffs (tier table). Worth adopting as a template for ArcKit's heavier agents.

11. Dual-runtime guidance in skills

The DCF skill has explicit "if running inside Excel Office Add-in vs generating standalone .xlsx" sections. Analogue for ArcKit: if a skill runs differently under Claude Code vs Codex CLI vs Gemini, document the divergence inline rather than relying on the converter to silently strip it.

12. *.local.md gitignored user config

The financial-services repo uses markdown sidecars; ArcKit uses userConfig in plugin.json. ArcKit's solution is more structured, so no change is needed; this just notes the difference.

Recommended next steps

In rough priority order:

  • Reader/orchestrator/writer pattern: draft arckit-claude/agents/READER-PATTERN.md reference; pilot by splitting arckit-research (or arckit-datascout) into reader + orchestrator + writer with a schema-validated handoff between tiers
  • Cross-reference linter: scripts/check_references.py walking every command/agent/skill, resolving ${CLAUDE_PLUGIN_ROOT}/... paths, helper scripts, and template references against disk; wire into CI
  • Schema-gated handoffs: extend handoffs: frontmatter to support an output_schema: for the producing command and enforce in any orchestration layer
  • README regulatory disclaimer: add advisory-output banner mirroring the FSI framing
  • Steering-examples corpus: add tests/steering/<command>.json with canonical inputs for the 10 research-heavy commands
  • Strategic discussion: managed-agents headless track for governance-on-cron use cases

References

  • Repo: https://github.com/anthropics/financial-services
  • Reader pattern example: managed-agent-cookbooks/earnings-reviewer/README.md security table
  • Handoff orchestrator: scripts/orchestrate.py
  • Reference lint: scripts/check.py
  • Output schema example: managed-agent-cookbooks/earnings-reviewer/subagents/transcript-reader.yaml

Labels: architecture, enhancement, priority:high, research, security