Summary
Reviewed anthropics/financial-services — Anthropic's reference plugins + Managed Agent cookbooks for FSI workflows (equity research, IB, fund admin, KYC). Several patterns are directly relevant to ArcKit, especially around prompt-injection hardening for our research-heavy commands.
This issue tracks the patterns worth adopting, ranked by impact.
High-impact
1. Three-tier reader / orchestrator / writer isolation for untrusted documents
Every cookbook (e.g. earnings-reviewer, kyc-screener) splits each agent into three tiers documented in the README as a security table:
| Tier | Touches untrusted docs? | Tools | Connectors |
| --- | --- | --- | --- |
| Reader subagent | Yes | Read, Grep only | None |
| Orchestrator | No | Read, Grep, Glob, Agent | Read-only MCPs |
| Writer (sole Write-holder) | No | Read, Write, Edit | None |
The reader returns schema-validated, length-capped JSON; the orchestrator and the writer never see the raw untrusted text. Instructions injected into a document the reader ingests can therefore reach later tiers only as constrained field values, which sharply limits prompt injection.
ArcKit relevance: /arckit:research, /arckit:datascout, /arckit:gov-reuse, /arckit:gov-code-search, /arckit:gov-landscape, /arckit:grants, and the *-research MCP agents all currently read external input (vendor packs, policy docs, web fetches, MCP results) in the same context that writes artefacts. A reader → orchestrator → writer split closes that surface.
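The tier table can be expressed as data and checked mechanically. The following is a minimal Python sketch, not code from the reviewed repo: `Tier`, `TIERS`, `FORBIDDEN_FOR_READERS` and `check_isolation` are hypothetical names, the connector label `read-only-mcp` is a placeholder, and treating `Agent` as forbidden for readers is an assumption. Only the tool names come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    reads_untrusted: bool
    tools: frozenset
    connectors: frozenset


# Hypothetical encoding of the tier table; tool names are taken from the table.
TIERS = (
    Tier("reader", True, frozenset({"Read", "Grep"}), frozenset()),
    Tier("orchestrator", False, frozenset({"Read", "Grep", "Glob", "Agent"}),
         frozenset({"read-only-mcp"})),
    Tier("writer", False, frozenset({"Read", "Write", "Edit"}), frozenset()),
)

# Assumption: a tier that reads untrusted input must not write or spawn agents.
FORBIDDEN_FOR_READERS = frozenset({"Write", "Edit", "Agent"})


def check_isolation(tiers):
    """Return a list of human-readable violations; an empty list means OK."""
    problems = []
    for tier in tiers:
        if tier.reads_untrusted:
            held = sorted(tier.tools & FORBIDDEN_FOR_READERS)
            if held:
                problems.append(f"{tier.name}: untrusted-input tier holds {held}")
            if tier.connectors:
                problems.append(f"{tier.name}: untrusted-input tier has connectors")
    # Exactly one tier may hold Write (the "sole Write-holder" rule).
    write_holders = [t.name for t in tiers if "Write" in t.tools]
    if len(write_holders) != 1:
        problems.append(f"expected exactly one Write-holder, found {write_holders}")
    return problems
```

A check like this could run in CI against each command's declared tools, so that a later edit cannot quietly hand a reader the Write tool.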
2. Schema-gated handoff protocol with allowlist
scripts/orchestrate.py defines ALLOWED_TARGETS (the deployed agent slugs) and a HANDOFF_PAYLOAD_SCHEMA. Agents emit {"type":"handoff_request","target_agent":"...","payload":{...}}; the orchestrator validates the target against the allowlist and the payload against the schema before routing it as the next steering event. The script explicitly warns that an attacker-controlled doc could quote a forged handoff blob, and it documents the mitigations.
ArcKit relevance: Our handoffs: frontmatter is currently a passive "Suggested Next Steps" hint rendered by the converter. Upgrading to active routing with allowlisting would close the same injection vector for chained workflows (e.g. requirements → adr → hld-review).
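To make the allowlist-plus-schema check concrete, here is a minimal Python sketch of what active routing validation might look like in ArcKit. It is not the code in scripts/orchestrate.py: the target slugs, `PAYLOAD_KEYS`, `MAX_VALUE_LENGTH` and `validate_handoff` are all hypothetical, and the payload rules are a simplified stand-in for HANDOFF_PAYLOAD_SCHEMA.

```python
import json

# Hypothetical values: ArcKit would list its own command slugs here.
ALLOWED_TARGETS = frozenset({"requirements", "adr", "hld-review"})
# Hypothetical stand-in for a payload schema: fixed keys, short string values.
PAYLOAD_KEYS = frozenset({"artefact_id", "summary"})
MAX_VALUE_LENGTH = 256


def validate_handoff(raw):
    """Parse a handoff request and return it, or raise ValueError."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"handoff is not valid JSON: {exc}") from exc
    if not isinstance(message, dict) or set(message) != {"type", "target_agent", "payload"}:
        raise ValueError("handoff must have exactly type, target_agent and payload")
    if message["type"] != "handoff_request":
        raise ValueError("unexpected message type")
    target = message["target_agent"]
    # Allowlist check: only known agent slugs can ever be routed to.
    if not isinstance(target, str) or target not in ALLOWED_TARGETS:
        raise ValueError(f"target not in allowlist: {target!r}")
    payload = message["payload"]
    if not isinstance(payload, dict) or not set(payload) <= PAYLOAD_KEYS:
        raise ValueError("payload has unexpected fields")
    for key, value in payload.items():
        if not isinstance(value, str) or len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"payload field {key!r} must be a short string")
    return message
```

The sketch covers only the two checks described above. It does not address a forged handoff blob quoted inside an untrusted document, which the source script treats separately.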
3. Output schemas on every reader subagent
Reader subagents declare output_schema: in their YAML — JSON Schema with additionalProperties: false, maxItems, maxLength, regex pattern constraints on every field. A validate.py (jsonschema) runs between reader and orchestrator.
Example from transcript-reader.yaml:
    output_schema:
      required: [ticker, period, actuals]
      additionalProperties: false
      properties:
        ticker: { type: string, maxLength: 12, pattern: "^[A-Z.]+$" }
        guidance_notes:
          type: array
          maxItems: 50
          items: { maxLength: 256, pattern: "^[A-Za-z0-9 .,%$()_/:-]+$" }
ArcKit relevance: Our requirement IDs (BR-001, FR-xxx, ECX-NN) and Document Control headers are schema-shaped but not enforced. A schema-gated handoff between /arckit:research (reader of vendor packs) and /arckit:evaluate / /arckit:score (writers) would catch silently-malformed structured data and prevent injected fields.
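As an illustration of what such a gate enforces, here is a dependency-free Python sketch of the same constraint types (no additional properties, maxLength, maxItems, pattern). The source's validate.py uses jsonschema; this hand-rolled version is a substitute so the example runs with the standard library only. The `SCHEMA` fields, the ID pattern and the function names are hypothetical, not ArcKit's actual data model.

```python
import re

# Hypothetical reader-output schema; field names and limits are illustrative.
SCHEMA = {
    "vendor": {"type": str, "maxLength": 64, "pattern": r"[A-Za-z0-9 .&-]+"},
    "requirement_ids": {
        "type": list,
        "maxItems": 50,
        "items": {"maxLength": 8, "pattern": r"(BR|FR)-\d{3}"},
    },
}


def _check_string(label, value, rule):
    """Apply type, maxLength and pattern checks to one string value."""
    if not isinstance(value, str):
        return [f"{label}: expected a string"]
    if len(value) > rule["maxLength"]:
        return [f"{label}: longer than {rule['maxLength']} characters"]
    # fullmatch anchors both ends, so trailing text cannot slip through.
    if re.fullmatch(rule["pattern"], value) is None:
        return [f"{label}: does not match {rule['pattern']}"]
    return []


def validate_reader_output(doc):
    """Return a list of violations; an empty list means the output is accepted."""
    if not isinstance(doc, dict):
        return ["output must be a JSON object"]
    # Equivalent of additionalProperties: false.
    errors = [f"unexpected field: {key}" for key in sorted(doc.keys() - SCHEMA.keys())]
    for key, rule in SCHEMA.items():
        if key not in doc:
            errors.append(f"missing field: {key}")
        elif rule["type"] is str:
            errors += _check_string(key, doc[key], rule)
        elif not isinstance(doc[key], list):
            errors.append(f"{key}: expected an array")
        elif len(doc[key]) > rule["maxItems"]:
            errors.append(f"{key}: more than {rule['maxItems']} items")
        else:
            for index, item in enumerate(doc[key]):
                errors += _check_string(f"{key}[{index}]", item, rule["items"])
    return errors
```

With these rules, an injected field such as "instructions" is rejected outright, and free text appended to a requirement ID fails the length and pattern checks.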
4. check.py with cross-reference resolution
Beyond YAML/JSON parsing, scripts/check.py walks every manifest and verifies that every system.file, skills.path, and callable_agents.manifest reference resolves to an existing file. It also checks that the skills bundled in agent plugins haven't drifted from the vertical-plugin source.
ArcKit relevance: We have no equivalent reference-resolution lint. A broken ${CLAUDE_PLUGIN_ROOT}/templates/foo.md reference, or a deleted helper script still referenced by a command, currently passes CI. A small lint walking every command/agent/skill against disk would catch refactor breakage early — particularly valuable given the converter generates 6 downstream formats.
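A reference-resolution lint for ArcKit could be quite small. The Python sketch below is one possible shape, under stated assumptions: the directory layout (commands/, agents/, skills/ holding .md files), the set of characters allowed in a path, and the names `REF_PATTERN` and `find_broken_references` are all hypothetical. It only resolves ${CLAUDE_PLUGIN_ROOT}/ references, not helper scripts referenced in other ways.

```python
import re
from pathlib import Path

# Assumption: referenced paths use only these characters.
REF_PATTERN = re.compile(r"\$\{CLAUDE_PLUGIN_ROOT\}/([A-Za-z0-9_./-]+)")

# Assumption: these globs cover every command, agent and skill file.
DEFAULT_GLOBS = ("commands/**/*.md", "agents/**/*.md", "skills/**/*.md")


def find_broken_references(plugin_root, globs=DEFAULT_GLOBS):
    """Return (file, missing_path) pairs for references that do not resolve."""
    plugin_root = Path(plugin_root)
    broken = []
    for pattern in globs:
        for doc in sorted(plugin_root.glob(pattern)):
            text = doc.read_text(encoding="utf-8")
            for match in REF_PATTERN.finditer(text):
                # Drop a sentence-final full stop captured by the pattern.
                relative = match.group(1).rstrip(".")
                if not (plugin_root / relative).exists():
                    broken.append((doc.relative_to(plugin_root).as_posix(), relative))
    return broken
```

In CI, a non-empty result would fail the build and print each pair, which points directly at the file that needs fixing after a rename or deletion.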
5. Headless deployment track via Managed Agents API
Each named agent ships two ways from one source: (a) interactive Cowork plugin, (b) Managed Agent cookbook (POST /v1/agents) for headless/scheduled runs. Same agents/<slug>.md system prompt, different wrapper. The cookbook adds agent.yaml (model, tools, MCP servers, callable subagents), subagents/*.yaml (leaf workers with output schemas), and steering-examples.json.
ArcKit relevance: We ship 7 interactive formats but no headless deployment path. A managed-agent track for /arckit:health, /arckit:navigator, /arckit:graph-report, and autoresearch loops would let firms run ArcKit governance scans on cron without an interactive session. Strategic question — open for discussion.
Medium-impact
6. steering-examples.json per agent
Three canonical trigger events are documented per cookbook. ArcKit has argument-hint but no canonical-input test corpus. Worth adding for heavy commands (research, datascout, *-research, gov-*, grants).
7. Self-contained agent plugins (vendored skills)
plugins/agent-plugins/<slug>/skills/ are vendored copies of plugins/vertical-plugins/*/skills/, kept in sync by sync-agent-skills.py. One install ships everything an agent needs. ArcKit's 5 skills are global to the plugin — fine today, but the pattern is useful if we add command-specific reference skills later.
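If ArcKit ever vendors skills this way, the copies need a drift check. How sync-agent-skills.py and check.py implement this is not covered here; the Python sketch below is a hypothetical stand-in that compares two directory trees by SHA-256 digest. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def _digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def skill_drift(source_dir, vendored_dir):
    """Return sorted relative paths that are missing, extra, or changed."""
    source = _digests(source_dir)
    vendored = _digests(vendored_dir)
    return sorted(
        path
        for path in source.keys() | vendored.keys()
        if source.get(path) != vendored.get(path)
    )
```

An empty list means the vendored copy matches its source byte for byte; anything else names the files to re-sync.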
8. Vertical-level .mcp.json
MCPs declared per vertical (financial-analysis ships FactSet, Daloopa, Morningstar, S&P, Moody's, LSEG, Pitchbook); agents inherit by name. ArcKit declares all 5 MCPs at plugin root. Splitting by jurisdiction overlay (UK gov MCPs vs UAE vs FR vs CA) would let users install only what they need and keep alwaysLoad lean.
9. Strong regulatory disclaimer at README top
> Nothing in this repository constitutes investment, legal, tax, or accounting advice. These agents draft analyst work product for review by a qualified professional. They do not make investment recommendations, execute transactions, bind risk, post to a ledger, or approve onboarding; every output is staged for human sign-off.
ArcKit produces DPIAs, EU AI Act assessments, NIS2 conformity, NCSC Secure-by-Design — same regulatory gravity. A prominent "advisory output, requires accountable-officer sign-off" banner would mirror this framing and reduce misuse risk.
Lower-impact / informational
10. Cookbook README convention
Every cookbook README follows the same structure: Overview · Deploy · Steering events · Security & handoffs (tier table). Worth adopting as a template for ArcKit's heavier agents.
11. Dual-runtime guidance in skills
The DCF skill has explicit "if running inside Excel Office Add-in vs generating standalone .xlsx" sections. Analogue for ArcKit: if a skill runs differently under Claude Code vs Codex CLI vs Gemini, document the divergence inline rather than relying on the converter to silently strip.
12. *.local.md gitignored user config
They use markdown sidecars; ArcKit uses userConfig in plugin.json. ArcKit's solution is more structured — no change needed, just noting the difference.
Recommended next steps
In rough priority order:
- arckit-claude/agents/READER-PATTERN.md reference; pilot by splitting arckit-research (or arckit-datascout) into reader + orchestrator + writer with a schema-validated handoff between tiers
- scripts/check_references.py walking every command/agent/skill, resolving ${CLAUDE_PLUGIN_ROOT}/... paths, helper scripts, and template references against disk; wire into CI
- handoffs: frontmatter to support an output_schema: for the producing command and enforce in any orchestration layer
- tests/steering/<command>.json with canonical inputs for the 10 research-heavy commands
References
- Repo: https://github.com/anthropics/financial-services
- Reader pattern example: managed-agent-cookbooks/earnings-reviewer/README.md security table
- Handoff orchestrator: scripts/orchestrate.py
- Reference lint: scripts/check.py
- Output schema example: managed-agent-cookbooks/earnings-reviewer/subagents/transcript-reader.yaml