A meta-agent that generates, tests, and iterates on AI agent harnesses from plain-English descriptions.
Describe what you need. AutoAgent builds the agent, tests it against your sample inputs, diagnoses failures, and improves the harness automatically. No Docker. No benchmarks. No research background required.
AutoAgent pioneered the concept of meta-agents that improve their own harnesses. But it requires Docker, Harbor benchmarks, and a deep understanding of harness engineering concepts like hill-climbing and overfitting prevention.
Accessible AutoAgent brings the same self-improving agent concept to everyone:
| AutoAgent | Accessible AutoAgent | |
|---|---|---|
| Setup | Docker + Harbor + benchmark infra | pip install + API key |
| Input | program.md directives (researcher-oriented) |
Plain English descriptions |
| Templates | None (start from scratch) | 3 pre-built templates |
| Reports | Raw TSV + trace logs | Human-readable markdown reports |
| Audience | AI researchers, benchmark practitioners | Product engineers, technical PMs, developers |
| Cost tracking | None | Per-iteration token usage and cost |
| LLM support | OpenAI-focused | Provider-agnostic (Claude, OpenAI) |
# Install
pip install accessible-autoagent
# Configure
export AUTOAGENT_PROVIDER=anthropic
export ANTHROPIC_API_KEY=your-key
# Generate an agent
autoagent generate "A research agent that summarizes topics with cited sources"
# Iterate on it
autoagent iterate -s scaffold.json -i sample_inputs/ -n 3graph LR
A[Describe Task] --> B[Generator]
B --> C[Agent Scaffold]
C --> D[Runner]
D --> E[Execution Traces]
E --> F[Analyzer]
F --> G{Converged?}
G -->|No| H[Refine Scaffold]
H --> D
G -->|Yes| I[Final Agent + Reports]
- Generate: Describe your task in plain English. The generator creates a complete agent scaffold (system prompt, tools, orchestration logic).
- Run: Execute the scaffold against your sample inputs. Every step is captured in detailed traces.
- Analyze: The analyzer reads traces, identifies failure patterns, and proposes specific improvements.
- Iterate: Changes are applied and the cycle repeats until the target pass rate is reached.
Each iteration produces a human-readable markdown report showing exactly what changed and why.
Start from a template instead of from scratch:
| Template | What it does |
|---|---|
research_agent |
Web research with source evaluation and structured summaries |
code_reviewer |
Code review with bug detection, security analysis, and style checking |
data_analyst |
Data analysis with statistics, correlations, and anomaly detection |
autoagent templates # List all templates
autoagent from-template research_agent # Load oneimport asyncio
from autoagent.core.iterate import IterationLoop
from autoagent.providers.anthropic import AnthropicProvider
async def main():
provider = AnthropicProvider()
loop = IterationLoop(provider)
result = await loop.run(
task_description="An agent that reviews code for security vulnerabilities",
sample_inputs=[
"Check this Python file for SQL injection",
"Review this JS code for XSS risks",
"Audit this Go service for auth bypass",
],
max_iterations=3,
)
print(f"Pass rate: {result.final_pass_rate:.0%}")
print(f"Tokens used: {result.total_tokens:,}")
asyncio.run(main())Install the plugin/ directory as a Claude Code plugin to use AutoAgent through natural language:
- "Generate an agent that summarizes research papers"
- "Iterate on the scaffold with these sample inputs"
- "Show me the iteration reports"
- "Load the data analyst template"
src/autoagent/
core/
generator.py # Creates scaffolds from task descriptions
runner.py # Executes scaffolds, captures traces
analyzer.py # Diagnoses failures, proposes improvements
iterate.py # Orchestrates the generate-run-analyze loop
providers/
base.py # LLMProvider protocol
anthropic.py # Claude adapter
openai.py # OpenAI adapter
templates/ # Pre-built agent templates
output/
report.py # Markdown report generation
trace.py # Structured execution traces
See docs/architecture.md for the full system design.
Contributions welcome. See the docs for architecture details.
# Setup
git clone https://github.com/shaan-ad/accessible-autoagent.git
cd accessible-autoagent
pip install -e ".[dev]"
# Test
pytest -v
# Lint
ruff check src/ tests/MIT