Accessible AutoAgent

A meta-agent that generates, tests, and iterates on AI agent harnesses from plain-English descriptions.

Describe what you need. AutoAgent builds the agent, tests it against your sample inputs, diagnoses failures, and improves the harness automatically. No Docker. No benchmarks. No research background required.

Why This Exists

AutoAgent pioneered the concept of meta-agents that improve their own harnesses. But it requires Docker, Harbor benchmarks, and a deep understanding of harness engineering concepts like hill-climbing and overfitting prevention.

Accessible AutoAgent brings the same self-improving agent concept to everyone:

	AutoAgent	Accessible AutoAgent
Setup	Docker + Harbor + benchmark infra	`pip install` + API key
Input	`program.md` directives (researcher-oriented)	Plain English descriptions
Templates	None (start from scratch)	3 pre-built templates
Reports	Raw TSV + trace logs	Human-readable markdown reports
Audience	AI researchers, benchmark practitioners	Product engineers, technical PMs, developers
Cost tracking	None	Per-iteration token usage and cost
LLM support	OpenAI-focused	Provider-agnostic (Claude, OpenAI)

Quick Start

# Install
pip install accessible-autoagent

# Configure
export AUTOAGENT_PROVIDER=anthropic
export ANTHROPIC_API_KEY=your-key

# Generate an agent
autoagent generate "A research agent that summarizes topics with cited sources"

# Iterate on it
autoagent iterate -s scaffold.json -i sample_inputs/ -n 3

How It Works

graph LR
    A[Describe Task] --> B[Generator]
    B --> C[Agent Scaffold]
    C --> D[Runner]
    D --> E[Execution Traces]
    E --> F[Analyzer]
    F --> G{Converged?}
    G -->|No| H[Refine Scaffold]
    H --> D
    G -->|Yes| I[Final Agent + Reports]

Generate: Describe your task in plain English. The generator creates a complete agent scaffold (system prompt, tools, orchestration logic).
Run: Execute the scaffold against your sample inputs. Every step is captured in detailed traces.
Analyze: The analyzer reads traces, identifies failure patterns, and proposes specific improvements.
Iterate: Changes are applied and the cycle repeats until the target pass rate is reached.

Each iteration produces a human-readable markdown report showing exactly what changed and why.

Pre-built Templates

Start from a template instead of from scratch:

Template	What it does
`research_agent`	Web research with source evaluation and structured summaries
`code_reviewer`	Code review with bug detection, security analysis, and style checking
`data_analyst`	Data analysis with statistics, correlations, and anomaly detection

autoagent templates              # List all templates
autoagent from-template research_agent   # Load one

Python API

import asyncio
from autoagent.core.iterate import IterationLoop
from autoagent.providers.anthropic import AnthropicProvider

async def main():
    provider = AnthropicProvider()
    loop = IterationLoop(provider)

    result = await loop.run(
        task_description="An agent that reviews code for security vulnerabilities",
        sample_inputs=[
            "Check this Python file for SQL injection",
            "Review this JS code for XSS risks",
            "Audit this Go service for auth bypass",
        ],
        max_iterations=3,
    )

    print(f"Pass rate: {result.final_pass_rate:.0%}")
    print(f"Tokens used: {result.total_tokens:,}")

asyncio.run(main())

Claude Code Plugin

Install the plugin/ directory as a Claude Code plugin to use AutoAgent through natural language:

"Generate an agent that summarizes research papers"
"Iterate on the scaffold with these sample inputs"
"Show me the iteration reports"
"Load the data analyst template"

Architecture

src/autoagent/
  core/
    generator.py    # Creates scaffolds from task descriptions
    runner.py       # Executes scaffolds, captures traces
    analyzer.py     # Diagnoses failures, proposes improvements
    iterate.py      # Orchestrates the generate-run-analyze loop
  providers/
    base.py         # LLMProvider protocol
    anthropic.py    # Claude adapter
    openai.py       # OpenAI adapter
  templates/        # Pre-built agent templates
  output/
    report.py       # Markdown report generation
    trace.py        # Structured execution traces

See docs/architecture.md for the full system design.

Contributing

Contributions welcome. See the docs for architecture details.

# Setup
git clone https://github.com/shaan-ad/accessible-autoagent.git
cd accessible-autoagent
pip install -e ".[dev]"

# Test
pytest -v

# Lint
ruff check src/ tests/

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.github/workflows		.github/workflows
docs		docs
plugin		plugin
src/autoagent		src/autoagent
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Accessible AutoAgent

Why This Exists

Quick Start

How It Works

Pre-built Templates

Python API

Claude Code Plugin

Architecture

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Accessible AutoAgent

Why This Exists

Quick Start

How It Works

Pre-built Templates

Python API

Claude Code Plugin

Architecture

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages