Skip to content

shaan-ad/accessible-autoagent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Accessible AutoAgent

MIT License Python 3.11+ CI

A meta-agent that generates, tests, and iterates on AI agent harnesses from plain-English descriptions.

Describe what you need. AutoAgent builds the agent, tests it against your sample inputs, diagnoses failures, and improves the harness automatically. No Docker. No benchmarks. No research background required.

Why This Exists

AutoAgent pioneered the concept of meta-agents that improve their own harnesses. But it requires Docker, Harbor benchmarks, and a deep understanding of harness engineering concepts like hill-climbing and overfitting prevention.

Accessible AutoAgent brings the same self-improving agent concept to everyone:

AutoAgent Accessible AutoAgent
Setup Docker + Harbor + benchmark infra pip install + API key
Input program.md directives (researcher-oriented) Plain English descriptions
Templates None (start from scratch) 3 pre-built templates
Reports Raw TSV + trace logs Human-readable markdown reports
Audience AI researchers, benchmark practitioners Product engineers, technical PMs, developers
Cost tracking None Per-iteration token usage and cost
LLM support OpenAI-focused Provider-agnostic (Claude, OpenAI)

Quick Start

# Install
pip install accessible-autoagent

# Configure
export AUTOAGENT_PROVIDER=anthropic
export ANTHROPIC_API_KEY=your-key

# Generate an agent
autoagent generate "A research agent that summarizes topics with cited sources"

# Iterate on it
autoagent iterate -s scaffold.json -i sample_inputs/ -n 3

How It Works

graph LR
    A[Describe Task] --> B[Generator]
    B --> C[Agent Scaffold]
    C --> D[Runner]
    D --> E[Execution Traces]
    E --> F[Analyzer]
    F --> G{Converged?}
    G -->|No| H[Refine Scaffold]
    H --> D
    G -->|Yes| I[Final Agent + Reports]
Loading
  1. Generate: Describe your task in plain English. The generator creates a complete agent scaffold (system prompt, tools, orchestration logic).
  2. Run: Execute the scaffold against your sample inputs. Every step is captured in detailed traces.
  3. Analyze: The analyzer reads traces, identifies failure patterns, and proposes specific improvements.
  4. Iterate: Changes are applied and the cycle repeats until the target pass rate is reached.

Each iteration produces a human-readable markdown report showing exactly what changed and why.

Pre-built Templates

Start from a template instead of from scratch:

Template What it does
research_agent Web research with source evaluation and structured summaries
code_reviewer Code review with bug detection, security analysis, and style checking
data_analyst Data analysis with statistics, correlations, and anomaly detection
autoagent templates              # List all templates
autoagent from-template research_agent   # Load one

Python API

import asyncio
from autoagent.core.iterate import IterationLoop
from autoagent.providers.anthropic import AnthropicProvider

async def main():
    provider = AnthropicProvider()
    loop = IterationLoop(provider)

    result = await loop.run(
        task_description="An agent that reviews code for security vulnerabilities",
        sample_inputs=[
            "Check this Python file for SQL injection",
            "Review this JS code for XSS risks",
            "Audit this Go service for auth bypass",
        ],
        max_iterations=3,
    )

    print(f"Pass rate: {result.final_pass_rate:.0%}")
    print(f"Tokens used: {result.total_tokens:,}")

asyncio.run(main())

Claude Code Plugin

Install the plugin/ directory as a Claude Code plugin to use AutoAgent through natural language:

  • "Generate an agent that summarizes research papers"
  • "Iterate on the scaffold with these sample inputs"
  • "Show me the iteration reports"
  • "Load the data analyst template"

Architecture

src/autoagent/
  core/
    generator.py    # Creates scaffolds from task descriptions
    runner.py       # Executes scaffolds, captures traces
    analyzer.py     # Diagnoses failures, proposes improvements
    iterate.py      # Orchestrates the generate-run-analyze loop
  providers/
    base.py         # LLMProvider protocol
    anthropic.py    # Claude adapter
    openai.py       # OpenAI adapter
  templates/        # Pre-built agent templates
  output/
    report.py       # Markdown report generation
    trace.py        # Structured execution traces

See docs/architecture.md for the full system design.

Contributing

Contributions welcome. See the docs for architecture details.

# Setup
git clone https://github.com/shaan-ad/accessible-autoagent.git
cd accessible-autoagent
pip install -e ".[dev]"

# Test
pytest -v

# Lint
ruff check src/ tests/

License

MIT

About

A meta-agent that generates, tests, and iterates on AI agent harnesses from plain-English descriptions. No Docker, no benchmarks required.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages