From a goal to a task DAG, automatically.
TypeScript-native multi-agent orchestration. Three runtime dependencies.
9 native LLM adapters · MCP · token budgets · retries · context compaction · live tracing.
English · 中文
open-multi-agent is a multi-agent orchestration framework for TypeScript backends. Give it a goal; a coordinator agent decomposes it into a task DAG, parallelizes independents, and synthesizes the result. Three runtime dependencies; it drops into any Node.js backend.
Your engineers describe the goal, not the graph.
A typical run, streamed live through onProgress:
agent_start coordinator
task_start design-api
task_complete design-api
task_start implement-handlers
task_start scaffold-tests // independent tasks run in parallel
task_complete scaffold-tests
task_complete implement-handlers
task_start review-code // unblocked after implementation
task_complete review-code
agent_complete coordinator // synthesizes final result
Success: true
Tokens: 12847 output tokens
| Capability | What you get |
|---|---|
| Goal-driven coordinator | One runTeam(team, goal) call. The coordinator decomposes the goal into a task DAG, parallelizes independents, and synthesizes the result. |
| Mix providers in one team | 9 native: Anthropic, OpenAI, Azure, Gemini, Grok, DeepSeek, MiniMax, Qiniu, Copilot. Ollama / vLLM / LM Studio / OpenRouter / Groq via OpenAI-compatible. (full list) |
| Tools + MCP | 6 built-in (bash, file_*, grep, glob), opt-in delegate_to_agent, custom tools via defineTool() + Zod, any MCP server via connectMCPTools(). |
| Streaming + structured output | Token-by-token streaming on every adapter; Zod-validated final answer with auto-retry on parse failure. (structured-output) |
| Observability | onProgress events, onTrace spans, post-run HTML dashboard rendering the executed task DAG. (trace-observability) |
| Pluggable shared memory | Default in-process KV; swap in Redis / Postgres / your own backend by implementing MemoryStore. |
Production controls (context strategies, task retry with backoff, loop detection, tool output truncation/compression) are covered in the Production Checklist.
Requires Node.js >= 18.
Clone, install, run.
git clone https://github.com/JackChen-me/open-multi-agent && cd open-multi-agent
npm install
export ANTHROPIC_API_KEY=sk-...
npx tsx examples/basics/team-collaboration.ts

Three agents (architect, developer, reviewer) collaborate on a REST API in /tmp/express-api/. You watch the coordinator decompose the goal and run independent tasks in parallel as the progress events stream in.
Local models via Ollama need no API key, see providers/ollama. For other providers (OPENAI_API_KEY, GEMINI_API_KEY, etc.), check Supported Providers.
npm install @jackchen_me/open-multi-agent

Three agents, one goal. The framework handles the rest:
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
import type { AgentConfig } from '@jackchen_me/open-multi-agent'
const architect: AgentConfig = {
name: 'architect',
model: 'claude-sonnet-4-6',
systemPrompt: 'You design clean API contracts and file structures.',
tools: ['file_write'],
}
const developer: AgentConfig = {
name: 'developer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You implement what the architect specifies. Write clean, runnable TypeScript.',
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You review code for correctness, security, and clarity.',
tools: ['file_read', 'grep'],
}
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
onProgress: (event) => console.log(event.type, event.task ?? event.agent ?? ''),
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
})
// Describe a goal. The framework breaks it into tasks and orchestrates execution
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')
console.log(`Success: ${result.success}`)
console.log(`Tokens: ${result.totalTokenUsage.output_tokens} output tokens`)

| Mode | Method | When to use | Example |
|---|---|---|---|
| Single agent | `runAgent()` | One agent, one prompt. Simplest entry point | basics/single-agent |
| Auto-orchestrated team | `runTeam()` | Give a goal, framework plans and executes | basics/team-collaboration |
| Explicit pipeline | `runTasks()` | You define the task graph and assignments | basics/task-pipeline |
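When the task graph is known up front, `runTasks()` skips the coordinator. A minimal sketch, assuming a task shape with `id`, `description`, `agent`, and `dependsOn` fields; those field names are assumptions, not the confirmed API, so check basics/task-pipeline for the real shape.

```ts
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'

const orchestrator = new OpenMultiAgent({ defaultModel: 'claude-sonnet-4-6' })
const team = orchestrator.createTeam('pipeline', {
  name: 'pipeline',
  agents: [architect, developer], // AgentConfig objects as defined in the quick start above
})

// Hypothetical task shape: id / description / agent / dependsOn are assumptions.
const result = await orchestrator.runTasks(team, [
  { id: 'design', description: 'Design the API contract', agent: 'architect' },
  { id: 'implement', description: 'Implement the handlers', agent: 'developer', dependsOn: ['design'] },
])
```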
For MapReduce-style fan-out without task dependencies, use AgentPool.runParallel() directly. See patterns/fan-out-aggregate.
For shell and CI, the package exposes a JSON-first binary. See docs/cli.md for oma run, oma task, oma provider, exit codes, and file formats.
examples/ is organized by category: basics, cookbook, patterns, providers, integrations, and production. See examples/README.md for the full index.
Real-world workflows (cookbook/)
End-to-end scenarios you can run today. Each one is a complete, opinionated workflow.
- contract-review-dag: four-task DAG for contract review with parallel branches and step-level retry on failure.
- meeting-summarizer: three specialised agents fan out on a transcript; an aggregator merges them into one Markdown report with action items and sentiment.
- competitive-monitoring: three parallel source agents extract claims from feeds; an aggregator cross-checks them and flags contradictions.
- translation-backtranslation: translate EN to target with one provider, back-translate with another, flag semantic drift.
- basics/team-collaboration: `runTeam()` coordinator pattern.
- patterns/structured-output: any agent returns Zod-validated JSON.
- patterns/agent-handoff: synchronous sub-agent delegation via `delegate_to_agent`.
- integrations/trace-observability: `onTrace` spans for LLM calls, tools, and tasks.
- integrations/mcp-github: expose an MCP server's tools to an agent via `connectMCPTools()`.
- integrations/with-vercel-ai-sdk: Next.js app combining OMA `runTeam()` with AI SDK `useChat` streaming.
- Provider examples: three-agent teams under `examples/providers/`, including hosted providers, OpenAI-compatible endpoints, and local models.
Run any script with npx tsx examples/<path>.ts.
A quick router. Mechanism breakdown follows.
| If you need | Pick |
|---|---|
| Fixed production topology with mature checkpointing | LangGraph JS |
| Explicit Supervisor + hand-wired workflows | Mastra |
| Python stack with mature multi-agent ecosystem | CrewAI |
| Single-agent LLM call layer for 60+ providers | Vercel AI SDK |
| TypeScript, goal to result with auto task decomposition | open-multi-agent |
vs. LangGraph JS. LangGraph compiles a declarative graph (nodes, edges, conditional routing) into an invokable. open-multi-agent runs a Coordinator that decomposes the goal into a task DAG at runtime, then auto-parallelizes independents. Same end (orchestrated execution), opposite directions: LangGraph is graph-first, OMA is goal-first.
vs. Mastra. Both are TypeScript-native. Mastra's Supervisor pattern requires you to wire agents and workflows by hand; OMA's Coordinator does the wiring at runtime from the goal string. If the workflow is known up front, Mastra's explicitness pays off. If you'd rather not enumerate every step, OMA's runTeam(team, goal) is one call.
vs. CrewAI. CrewAI is the mature multi-agent option in Python. OMA targets TypeScript backends with three runtime dependencies and direct Node.js embedding. Roughly comparable orchestration surface; the choice is language stack.
vs. Vercel AI SDK. AI SDK is the LLM call layer (unified client for 60+ providers, streaming, tool calls, structured outputs). It does not orchestrate multi-agent teams. The two compose: AI SDK for single-agent work, OMA when you need a team.
open-multi-agent launched 2026-04-01 under MIT. Public users and integrations to date:
- temodar-agent (~60 stars). WordPress security analysis platform by Ali Sünbül. Uses our built-in tools (`bash`, `file_*`, `grep`) directly in its Docker runtime. Confirmed production use.
- Cybersecurity SOC (home lab). A private setup running Qwen 2.5 + DeepSeek Coder entirely offline via Ollama, building an autonomous SOC pipeline on Wazuh + Proxmox. Early user, not yet public.
Using open-multi-agent in production or a side project? Open a discussion and we will list it here.
- Engram — "Git for AI memory." Syncs knowledge across agents instantly and flags conflicts. (repo)
- @agentsonar/oma — Sidecar detecting cross-run delegation cycles, repetition, and rate bursts.
Built an integration? Open a discussion to get listed.
For products and platforms with a deep open-multi-agent integration. See the Featured partner program for terms and how to apply.
┌─────────────────────────────────────────────────────────────────┐
│ OpenMultiAgent (Orchestrator) │
│ │
│ createTeam() runTeam() runTasks() runAgent() getStatus() │
└──────────────────────┬──────────────────────────────────────────┘
│
┌──────────▼──────────┐
│ Team │
│ - AgentConfig[] │
│ - MessageBus │
│ - TaskQueue │
│ - SharedMemory │
└──────────┬──────────┘
│
┌─────────────┴─────────────┐
│ │
┌────────▼──────────┐ ┌───────────▼───────────┐
│ AgentPool │ │ TaskQueue │
│ - Semaphore │ │ - dependency graph │
│ - runParallel() │ │ - auto unblock │
└────────┬──────────┘ │ - cascade failure │
│ └───────────────────────┘
┌────────▼──────────┐
│ Agent │
│ - run() │ ┌────────────────────────┐
│ - prompt() │───►│ LLMAdapter │
│ - stream() │ │ - AnthropicAdapter │
└────────┬──────────┘ │ - OpenAIAdapter │
│ │ - AzureOpenAIAdapter │
│ │ - CopilotAdapter │
│ │ - GeminiAdapter │
│ │ - GrokAdapter │
│ │ - MiniMaxAdapter │
│ │ - DeepSeekAdapter │
│ │ - QiniuAdapter │
│ └────────────────────────┘
┌────────▼──────────┐
│ AgentRunner │ ┌──────────────────────┐
│ - conversation │───►│ ToolRegistry │
│ loop │ │ - defineTool() │
│ - tool dispatch │ │ - 6 built-in tools │
└───────────────────┘ │ + delegate (opt-in) │
└──────────────────────┘
Three layers of telemetry, each independently consumable.
| Layer | What it gives you | Where to wire it |
|---|---|---|
| `onProgress` | Per-task lifecycle events: `task_start`, `task_complete`, `task_retry`, `task_skipped`, `agent_start`, `agent_complete`, `budget_exceeded`, `error`. Lightweight, sync. | `OrchestratorConfig.onProgress`; pipe to your logger or a live dashboard. |
| `onTrace` | Structured spans for LLM calls, tool executions, and tasks. Each span carries parent IDs, durations, token counts, and tool I/O. | `OrchestratorConfig.onTrace`; forward to OpenTelemetry, Datadog, Honeycomb, Langfuse, etc. (integrations/trace-observability) |
| Post-run HTML dashboard | Static HTML page rendering the executed task DAG with timing, token usage, and per-task status. No server, no D3, just a string. | `import { renderTeamRunDashboard } from '@jackchen_me/open-multi-agent'`, then `fs.writeFileSync('run.html', renderTeamRunDashboard(result))`. |
Together: live progress for ops, traces for debugging and cost attribution, a shareable post-mortem artifact for any run.
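A minimal wiring sketch that combines all three layers, assuming the agent configs from the quick start above; the plain array stands in for whatever exporter you forward spans to.

```ts
import fs from 'node:fs'
import { OpenMultiAgent, renderTeamRunDashboard } from '@jackchen_me/open-multi-agent'

const spans: unknown[] = [] // stand-in for your OpenTelemetry / Datadog / Langfuse exporter

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
  // Live lifecycle events for ops and logging.
  onProgress: (event) => console.log(event.type, event.task ?? event.agent ?? ''),
  // Structured spans for debugging and cost attribution.
  onTrace: (span) => spans.push(span),
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer], // AgentConfig objects as in the quick start
  sharedMemory: true,
})
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')

// Shareable post-mortem artifact: static HTML rendering of the executed task DAG.
fs.writeFileSync('run.html', renderTeamRunDashboard(result))
```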
| Tool | Description |
|---|---|
| `bash` | Execute shell commands. Returns stdout + stderr. Supports timeout and cwd. |
| `file_read` | Read file contents at an absolute path. Supports offset/limit for large files. |
| `file_write` | Write or create a file. Auto-creates parent directories. |
| `file_edit` | Edit a file by replacing an exact string match. |
| `grep` | Search file contents with regex. Uses ripgrep when available, falls back to Node.js. |
| `glob` | Find files by glob pattern. Returns matching paths sorted by modification time. |
- Pick a preset. `toolPreset: 'readonly' | 'readwrite' | 'full'` covers most agents.
- Narrow further. Combine `tools` (allowlist) and `disallowedTools` (denylist) on top of the preset.
- Bring your own. `defineTool()` + `customTools`, or `agent.addTool()` at runtime.
- Cap output cost. `outputSchema`, `maxToolOutputChars`, and `compressToolResults`.
- MCP. Connect external servers via `connectMCPTools()` from `open-multi-agent/mcp`.
Full details in docs/tool-configuration.md.
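For "bring your own", a hedged sketch of a custom tool: `defineTool()`, Zod, and `customTools` are the documented names, but the option keys inside `defineTool()` (`name`, `description`, `parameters`, `execute`) are assumptions; confirm them against docs/tool-configuration.md.

```ts
import { z } from 'zod'
import { defineTool } from '@jackchen_me/open-multi-agent'
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

// Hypothetical option names; check docs/tool-configuration.md for the confirmed signature.
const httpGet = defineTool({
  name: 'http_get',
  description: 'Fetch a URL and return the response body as text.',
  parameters: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const res = await fetch(url)
    return res.text()
  },
})

const researcher: AgentConfig = {
  name: 'researcher',
  model: 'claude-sonnet-4-6',
  systemPrompt: 'You gather facts from the pages you fetch.',
  toolPreset: 'readonly',
  customTools: [httpGet],
}
```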
Teams can share a namespaced key-value store so later agents see earlier agents' findings. Use sharedMemory: true for the default in-process store, or implement MemoryStore and pass it via sharedMemoryStore for Redis, Postgres, Engram, etc. Keys are namespaced as <agentName>/<key> before they reach the store. SDK-only: the CLI cannot pass runtime objects.
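A sketch of swapping in Redis, under stated assumptions: the `MemoryStore` method names used here (`get`, `set`, `delete`) are guesses at the interface, not the documented one, so check the exported type before copying.

```ts
import { createClient } from 'redis'
import type { MemoryStore } from '@jackchen_me/open-multi-agent'

// Assumed interface shape: async get/set/delete keyed by string.
// Keys arrive already namespaced as <agentName>/<key>.
const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()

const redisMemory: MemoryStore = {
  async get(key: string) {
    const raw = await redis.get(key)
    return raw === null ? undefined : JSON.parse(raw)
  },
  async set(key: string, value: unknown) {
    await redis.set(key, JSON.stringify(value))
  },
  async delete(key: string) {
    await redis.del(key)
  },
}

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemoryStore: redisMemory, // instead of sharedMemory: true
})
```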
Long-running agents hit input token ceilings fast. AgentConfig.contextStrategy controls how the conversation shrinks:
- `sliding-window`: keep the last N turns, drop the rest. Cheapest.
- `summarize`: send old turns to a summary model, keep the summary in place.
- `compact`: rule-based truncation, no extra LLM call.
- `custom`: supply your own `compress(messages, estimatedTokens)`.
See docs/context-management.md.
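A minimal config sketch, assuming `contextStrategy` accepts the strategy name as a plain string; per-strategy options (window size, summary model) are covered in docs/context-management.md.

```ts
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

// Assumes the string form is accepted; a richer object form may exist for per-strategy options.
const longRunner: AgentConfig = {
  name: 'long-runner',
  model: 'claude-sonnet-4-6',
  systemPrompt: 'You work through long multi-file refactors.',
  tools: ['file_read', 'file_edit', 'grep'],
  contextStrategy: 'sliding-window', // cheapest: keep the last N turns, drop the rest
  maxTurns: 40,                      // bound conversation length as well
}
```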
Change provider, model, and set the env var. The agent config shape stays the same:
const agent: AgentConfig = {
name: 'my-agent',
provider: 'anthropic',
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a helpful assistant.',
}

The framework ships a wired-in provider name for each of these. Set `provider` and the env var, and that's it.
Under the hood, Anthropic and Gemini use their own SDKs; the rest are pre-configured shortcuts on top of the OpenAI Chat Completions protocol. It is the same wire format as the second table; the framework just fills in the `baseURL` for you.
| Provider | Config | Env var | Example model | Notes |
|---|---|---|---|---|
| Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | `claude-sonnet-4-6` | Native Anthropic SDK. |
| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | `gemini-2.5-pro` | Native Google GenAI SDK. Requires `npm install @google/genai`. |
| OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | `gpt-4o` | |
| Azure OpenAI | `provider: 'azure-openai'` | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` | `gpt-4` | Optional `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`. |
| GitHub Copilot | `provider: 'copilot'` | `GITHUB_COPILOT_TOKEN` (falls back to `GITHUB_TOKEN`) | `gpt-4o` | Custom token-exchange flow on top of OpenAI protocol. |
| Grok (xAI) | `provider: 'grok'` | `XAI_API_KEY` | `grok-4` | OpenAI-compatible; endpoint is api.x.ai/v1. |
| DeepSeek | `provider: 'deepseek'` | `DEEPSEEK_API_KEY` | `deepseek-chat` | OpenAI-compatible. `deepseek-chat` (V3, coding) or `deepseek-reasoner` (thinking mode). |
| MiniMax (global) | `provider: 'minimax'` | `MINIMAX_API_KEY` | `MiniMax-M2.7` | OpenAI-compatible. |
| MiniMax (China) | `provider: 'minimax'` + `MINIMAX_BASE_URL` | `MINIMAX_API_KEY` | `MiniMax-M2.7` | Set `MINIMAX_BASE_URL=https://api.minimaxi.com/v1`. |
| Qiniu | `provider: 'qiniu'` | `QINIU_API_KEY` | `deepseek-v3` | OpenAI-compatible. Endpoint https://api.qnaigc.com/v1; multiple model families, see Qiniu AI docs. |
No bundled shortcut, but it works the same. Use provider: 'openai' and point baseURL at any server that speaks OpenAI Chat Completions.
| Service | Config | Env var | Example model |
|---|---|---|---|
| Ollama (local) | `provider: 'openai'` + `baseURL: 'http://localhost:11434/v1'` | none | `llama3.1` |
| vLLM (local) | `provider: 'openai'` + `baseURL` | none | (server-loaded) |
| LM Studio (local) | `provider: 'openai'` + `baseURL` | none | (server-loaded) |
| llama.cpp server (local) | `provider: 'openai'` + `baseURL` | none | (server-loaded) |
| OpenRouter | `provider: 'openai'` + `baseURL: 'https://openrouter.ai/api/v1'` + `apiKey` | `OPENROUTER_API_KEY` | `openai/gpt-4o-mini` |
| Groq | `provider: 'openai'` + `baseURL: 'https://api.groq.com/openai/v1'` | `GROQ_API_KEY` | `llama-3.3-70b-versatile` |
Mistral, Qwen, Moonshot, Doubao, Together AI, Fireworks, etc. all plug in the same way. For services where the key is not OPENAI_API_KEY (OpenRouter is one), pass it explicitly via the apiKey config field; otherwise the openai adapter falls back to OPENAI_API_KEY from the environment.
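For example, an OpenRouter agent built from the config fields in the table above (the model id is just an illustration):

```ts
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

// OpenRouter speaks the OpenAI Chat Completions protocol, so provider: 'openai' works.
// The key is not OPENAI_API_KEY, so it is passed explicitly via apiKey.
const viaOpenRouter: AgentConfig = {
  name: 'router-agent',
  provider: 'openai',
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY!,
  model: 'openai/gpt-4o-mini',
  systemPrompt: 'You are a helpful assistant.',
}
```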
The framework supports tool-calling with local models served by Ollama, vLLM, LM Studio, or llama.cpp. Tool-calling is handled natively by these servers via the OpenAI-compatible API.
Verified models: Gemma 4, Llama 3.1, Qwen 3, Mistral, Phi-4. See the full list at ollama.com/search?c=tools.
Fallback extraction: If a local model returns tool calls as text instead of using the tool_calls wire format (common with thinking models or misconfigured servers), the framework automatically extracts them from the text output.
Timeout: Local inference can be slow. Use timeoutMs on AgentConfig to prevent indefinite hangs:
const localAgent: AgentConfig = {
name: 'local',
model: 'llama3.1',
provider: 'openai',
baseURL: 'http://localhost:11434/v1',
apiKey: 'ollama',
tools: ['bash', 'file_read'],
timeoutMs: 120_000, // abort after 2 minutes
}

Quantized model tuning. Highly quantized MoE models on consumer hardware (Qwen2.5-MoE @ Q4, DeepSeek-MoE @ Q4, etc.) tend to fall into repetition loops or hallucinate tool-call schemas under default sampling. AgentConfig exposes `topK`, `minP`, `frequencyPenalty`, `presencePenalty`, `parallelToolCalls` (set it to false to force one tool call per turn on shaky tool-callers), and an `extraBody` escape hatch for server-specific knobs (e.g. vLLM's `repetition_penalty`). Cloud OpenAI users do not need these; the defaults are tuned for full-precision models. See providers/local-quantized for a full setup.
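For instance, a hedged sketch for a quantized model served by Ollama: the field names come from the paragraph above, but the model tag and numeric values are illustrative starting points, not maintainer recommendations.

```ts
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

// Illustrative values only; tune per model. extraBody passes server-specific
// options straight through (e.g. a repetition_penalty knob).
const quantizedAgent: AgentConfig = {
  name: 'local-quantized',
  provider: 'openai',
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama',
  model: 'qwen2.5:14b-instruct-q4_K_M', // hypothetical quantized tag
  tools: ['file_read', 'grep'],
  timeoutMs: 180_000,
  topK: 40,
  minP: 0.05,
  frequencyPenalty: 0.3,
  presencePenalty: 0.3,
  parallelToolCalls: false,             // one tool call per turn on shaky tool-callers
  extraBody: { repetition_penalty: 1.1 },
}
```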
Troubleshooting:
- Model not calling tools? Ensure it appears in Ollama's Tools category. Not all models support tool-calling.
- Using Ollama? Update to the latest version (`ollama update`). Older versions have known tool-calling bugs.
- Proxy interfering? Use `no_proxy=localhost` when running against local servers.
Before going live, wire up the controls that protect token spend, recover from failure, and let you debug.
| Concern | Knob | Where it lives |
|---|---|---|
| Bound the conversation | `maxTurns` per agent + `contextStrategy` (sliding-window / summarize / compact / custom) | `AgentConfig` |
| Cap tool output | `maxToolOutputChars` (or per-tool `maxOutputChars`) + `compressToolResults: true` | `AgentConfig` and `defineTool()` |
| Recover from failure | Per-task `maxRetries`, `retryDelayMs`, `retryBackoff` (exponential multiplier) | Task config used via `runTasks()` |
| Hard-cap spend | `maxTokenBudget` on the orchestrator | `OrchestratorConfig` |
| Catch stuck agents | `loopDetection` with `onLoopDetected: 'terminate'` (or a custom handler) | `AgentConfig` |
| Trace and audit | `onTrace` to your tracing backend; persist `renderTeamRunDashboard(result)` | `OrchestratorConfig` |
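Pulled together as config, a sketch under stated assumptions: the knob names come from the table above, but the `loopDetection` object shape and the numeric values are illustrative.

```ts
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

const traceSink: unknown[] = [] // stand-in for your tracing backend

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
  maxTokenBudget: 500_000,                // hard cap on total token spend for the run
  onTrace: (span) => traceSink.push(span),
})

const worker: AgentConfig = {
  name: 'worker',
  model: 'claude-sonnet-4-6',
  systemPrompt: 'You implement tasks exactly as specified.',
  tools: ['bash', 'file_read', 'file_write'],
  maxTurns: 30,                           // bound the conversation
  contextStrategy: 'sliding-window',      // shrink old turns instead of failing
  maxToolOutputChars: 20_000,             // cap tool output before it hits the context
  compressToolResults: true,
  // Shape assumed; the docs name loopDetection with onLoopDetected: 'terminate'.
  loopDetection: { onLoopDetected: 'terminate' },
}
```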
Issues, feature requests, and PRs are welcome. Some areas where contributions would be especially valuable:
- Production examples. Real-world end-to-end workflows. See `examples/production/README.md` for the acceptance criteria and submission format.
- Documentation. Guides, tutorials, and API docs.
- Translations. Help translate this README into other languages. Open a PR.
Project lead: Jack Chen
Framework features
- @ibrahimkzmv (token budget, context strategy, dependency-scoped context, tool presets, glob, MCP integration, configurable coordinator, CLI, dashboard rendering)
- @apollo-mg (context compaction fix, sampling parameters)
- @tizerluo (onPlanReady, onAgentStream)
- @Xin-Mai (output schema validation)
- @JasonOA888 (AbortSignal support)
- @EchoOfZion (coordinator skip for simple goals)
- @voidborne-d (OpenAI mixed content fix)
- @hamzarstar (agent delegation co-author)
Provider integrations
- @ibrahimkzmv (Gemini)
- @hkalex (DeepSeek, MiniMax)
- @marceloceccon (Grok)
- @Klarline (Azure OpenAI)
- @Deathwing (GitHub Copilot)
- @JackChiang233 and @jiangzhuo (Qiniu)
Examples & cookbook
- @mvanhorn (research aggregation, code review, meeting summarizer, Groq example)
- @Kinoo0 (code review upgrade)
- @Optimisttt (research aggregation upgrade)
- @Agentscreator (Engram memory integration)
- @fault-segment and yanzizheng (contract-review DAG)
- @HuXiangyu123 (cost-tiered example)
- @zouhh22333-beep (translation/backtranslation)
- @pei-pei45 (competitive monitoring)
Docs & tests
- @tmchow (llama.cpp docs)
- @kenrogers (OpenRouter docs)
- @jadegold55 (LLM adapter test coverage)
MIT