Configurable Agent

A configurable LLM agent service. One YAML file defines the agent's system prompt, model, tools, and safety knobs. It exposes an HTTP endpoint that streams the agent loop back to clients over Server-Sent Events.

Built to be dropped into Kubernetes — the YAML lives in a ConfigMap, provider API keys live in a Secret.

Stack

Node 22 · TypeScript (ESM) · Hono · Vercel AI SDK · Zod · pino · OpenTelemetry · Biome · Vitest. Package manager: pnpm.

Providers: Anthropic, OpenAI, Google, and any OpenAI-compatible endpoint (including local ollama) via @ai-sdk/openai-compatible.

Quick start (local)

pnpm install
export ANTHROPIC_API_KEY=...         # or OPENAI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY
CONFIG_PATH=./example.config.yaml pnpm dev

Then:

curl -N -X POST http://localhost:3000/invoke \
  -H 'content-type: application/json' \
  -d '{"messages":[{"role":"user","content":"search the web for eggai and summarize"}]}'

Configuration

The configuration is loaded once at startup from CONFIG_PATH (default /etc/configurable-agent/config.yaml); changes to the file take effect only after a restart. The process exits non-zero if the file is invalid.

systemPrompt: |
  You are a helpful assistant...

model:
  provider: anthropic           # anthropic | openai | google | ollama
  name: claude-sonnet-4-6
  # baseUrl: http://host.docker.internal:11434/v1   # required for ollama
  temperature: 0.2
  # topP, maxOutputTokens also supported

agent:
  maxSteps: 10                  # hard cap on the tool-use loop

mcpTools:                       # external MCP servers (none bundled — see Built-in tools below)
  - name: accounts
    transport: stdio
    command: accounts-mcp
    args: []
    env:
      ACCOUNTS_URL: http://accounts:8080
  # - name: files
  #   transport: http
  #   url: https://files.internal/mcp
  #   headers:
  #     X-Tenant: acme

safety:
  compaction:                   # before each LLM call
    triggerTokens: 100000
    keepRecentMessages: 6
  toolOutput:                   # after each tool call
    triggerTokens: 4000
    headChars: 500
    tailChars: 500

output:
  structured: false
  # When true, the final SSE event includes a `structured` field validated
  # against the JSON Schema below:
  # structured: true
  # schema:
  #   type: object
  #   properties:
  #     answer: { type: string }
  #     confidence: { type: number }
  #   required: [answer]
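The startup check can be pictured with a small validator. This is a dependency-free sketch covering only a few of the fields above; the service itself presumably validates with Zod (listed in the stack) and checks more. The names `validateConfig` and `isRecord` are hypothetical, and the input is assumed to be the already-parsed YAML.

```typescript
// Hypothetical sketch: validates a subset of the config shown above.
// Not the service's actual schema; field rules here are assumptions
// derived from the comments in the example config.

const PROVIDERS: readonly string[] = ["anthropic", "openai", "google", "ollama"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns human-readable problems; an empty array means the subset is valid.
function validateConfig(raw: unknown): string[] {
  if (!isRecord(raw)) return ["config must be a YAML mapping"];
  const problems: string[] = [];

  const systemPrompt = raw["systemPrompt"];
  if (typeof systemPrompt !== "string" || systemPrompt.trim() === "") {
    problems.push("systemPrompt must be a non-empty string");
  }

  const model = raw["model"];
  if (!isRecord(model)) {
    problems.push("model must be a mapping");
  } else {
    const provider = model["provider"];
    const name = model["name"];
    if (typeof provider !== "string" || !PROVIDERS.includes(provider)) {
      problems.push("model.provider must be one of " + PROVIDERS.join(" | "));
    }
    if (typeof name !== "string" || name === "") {
      problems.push("model.name must be a non-empty string");
    }
    // The example config marks baseUrl as required for ollama.
    if (provider === "ollama" && typeof model["baseUrl"] !== "string") {
      problems.push("model.baseUrl is required when provider is ollama");
    }
  }

  const agent = raw["agent"];
  const maxSteps = isRecord(agent) ? agent["maxSteps"] : undefined;
  if (typeof maxSteps !== "number" || !Number.isInteger(maxSteps) || maxSteps < 1) {
    problems.push("agent.maxSteps must be a positive integer");
  }

  return problems;
}
```

A caller would log the returned problems and exit with a non-zero status when the array is non-empty, which matches the fail-fast behavior described above.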

Built-in tools

The agent always has access to one built-in tool regardless of mcpTools configuration:

Tool	Purpose
`todowrite`	Maintains an in-memory todo list for the duration of a single run. Each call replaces the entire list. Use it to break complex requests into steps and track progress (`pending` → `in_progress` → `completed`). The store is reset between requests.

All other tools are provided externally via MCP servers configured under mcpTools.
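The replace-the-whole-list behavior of `todowrite` can be sketched as follows. The class name `TodoStore`, the `content` field, and `createRunStore` are hypothetical; only the three status values and the per-run, replace-on-write semantics come from the description above.

```typescript
// Hypothetical sketch of the todowrite semantics, not the actual implementation.
type TodoStatus = "pending" | "in_progress" | "completed";

interface TodoItem {
  content: string; // assumed field name
  status: TodoStatus;
}

class TodoStore {
  private items: TodoItem[] = [];

  // Each call replaces the entire list rather than patching single entries.
  write(next: readonly TodoItem[]): TodoItem[] {
    this.items = next.map((item) => ({ ...item }));
    return this.read();
  }

  // Returns copies so callers cannot mutate the stored list.
  read(): TodoItem[] {
    return this.items.map((item) => ({ ...item }));
  }
}

// A fresh store per request means nothing carries over between runs.
function createRunStore(): TodoStore {
  return new TodoStore();
}
```

Because every write carries the full list, the model updates progress by resending all items with their new statuses.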

HTTP API

Route	Method	Purpose
`/health`	GET	Liveness: always 200 once the process is up.
`/ready`	GET	Readiness: 200 when the config is loaded and the required provider keys are present. The MCP tool registry is validated at startup, so if tool discovery fails or a tool-name conflict is found, the process exits non-zero before this endpoint is ever reachable.
`/invoke`	POST	Run the agent and stream events via SSE.
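The readiness rule in the table above could look roughly like this. The environment variable names are the ones used in the quick start; everything else is an assumption: the function name `readinessStatus`, the 503 status for "not ready" (the table only specifies the 200 case), and the idea that ollama needs no key.

```typescript
// Hypothetical sketch of the /ready decision, not the service's actual code.
type Provider = "anthropic" | "openai" | "google" | "ollama";

const REQUIRED_KEY: Record<Provider, string | null> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
  ollama: null, // assumption: a local ollama endpoint needs no API key
};

function readinessStatus(
  provider: Provider,
  configLoaded: boolean,
  env: Record<string, string | undefined>,
): 200 | 503 {
  if (!configLoaded) return 503; // 503 is an assumed "not ready" status
  const key = REQUIRED_KEY[provider];
  if (key === null) return 200;
  const value = env[key];
  return value !== undefined && value !== "" ? 200 : 503;
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the check easy to unit-test.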

Request

{ "messages": [{ "role": "user", "content": "..." }] }

Roles: system | user | assistant. Caller-provided system messages are stripped and replaced with the configured systemPrompt.
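The system-message rule can be illustrated with a short helper. This is a sketch under assumptions: the name `applySystemPrompt` is hypothetical, and the service may pass the configured prompt to the model as a separate parameter instead of prepending it as a message.

```typescript
// Hypothetical sketch of how caller-supplied system messages are discarded.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

function applySystemPrompt(
  systemPrompt: string,
  messages: readonly ChatMessage[],
): ChatMessage[] {
  // Drop every caller-provided system message, wherever it appears.
  const withoutSystem = messages.filter((m) => m.role !== "system");
  // Assumption: the configured prompt is placed first in the conversation.
  return [{ role: "system", content: systemPrompt }, ...withoutSystem];
}
```

The practical consequence is that a client cannot override the agent's configured behavior by sending its own system message.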

SSE event taxonomy

event: reasoning           data: { text }
event: content_delta       data: { text }
event: tool_call           data: { id, name, args }
event: tool_result         data: { id, output: ToolResult }
event: compaction_start    data: { before: { tokens, messages } }
event: compaction_finished data: { before, after, droppedCount }
event: final               data: { content, structured?, stopReason, steps, truncated }
event: error               data: { code, message, details? }

The tool_result.output payload is a ToolResult envelope:

{
  label: string,
  status: 'succeeded' | 'error' | 'denied' | 'approval_required',
  content: string,         // post-summarization for oversized results
  return_code: number | null,
  args: unknown,
  duration_ms: number,
  truncated?: boolean,     // true when content is the summary + head/tail
}

Truncation is signalled in-band via output.truncated: true; there is no separate tool_result_truncated event.

Each step is a single LLM call. Parallel tool calls within one step are supported and emit concurrent tool_call / tool_result pairs. Closing the HTTP connection aborts the loop server-side.
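A client can consume the stream described in this section with a small incremental parser. The sketch below assumes each event's `data` field is JSON, as the taxonomy suggests; the names `AgentEvent` and `parseSseChunk` are hypothetical, and SSE comment lines and `id:`/`retry:` fields are ignored for brevity.

```typescript
// Hypothetical client-side sketch: parses complete SSE frames out of a text
// buffer and returns any trailing partial frame for the next call.
interface AgentEvent {
  event: string;
  data: unknown;
}

function parseSseChunk(buffer: string): { events: AgentEvent[]; rest: string } {
  // Frames are separated by a blank line; normalize CRLF first.
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  // The last piece is either empty or an incomplete frame.
  const rest = frames.pop() ?? "";
  const events: AgentEvent[] = [];

  for (const frame of frames) {
    let event = "message"; // SSE default when no event field is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        dataLines.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) continue;
    // Assumption: data payloads are JSON; JSON.parse throws otherwise.
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }

  return { events, rest };
}
```

In use, a client would decode each chunk read from the response body to text, prepend the previous `rest`, call `parseSseChunk`, and stop reading once it sees a `final` or `error` event. Aborting the request early (for example with an `AbortController`) closes the connection, which stops the loop server-side.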

Last-step guarantee

On the final step (the one that reaches maxSteps), the agent sends the model toolChoice: 'none', forcing a natural-language answer. If the model emits a tool call anyway, an error event with code: "tool_call_on_final_step" is emitted and the stream closes.
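The guarantee reduces to two small decisions, sketched here. The helper names, the 1-indexed step counter, the 'auto' choice on earlier steps, and the error message text are assumptions; only `toolChoice: 'none'` and the `tool_call_on_final_step` code come from the description above.

```typescript
// Hypothetical sketch of the last-step logic, not the agent's actual code.
type ToolChoice = "auto" | "none";

interface AgentErrorEvent {
  code: string;
  message: string;
}

// Steps are assumed to be numbered from 1 up to maxSteps.
function toolChoiceForStep(step: number, maxSteps: number): ToolChoice {
  return step >= maxSteps ? "none" : "auto";
}

// Returns an error event if the model produced tool calls on the final step.
function checkFinalStep(
  step: number,
  maxSteps: number,
  toolCallCount: number,
): AgentErrorEvent | null {
  if (step >= maxSteps && toolCallCount > 0) {
    return {
      code: "tool_call_on_final_step",
      message: "model requested a tool on the final step", // illustrative text
    };
  }
  return null;
}
```

With `maxSteps: 10`, steps 1 through 9 may call tools, and step 10 must either produce text or end the stream with the error event.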

Safety features

Feature	Trigger	Action	Event(s)
Conversation compaction	`countMessagesTokens(messages) > safety.compaction.triggerTokens`	LLM-summarize earlier turns; keep `keepRecentMessages` verbatim	`compaction_start`, `compaction_finished`
Tool output summarization	A tool returns output whose token count exceeds `safety.toolOutput.triggerTokens`	Replace `output.content` with an LLM summary plus head/tail excerpts and set `output.truncated: true`; the summarized form, not the raw output, is what the next reasoning step sees	`tool_result` (with `output.truncated: true`)
Startup-time MCP validation	Service start	Connect to every configured MCP server, list tools, and reject duplicate tool names. Initialization failure is fatal — the process exits non-zero before accepting traffic	—
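The duplicate-name check in the last row above can be sketched as a pure function over the discovered tools. The types and the name `findDuplicateToolNames` are hypothetical, and the sketch leaves out the actual MCP connection and tool listing.

```typescript
// Hypothetical sketch: detects tool names exposed by more than one source.
interface DiscoveredTool {
  server: string; // the `name` of the mcpTools entry, or "builtin"
  name: string;
}

function findDuplicateToolNames(
  tools: readonly DiscoveredTool[],
): Map<string, string[]> {
  // Group the owning servers by tool name.
  const owners = new Map<string, string[]>();
  for (const tool of tools) {
    const servers = owners.get(tool.name) ?? [];
    servers.push(tool.server);
    owners.set(tool.name, servers);
  }

  // Keep only names claimed more than once.
  const duplicates = new Map<string, string[]>();
  owners.forEach((servers, name) => {
    if (servers.length > 1) duplicates.set(name, servers);
  });
  return duplicates;
}
```

At startup the service would treat a non-empty result as fatal, logging the conflicting names and exiting non-zero before binding the HTTP port.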

Token counts use gpt-tokenizer (o200k_base). This is an approximation for Anthropic/Google — it generally over-counts, which is safe for threshold checks.
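The two size-based safeguards can be sketched with plain helpers. Token counting and the LLM summary call are left out; the function names and the wording of the omission marker are assumptions, while `triggerTokens`, `keepRecentMessages`, `headChars`, and `tailChars` are the config keys described above.

```typescript
// Hypothetical sketch of the compaction split and head/tail excerpt logic.
interface HistoryMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Compaction runs only when the measured count exceeds the threshold.
function shouldCompact(tokenCount: number, triggerTokens: number): boolean {
  return tokenCount > triggerTokens;
}

// Splits history into the older part to summarize and the recent part kept verbatim.
function splitForCompaction(
  messages: readonly HistoryMessage[],
  keepRecentMessages: number,
): { toSummarize: HistoryMessage[]; kept: HistoryMessage[] } {
  const cut = Math.max(0, messages.length - keepRecentMessages);
  return { toSummarize: messages.slice(0, cut), kept: messages.slice(cut) };
}

// Builds the head/tail excerpt that accompanies the summary of an oversized tool output.
function headTailExcerpt(content: string, headChars: number, tailChars: number): string {
  if (content.length <= headChars + tailChars) return content;
  const omitted = content.length - headChars - tailChars;
  const head = content.slice(0, headChars);
  const tail = content.slice(content.length - tailChars);
  // The marker format below is illustrative only.
  return `${head}\n[... ${omitted} chars omitted ...]\n${tail}`;
}
```

With the example config, a conversation is compacted once it exceeds 100000 tokens, leaving the six most recent messages untouched, and a tool output above 4000 tokens is reduced to a summary plus its first and last 500 characters.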

Development

pnpm dev             # tsx watch
pnpm test            # vitest
pnpm typecheck       # tsc --noEmit
pnpm lint            # biome check
pnpm lint:fix        # biome check --write
pnpm build           # tsc -> dist/
pnpm start           # node dist/index.js

Docker

docker build -t eggai-configurable-agent:latest .
docker run --rm \
  -e ANTHROPIC_API_KEY=... \
  -e TAVILY_API_KEY=... \
  -v "$PWD/example.config.yaml:/etc/configurable-agent/config.yaml:ro" \
  -p 3000:3000 \
  eggai-configurable-agent:latest

Kubernetes

Manifests in k8s/:

configmap.yaml — the agent's YAML, mounted at /etc/configurable-agent/config.yaml
secret.example.yaml — template for provider keys consumed via envFrom
deployment.yaml — hardened pod spec (non-root, read-only rootfs, dropped caps)
service.yaml — ClusterIP on port 80

kubectl create namespace configurable-agent
kubectl -n configurable-agent create secret generic configurable-agent-provider-keys \
  --from-literal=ANTHROPIC_API_KEY=... \
  --from-literal=TAVILY_API_KEY=...
kubectl -n configurable-agent apply -f k8s/

Real deployments should not commit keys — populate configurable-agent-provider-keys via Vault Secrets Operator, External Secrets Operator, Vault Agent Injector, or another secret-sync mechanism. The pod stays Vault-agnostic and only reads env vars.

Local ollama from a kind pod

k8s/deployment.yaml includes a hostAliases entry mapping host.docker.internal → 172.23.0.1 (the kind network gateway on Linux), so a pod can reach an ollama running on the developer's laptop. Point the config at it:

model:
  provider: ollama
  name: gemma4:31b
  baseUrl: http://host.docker.internal:11434/v1

Ensure ollama is listening on 0.0.0.0:11434 (e.g. via OLLAMA_HOST=0.0.0.0).

Observability

Logs: pino to stdout. LOG_LEVEL env var controls verbosity.
Traces: OpenTelemetry SDK auto-starts when OTEL_EXPORTER_OTLP_ENDPOINT (or OTEL_ENABLED) is set. HTTP and fetch are auto-instrumented.

Repository rules

See CLAUDE.md.
