A small, RAM-aware AI gateway that carries you across big AI seas. Fits on a 16GB Mac M1, free-tier friendly.
A personal-machine AI coracle that intelligently splits work between free-tier "big" cloud AI (planning) and local Ollama models (reasoning + execution), without ever spiking RAM enough to crash the machine. Built to be consumed as a drop-in OpenAI-compatible "model" by opencode, Claude Code, codex, Cursor, Continue, etc.
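To make "drop-in" concrete, here is a minimal client sketch. It assumes coracle is serving on localhost:8000 as in the Docker example further down; the `openai` package, the placeholder API key, and the prompt are illustrative, and only the endpoint shape and the `coracle` model name come from this README.

```python
# Illustrative client call; assumes coracle is serving on localhost:8000 as in
# the Docker example below. The API key is a placeholder the SDK requires.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="coracle",  # the only model name exposed; routing happens inside
    messages=[{"role": "user", "content": "Plan a refactor of the auth module"}],
)
print(response.choices[0].message.content)
```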
Big AI models are great at planning. Small local models are great at executing. Free API tiers run out. Browser-driven web AIs are flaky. RAM on a 16GB Mac is precious. No existing tool balances all of these constraints gracefully — so this one does:
- Resident reasoning model (`qwen2.5:7b`) classifies every request and routes it to the right pipeline.
- Big AI (Gemini, Groq, Ollama Cloud, headless-browser fallback to Claude.ai/ChatGPT/Gemini-web) handles deep planning when the classifier asks for it.
- Coder model (`qwen2.5-coder:7b`) executes steps locally with a full tool belt (fs, shell, web, browser, git).
- Single-LLM-slot scheduler ensures only one 7B model is in RAM at a time (see the sketch after this list).
- SQLite job state powers instant status responses with zero RAM cost.
- One model name to the consumer: `coracle`. Auto-routing is invisible.
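The single-LLM-slot idea, roughly sketched below under stated assumptions: the `unload_model` and `generate` helpers are hypothetical stand-ins, not coracle's real API.

```python
import asyncio


async def unload_model(name: str) -> None:
    """Hypothetical stand-in for asking Ollama to evict a model from RAM."""


async def generate(name: str, prompt: str) -> str:
    """Hypothetical stand-in for a local Ollama generation call."""
    return f"[{name}] response to: {prompt}"


class SingleSlotScheduler:
    """Sketch only: at most one local 7B model resident at any moment."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()       # one task owns the LLM slot at a time
        self._resident: str | None = None

    async def run(self, model: str, prompt: str) -> str:
        async with self._lock:
            if self._resident not in (None, model):
                await unload_model(self._resident)  # evict the other 7B first
            self._resident = model
            return await generate(model, prompt)


# e.g. asyncio.run(SingleSlotScheduler().run("qwen2.5-coder:7b", "write tests"))
```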
```
opencode / Claude Code / codex
         │  (OpenAI-compatible /v1/chat/completions)
         ▼
┌─────────────────────────────────────────────────────┐
│ Resident reasoning model (qwen2.5:7b) — CLASSIFIER   │
│   → fast | deep | research | status                  │
└─────────────────────────────────────────────────────┘
         │
   ┌─────┼───────────┬───────────────┐
   ▼     ▼           ▼               ▼
 status fast        deep          research
 (DB    (local-    (reason →     (deep + web
  read)  only)      big AI →      tools biased)
                    parse →
                    coder →
                    verify)
```
Full design details: docs/PLAN.md.
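For a feel of the classifier's output contract, a small hedged sketch: the four route labels come from the diagram above, while the fallback behaviour and the helper name are assumptions, not the documented design.

```python
from enum import Enum


class Route(str, Enum):
    """The four routes the classifier can emit (see diagram above)."""
    FAST = "fast"          # local-only answer from the resident model
    DEEP = "deep"          # big-AI plan → parse → local coder → verify
    RESEARCH = "research"  # deep pipeline, biased toward web tools
    STATUS = "status"      # SQLite read only; no LLM needs to load


def parse_route(classifier_output: str) -> Route:
    """Hypothetical helper: map the classifier's label onto a known route,
    falling back to FAST if the label is unrecognised."""
    try:
        return Route(classifier_output.strip().lower())
    except ValueError:
        return Route.FAST
```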
Multi-arch (linux/amd64 + linux/arm64) images are published to GHCR by the
release-image workflow. Two variants:
- `ghcr.io/skgandikota/coracle` — slim runtime, no browser deps.
- `ghcr.io/skgandikota/coracle-browser` — slim + Playwright/Chromium.
```bash
docker run --rm -p 8000:8000 \
  -v "$HOME/.config/coracle:/etc/coracle" \
  -v "$HOME/.local/share/coracle:/var/lib/coracle" \
  ghcr.io/skgandikota/coracle:latest
```

Tags: `:latest` (newest semver), `:vX.Y.Z` / `:vX.Y` / `:vX` (per release), `:edge` (head of `main`). See docs/RELEASES.md for the release process and verification steps.
Per-tool how-to guides for plugging coracle into the coding agents that consume it as either an MCP server or an OpenAI-compatible model:
| Tool | Guide | Status |
|---|---|---|
| Claude Code | docs/integrations/claude-code.md | ✅ documented |
| opencode | coming via #23 | 🚧 placeholder |
| codex | coming via #25 | 🚧 placeholder |
Short version: LiteLLM is a paid-API gateway built for throughput; coracle is a personal-machine scheduler built for $0 budgets and a 16GB RAM ceiling. We use LiteLLM's SDK as our provider abstraction, but the product is a different thing entirely — see docs/VS_LITELLM.md for the full table.
| | LiteLLM | coracle |
|---|---|---|
| Cost model | Pay-per-token | $0 — free tiers + local + headless-browser fallback |
| Topology | Stateless proxy | Stateful job coracle |
| Inference | Cloud-first | Local-first |
| RAM target | Server-class | 16GB Mac M1 |
| Tool execution | Caller's job | Coracle runs the tools (sandbox + MCP) |
| Status / progress | None | First-class, never loads an LLM |
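What "LiteLLM's SDK as our provider abstraction" buys is one call shape across providers. A hedged sketch, where the model identifiers and the fallback order are placeholders rather than coracle's actual configuration:

```python
# Illustrative use of the LiteLLM SDK as a provider abstraction. The model
# identifiers and fallback order are placeholders, not coracle's real config.
import litellm


def big_ai_plan(prompt: str) -> str:
    candidates = ["gemini/gemini-1.5-flash", "groq/llama-3.1-8b-instant"]
    for model in candidates:
        try:
            response = litellm.completion(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception:
            continue  # free tier exhausted or provider down: try the next one
    raise RuntimeError("all big-AI providers failed; browser fallback would go here")
```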
🚧 Pre-alpha — implementation underway.
Skeleton (package layout, settings loader, structured logging) landed in #31.
Issues are organized into 7 phases (Phase 1 → Phase 7) tracked via GitHub Milestones. Each phase has an Epic issue summarizing scope and linking to its sub-tasks.
This project is agent-friendly: every issue contains enough context, acceptance criteria, file paths, and definition-of-done that a coding agent (or human contributor) can pick it up cold, clone the repo, and submit a PR.
- Pick a ready issue (label: `status:ready`) — these have no unresolved dependencies.
- Read the issue's Context, Acceptance Criteria, and Definition of Done.
- Reference `docs/PLAN.md` for the bigger picture.
- Open a PR linking the issue (`Closes #N`).
- Follow `CONTRIBUTING.md`.
- PRs are reviewed by a layered AI bot stack — see `docs/REVIEW_BOTS.md`. Only our strict `code-reviewer-001` bot has merge authority; it waits for the AI bots to weigh in before approving.
| Concern | Choice |
|---|---|
| Language | Python 3.11+ |
| Local models | Ollama (qwen2.5:7b, qwen2.5-coder:7b) |
| Big AI providers | litellm → Gemini, Groq, Ollama Cloud + Playwright headless fallback |
| External interface | OpenAI-compatible HTTP (primary) + MCP stdio + native HTTP + CLI |
| Server | FastAPI + Uvicorn |
| State | SQLite |
| Browser | Playwright (headless, separate subprocess per provider) |
| RAM monitor | psutil |
Mac M1 Pro, 16 GB RAM. Designed to never exceed ~11 GB resident.
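As an illustration of what a psutil-based RAM guard can look like: the ~11 GB target is the only number taken from this README, while the per-model estimate and the safety margin below are assumptions.

```python
import psutil

SAFETY_MARGIN_GB = 2.0  # assumption: leave headroom for the OS, tools, browsers


def can_load_model(estimated_model_gb: float = 5.0) -> bool:
    """Admit another 7B model only if enough RAM is genuinely free.

    The 5 GB per-model estimate is an assumption, not a figure measured by
    this project; the goal is simply to stay under the ~11 GB target above.
    """
    available_gb = psutil.virtual_memory().available / 1024**3
    return available_gb >= estimated_model_gb + SAFETY_MARGIN_GB
```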
The coracle can consume any number of remote/cloud MCP servers as local tools. Copy the example config and edit it:
```bash
cp config/mcp_servers.yaml.example config/mcp_servers.yaml
# edit config/mcp_servers.yaml — supports stdio | http | sse transports

coracle mcp list     # show connected servers + tool counts
coracle mcp reload   # re-read the config without restarting
```

Environment variables in the config (e.g. `${GITHUB_TOKEN}`) are expanded at load time, so secrets stay out of source control.
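The expansion step itself can be as small as the sketch below. This is not necessarily how coracle's loader is written; it just shows why the committed YAML never needs to hold secrets (PyYAML is assumed to be available).

```python
import os
from string import Template

import yaml  # PyYAML, assumed available


def load_mcp_config(path: str = "config/mcp_servers.yaml") -> dict:
    """Read the MCP server config and expand ${VAR} references from the
    environment, so e.g. ${GITHUB_TOKEN} becomes the real token at runtime."""
    with open(path) as fh:
        raw = fh.read()
    expanded = Template(raw).safe_substitute(os.environ)
    return yaml.safe_load(expanded)
```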
Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
You are free to share and adapt the material under these terms:
- Attribution — credit the original author and link to the license.
- NonCommercial — no commercial use.
- ShareAlike — distribute derivative works under the same license.
See LICENSE for the full legal text.