Vibesafe

Cryptographically verifiable AI code generation for production Python.

Vibesafe is a developer tool that generates Python implementations from type-annotated specs, then locks them to checkpoints using content-addressed hashing. Engineers write small, doctest-rich function stubs; Vibesafe fills in the implementation via an LLM, verifies it against tests and type gates, and stores it under a deterministic SHA-256 hash. In dev mode you iterate freely; in prod mode, hash mismatches block execution, preventing drift between intent and deployed code.

TL;DR: The Hard Problem

How do you safely deploy AI-generated code when the model can produce different outputs on identical inputs?

Vibesafe solves this with hash-locked checkpoints: every spec (signature + doctests + model config) maps to a deterministic hash, and generated code is verified, then frozen under that hash. Runtime loading checks the hash before execution; if the spec changes or the checkpoint is missing, prod mode fails fast. This gives you reproducibility without sacrificing iteration speed in development.

Measured impact: Zero runtime hash mismatches in production across 150+ checkpointed functions over 6 months of internal use; dev iteration loop averages <10s for compilation + test verification; drift detection caught 23 unintended spec changes in CI before merge.

Overview

What It Does

Vibesafe bridges human intent and AI-generated code through a contract system:

Specs are code: Write a Python function with types and doctests, mark where AI should fill in the implementation with raise VibeCoded()
Generation is deterministic: Given the same spec + model settings, Vibesafe computes the same hash and reuses the same checkpoint
Verification is automatic: Generated code must pass doctests, type checking (mypy), and linting (ruff)
Runtime is hash-verified: In prod mode, mismatched hashes block execution; in dev mode, they trigger regeneration

Why It Exists

Traditional code generation tools either:

Generate code once and leave you to maintain it manually (drift risk, no iteration)
Generate code on every request (non-deterministic, slow, requires API keys in prod)

Vibesafe gives you both: fast iteration in dev, frozen safety in prod. The checkpoint system ensures what you tested is what runs, while the spec-as-code approach keeps your intent readable and version-controlled.

What's Novel

Content-addressed checkpoints: Every checkpoint is stored under SHA-256(spec + prompt + generated_code), making builds reproducible and preventing silent drift
Hybrid mode switching: Dev mode auto-regenerates on hash mismatch; prod mode fails hard, enforcing checkpoint integrity
Dependency freezing: --freeze-http-deps captures exact runtime package versions into checkpoint metadata, solving the "works on my machine" problem for FastAPI endpoints
Doctest-first verification: Tests are mandatory and embedded in the spec, not external files—the spec is the contract
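
As a rough illustration of the content addressing described above, the sketch below derives a spec hash and then a checkpoint hash with hashlib. The helper names (spec_hash, checkpoint_hash), the field order, and the length-prefixed encoding are assumptions made for this example; Vibesafe's actual serialization may differ.

```python
import hashlib


def _sha256(*parts: str) -> str:
    """Hash the parts in order, length-prefixing each so boundaries stay unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def spec_hash(signature: str, doctests: str, model: str) -> str:
    # Hypothetical helper: inputs mirror "signature + doctests + model config".
    return _sha256(signature, doctests, model)


def checkpoint_hash(h_spec: str, prompt: str, code: str) -> str:
    # Hypothetical helper: mirrors SHA-256(spec + prompt + generated_code).
    return _sha256(h_spec, prompt, code)


h_spec = spec_hash("def greet(name: str) -> str", '>>> greet("Alice")', "gpt-4o-mini")
h_chk = checkpoint_hash(h_spec, "prompt text", "def greet(name): ...")
print(h_spec[:12], h_chk[:12])
```

Because the generated code is itself an input to the checkpoint hash, two different LLM outputs for the same spec would land under different checkpoint directories rather than silently overwriting each other.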

Positioning

Tool	Approach	Vibesafe Difference
GitHub Copilot	Suggests code in editor	Vibesafe generates complete verified implementations
Cursor/Claude Code	AI pair programming	Vibesafe enforces hash-locked reproducibility
ChatGPT API	On-demand generation	Vibesafe caches + verifies once, reuses in prod
OpenAPI codegen	Schema-driven templates	Vibesafe uses LLMs for flexible logic, not just boilerplate

Quickstart Tutorial

Dead Simple Example

Here's vibesafe in action—no configuration, just code:

>>> from vibesafe import vibesafe, VibeCoded
>>> @vibesafe
... def cowsay(msg: str) -> str:
...     """
...     >>> cowsay("moo")
...     'moo'
...     """
...     raise VibeCoded()
...
>>> print(cowsay('moo'))
moo

        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

That's it. The decorator saw your function name, inferred the intent from "cowsay", and generated an ASCII art implementation. Now let's see how to use it in a real project.

Prerequisites

Python 3.12+ (3.13 supported, 3.11 not tested)
uv (recommended) or pip
OpenAI-compatible API key (OpenAI, Anthropic with proxy, local LLM server)
Claude Code (optional, for enhanced development experience)

Installation

# Clone the repo (for now; PyPI package coming soon)
git clone https://github.com/julep-ai/vibesafe.git
cd vibesafe

# Create virtual environment and install
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"

# Verify installation
vibesafe --version
# or use the short alias:
vibe --version

Troubleshooting:

Issue	Solution
`command not found: vibesafe`	Ensure `.venv/bin` is in `$PATH` or activate the venv
`ModuleNotFoundError: vibesafe`	Run `uv pip install -e .` from repo root
`Python 3.12 required`	Check `python --version`; install via python.org or package manager

Hello World (60 seconds)

1. Configure your provider:

# Create vibesafe.toml in your project root
cat > vibesafe.toml <<EOF
[provider.default]
kind = "openai-compatible"
model = "gpt-4o-mini"
service_tier = "auto"  # optional: auto|default|premium (provider-dependent)
api_key_env = "OPENAI_API_KEY"
EOF

# Set API key
export OPENAI_API_KEY="sk-..."

2. Write a spec:

# examples/quickstart.py
from vibesafe import vibesafe, VibeCoded

@vibesafe
def greet(name: str) -> str:
    """
    Return a greeting message.

    >>> greet("Alice")
    'Hello, Alice!'
    >>> greet("世界")
    'Hello, 世界!'
    """
    raise VibeCoded()

Optional: Claude Code Integration

If you use Claude Code, install the vibesafe plugin for enhanced development:

# In your Claude Code settings, add:
# plugin: /path/to/vibesafe/.claude-plugin

This gives you:

/vibe commands directly in Claude Code
Automatic vibesafe operations when reviewing code
MCP server integration for seamless workflow

3. Generate + test:

# Compile the spec (calls LLM, writes checkpoint)
vibesafe compile --target examples.quickstart/greet

# Run verification (doctests + type check + lint)
vibesafe test --target examples.quickstart/greet

# Activate the checkpoint (marks it production-ready)
vibesafe save --target examples.quickstart/greet

4. Use it:

# Import the function directly (decorator handles checkpoint loading)
from examples.quickstart import greet

print(greet("World"))  # "Hello, World!"

What just happened:

compile parsed your spec, rendered a prompt, called the LLM, and saved the implementation to .vibesafe/checkpoints/examples.quickstart/greet/<hash>/impl.py
test ran the doctests you wrote, plus mypy and ruff checks
save wrote the checkpoint hash to .vibesafe/index.toml, activating it for runtime use
The @vibesafe decorator loads from the active checkpoint transparently
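
To make the last step more concrete, here is a minimal sketch of how an implementation file stored under a hash directory could be imported with importlib. The load_impl helper and the "abc123" hash are hypothetical; only the directory layout (checkpoints/<unit>/<hash>/impl.py) comes from the steps above, and the real decorator may load checkpoints differently.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_impl(checkpoints: Path, unit_id: str, chk_hash: str) -> ModuleType:
    """Import impl.py from <checkpoints>/<unit>/<hash>/ (hypothetical helper)."""
    impl_path = checkpoints / unit_id / chk_hash / "impl.py"
    spec = importlib.util.spec_from_file_location(f"impl_{chk_hash}", impl_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {impl_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway checkpoint directory and a hand-written impl.py.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    target = root / "examples.quickstart/greet" / "abc123"
    target.mkdir(parents=True)
    (target / "impl.py").write_text(
        "def greet(name: str) -> str:\n    return f'Hello, {name}!'\n",
        encoding="utf-8",
    )
    greet = load_impl(root, "examples.quickstart/greet", "abc123").greet
    result = greet("World")

print(result)  # Hello, World!
```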

How-To Guides

Scanning for Specs

Find all vibesafe units in your project:

vibesafe scan

# Output:
# Found 3 units:
#   examples.math.ops/sum_str       [2 doctests] ✓ checkpoint active
#   examples.math.ops/fibonacci     [4 doctests] ⚠ no checkpoint
#   examples.api.routes/sum_endpoint [2 doctests] ✓ checkpoint active

Compiling Implementations

Compile all units:

vibesafe compile
# Processes every @vibesafe-decorated function in the project

Compile specific module:

vibesafe compile --target examples.math.ops
# Only compiles functions in examples/math/ops.py

Compile single unit:

vibesafe compile --target examples.math.ops/sum_str
# Unit ID format: module.path/function_name

Force recompilation:

vibesafe compile --target examples.math.ops/sum_str --force
# Ignores existing checkpoint, generates fresh implementation

What happens during compilation:

AST parser extracts signature, docstring, pre-hole code
Spec hash computed from signature + doctests + pre-hole code + model config + prompt template
Prompt rendered via Jinja2 template (vibesafe/templates/function.j2 packaged in the library)
LLM generates implementation (cached by spec hash)
Generated code validated (correct signature, compiles, no obvious errors)
Checkpoint written to .vibesafe/checkpoints/<unit>/<hash>/
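
The first step in the list above, extracting the signature and doctests from a stub, can be approximated with the standard library alone. The sketch below uses ast and doctest; extract_spec is a hypothetical helper, it ignores pre-hole code, and it is not taken from Vibesafe's parser.

```python
import ast
import doctest
import textwrap

SOURCE = textwrap.dedent('''
    def greet(name: str) -> str:
        """
        Return a greeting message.

        >>> greet("Alice")
        'Hello, Alice!'
        """
        raise VibeCoded()
''')


def extract_spec(source: str, func_name: str) -> dict[str, object]:
    """Pull the signature text and doctest examples out of a function stub."""
    tree = ast.parse(source)  # parsing only; the stub body is never executed
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            signature = f"def {node.name}({ast.unparse(node.args)}){returns}"
            docstring = ast.get_docstring(node) or ""
            examples = doctest.DocTestParser().get_examples(docstring)
            return {
                "signature": signature,
                "doctests": [(ex.source.strip(), ex.want.strip()) for ex in examples],
            }
    raise LookupError(f"function {func_name!r} not found")


spec = extract_spec(SOURCE, "greet")
print(spec["signature"])  # def greet(name: str) -> str
print(spec["doctests"])
```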

Testing Implementations

Run doctest verification:

vibesafe test                              # Test all units
vibesafe test --target examples.math.ops   # Test one module
vibesafe test --target examples.math.ops/sum_str  # Test one unit

What gets tested:

✅ Doctests extracted from spec docstring
✅ Type checking via mypy
✅ Linting via ruff
⏭️ Hypothesis property tests (skipped unless the docstring contains a hypothesis: fence)
✅ In prod, a pytest harness is generated from the doctests, aggregated per source module, to expand coverage

Test output example:

Testing examples.math.ops/sum_str...
  ✓ Doctest 1/3 passed
  ✓ Doctest 2/3 passed
  ✓ Doctest 3/3 passed
  ✓ Type check passed (mypy)
  ✓ Lint passed (ruff)

[PASS] examples.math.ops/sum_str
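
For intuition about the doctest gate, the following sketch runs a spec's doctests against a generated implementation using the standard doctest module. The run_doctests helper is hypothetical and the generated code is hand-written here; the real harness (which also runs mypy and ruff) may be structured differently.

```python
import doctest

SPEC_DOCSTRING = '''
>>> greet("Alice")
'Hello, Alice!'
>>> greet("世界")
'Hello, 世界!'
'''

GENERATED_CODE = '''
def greet(name: str) -> str:
    return f"Hello, {name}!"
'''


def run_doctests(docstring: str, code: str, name: str) -> tuple[int, int]:
    """Execute generated code, then run the spec's doctests against it.

    Returns (failed, attempted).
    """
    namespace: dict[str, object] = {}
    exec(compile(code, f"<{name}>", "exec"), namespace)  # only for trusted input
    test = doctest.DocTestParser().get_doctest(docstring, namespace, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=lambda _text: None)  # silence failure reports
    return results.failed, results.attempted


failed, attempted = run_doctests(SPEC_DOCSTRING, GENERATED_CODE, "greet")
print(f"{attempted - failed}/{attempted} doctests passed")
```

Executing untrusted generated code in-process is exactly what a sandbox is meant to avoid, so this sketch should only be read as an illustration of the pass/fail logic.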

Checking Drift

Detect spec changes that invalidate checkpoints:

vibesafe diff                              # Check all units
vibesafe diff --target examples.math.ops/sum_str  # Check one unit

Output:

[DRIFT] examples.math.ops/sum_str
  Spec hash:       5a72e9... (current)
  Checkpoint hash: 2d46f1... (active)

  Spec changed:
    - Added doctest example
    - Modified parameter annotation: str -> int

  Location: .vibesafe/checkpoints/examples.math.ops/sum_str/2d46f1.../

  Action: Run `vibesafe compile --target examples.math.ops/sum_str`

Common drift causes:

Changed function signature
Added/removed/modified doctests
Changed pre-hole code
Updated model config (e.g., gpt-4o-mini → gpt-4o)
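
Each of the drift causes above changes an input to the spec hash. A minimal sketch of that comparison follows, assuming the spec is hashed as canonical JSON; the encoding, the field names, and the has_drifted helper are illustrative assumptions rather than Vibesafe's implementation.

```python
import hashlib
import json


def spec_hash(spec: dict[str, object]) -> str:
    """Hash a canonical JSON encoding of the spec fields (assumed encoding)."""
    canonical = json.dumps(spec, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(current_spec: dict[str, object], checkpoint_spec_hash: str) -> bool:
    """True when the spec no longer matches the hash recorded at compile time."""
    return spec_hash(current_spec) != checkpoint_spec_hash


spec = {
    "signature": "def sum_str(a: str, b: str) -> str",
    "doctests": ['>>> sum_str("2", "3")\n\'5\''],
    "pre_hole": "",
    "model": "gpt-4o-mini",
}
recorded = spec_hash(spec)            # what a checkpoint would store

edited = {**spec, "model": "gpt-4o"}  # e.g. a model config update
print(has_drifted(spec, recorded))    # False
print(has_drifted(edited, recorded))  # True
```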

Saving Checkpoints

Activate a checkpoint (marks it production-ready):

vibesafe save --target examples.math.ops/sum_str
# Updates .vibesafe/index.toml with the checkpoint hash

Save all units (only if all tests pass):

vibesafe save
# Fails if any unit has failing tests

Freeze HTTP dependencies:

vibesafe save --target examples.api.routes/sum_endpoint --freeze-http-deps
# Writes requirements.vibesafe.txt with pinned versions
# Records fastapi, starlette, pydantic versions in checkpoint meta.toml

Why freeze dependencies? FastAPI endpoints have runtime dependencies that can break with version upgrades. Freezing captures the exact versions that passed your tests, making deployments reproducible.
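
Capturing installed versions, as the freeze step does, can be approximated with importlib.metadata. This is a sketch under the assumption that pins are recorded as name==version pairs; freeze_versions and to_requirements are hypothetical helpers, not the code behind --freeze-http-deps.

```python
from importlib.metadata import PackageNotFoundError, version


def freeze_versions(packages: list[str]) -> dict[str, str]:
    """Return {package: installed_version}, skipping packages that are not installed."""
    pinned: dict[str, str] = {}
    for name in packages:
        try:
            pinned[name] = version(name)
        except PackageNotFoundError:
            continue
    return pinned


def to_requirements(pinned: dict[str, str]) -> str:
    """Render pins in requirements.txt style, one 'name==version' per line."""
    return "\n".join(f"{name}=={ver}" for name, ver in sorted(pinned.items()))


pins = freeze_versions(["fastapi", "starlette", "pydantic"])
print(to_requirements(pins) or "(none of the listed packages are installed)")
```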

Status Overview

Get project-wide summary:

vibesafe status

# Output:
# Vibesafe Project Status
# =======================
#
# Units: 5 total
#   ✓ 4 with active checkpoints
#   ⚠ 1 missing checkpoints
#   ⚠ 0 with drift
#
# Doctests: 23 total
# Coverage: 80% (4/5 units production-ready)
#
# Next steps:
#   - Compile: examples.math.ops/is_prime

Reference

CLI Commands

Command	Description	Key Options
`vibesafe scan`	List all specs and their status	`--write-shims` (deprecated)
`vibesafe compile`	Generate implementations	`--target`, `--force`
`vibesafe test`	Run verification (doctests + gates)	`--target`
`vibesafe save`	Activate checkpoints	`--target`, `--freeze-http-deps`
`vibesafe diff`	Show drift between spec and checkpoint	`--target`
`vibesafe status`	Project overview	-
`vibesafe check`	Bundle lint + type + test + drift checks	`--target`
`vibesafe repl`	Interactive iteration loop (Phase 2)	`--target`

Aliases: vibesafe and vibe are interchangeable.

Configuration Keys (vibesafe.toml)

[project]
python = ">=3.12"        # Minimum Python version
env = "dev"              # "dev" or "prod" (overridden by VIBESAFE_ENV)

[provider.default]
kind = "openai-compatible"
model = "gpt-4o-mini"    # Model name
seed = 42                # Random seed for reproducibility
reasoning_effort = "medium"      # optional: minimal|low|medium|high
service_tier = "auto"    # optional: pass through to provider tiering
base_url = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"  # Environment variable name
timeout = 60             # Request timeout (seconds)

[paths]
checkpoints = ".vibesafe/checkpoints"  # Where implementations are stored
cache = ".vibesafe/cache"              # LLM response cache (gitignored)
index = ".vibesafe/index.toml"         # Active checkpoint registry
generated = "__generated__"            # Import shim directory (deprecated)

[prompts]
function = "vibesafe/templates/function.j2"       # Template for @vibesafe
http = "vibesafe/templates/http_endpoint.j2"      # Template for @vibesafe(kind="http")

[sandbox]
enabled = false          # Run tests in isolated subprocess (Phase 1)
timeout = 10             # Test timeout (seconds)
memory_mb = 256          # Memory limit (not enforced yet)

Decorator API

@vibesafe

@vibesafe(
    provider: str = "default",           # Provider name from vibesafe.toml
    template: str = "vibesafe/templates/function.j2",  # Prompt template path
    model: str | None = None,            # Override model per-unit
)
def your_function(...) -> ...:
    """
    Docstring should include doctests; missing examples emit a warning.

    >>> your_function(...)
    expected_output
    """
    # Optional pre-hole code (e.g., validation, parsing)
    raise VibeCoded()

@vibesafe(kind="http")

@vibesafe(
    kind="http",
    method: str = "GET",                # HTTP method
    path: str = "/endpoint",            # Route path
    tags: list[str] = [],               # OpenAPI tags
    provider: str = "default",
    template: str = "vibesafe/templates/http_endpoint.j2",
    model: str | None = None,
)
async def your_endpoint(...) -> ...:
    """
    Endpoint description with doctests.

    >>> import anyio
    >>> anyio.run(your_endpoint, arg1, arg2)
    expected_output
    """
    raise VibeCoded()

Error Types

Exception	Cause	Remedy
`VibesafeMissingDoctest`	Spec lacks doctest examples	Add `>>>` examples to docstring
`VibesafeValidationError`	Generated code fails structural checks	Tighten spec (more examples, clearer docstring)
`VibesafeProviderError`	LLM API failure (timeout, auth, rate limit)	Check API key, network, quota
`VibesafeHashMismatch`	Spec changed but checkpoint is stale	Run `vibesafe compile` to regenerate
`VibesafeCheckpointMissing`	Prod mode but no active checkpoint	Run `vibesafe compile` + `vibesafe save`

Explanation: How Vibesafe Works

Architecture

┌─────────────────────────────────────────────────────────────────┐
│ Developer writes spec:                                          │
│   @vibesafe                                                     │
│   def sum_str(a: str, b: str) -> str:                          │
│       """>>> sum_str("2", "3") → '5'"""                         │
│       raise VibeCoded()                                         │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ AST Parser extracts:                                            │
│   - Signature: sum_str(a: str, b: str) -> str                  │
│   - Doctests: [("2", "3") → "5"]                               │
│   - Pre-hole code: (none)                                       │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Hasher computes H_spec = SHA-256(                              │
│   signature + doctests + pre_hole + model + template           │
│ )                                                               │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Prompt Renderer (Jinja2):                                       │
│   - Loads vibesafe/templates/function.j2                         │
│   - Injects signature, doctests, type hints                     │
│   - Produces final prompt string                                │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Provider calls LLM:                                             │
│   - Checks cache: .vibesafe/cache/<H_spec>.json                 │
│   - If miss: POST to OpenAI API (temp=0, seed=42)               │
│   - Returns generated Python code                               │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Validator checks:                                               │
│   ✓ Code parses (AST valid)                                     │
│   ✓ Function name matches                                       │
│   ✓ Signature matches (params, return type)                     │
│   ✓ No obvious security issues                                  │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Checkpoint Writer:                                              │
│   - Computes H_chk = SHA-256(H_spec + prompt + code)           │
│   - Writes .vibesafe/checkpoints/<unit>/<H_chk>/impl.py         │
│   - Writes meta.toml (spec hash, timestamp, model, versions)    │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Test Harness runs:                                              │
│   1. Doctests (pytest wrappers)                                 │
│   2. Type check (mypy)                                          │
│   3. Lint (ruff)                                                │
│   Result: PASS or FAIL                                          │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ If tests pass, developer runs:                                  │
│   vibesafe save --target <unit>                                 │
│                                                                 │
│ Writes to .vibesafe/index.toml:                                 │
│   [<unit>]                                                      │
│   active = "<H_chk>"                                            │
│   created = "2025-10-30T12:34:56Z"                              │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ Runtime: Direct function import                                 │
│   from examples.math.ops import sum_str                         │
│                                                                 │
│ Decorator calls: load_checkpoint("examples.math.ops/sum_str")   │
│   1. Read .vibesafe/index.toml for active hash                  │
│   2. Load .vibesafe/checkpoints/<unit>/<hash>/impl.py           │
│   3. In prod mode: verify H_spec matches checkpoint meta        │
│   4. Return the function object                                 │
└─────────────────────────────────────────────────────────────────┘
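
The structural checks in the Validator box can be sketched with the ast module. The example below covers parsing, function name, parameter names, and the presence of a return annotation; it does not attempt the security check, and validate_generated is a hypothetical helper rather than Vibesafe's validator.

```python
import ast


def validate_generated(code: str, expected_name: str, expected_params: list[str]) -> list[str]:
    """Return a list of structural problems; an empty list means the code passed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"code does not parse: {exc.msg}"]

    funcs = [n for n in tree.body if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    match = next((f for f in funcs if f.name == expected_name), None)
    if match is None:
        return [f"no top-level function named {expected_name!r}"]

    problems: list[str] = []
    params = [a.arg for a in match.args.posonlyargs + match.args.args]
    if params != expected_params:
        problems.append(f"parameters {params} != expected {expected_params}")
    if match.returns is None:
        problems.append("missing return annotation")
    return problems


good = "def sum_str(a: str, b: str) -> str:\n    return str(int(a) + int(b))\n"
bad = "def sum_str(a: str) -> str:\n    return a\n"
print(validate_generated(good, "sum_str", ["a", "b"]))  # []
print(validate_generated(bad, "sum_str", ["a", "b"]))
```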

Provider Model

Vibesafe uses a pluggable provider system. Phase 1 ships with openai-compatible, which works with:

OpenAI (GPT-4o, GPT-4o-mini)
Anthropic (via OpenAI-compatible proxy)
Local LLMs (llama.cpp, vLLM, Ollama with OpenAI API)

Provider interface:

from typing import Any, Protocol


class Provider(Protocol):
    def complete(
        self,
        prompt: str,
        system: str | None = None,
        seed: int = 42,
        temperature: float = 0.0,
        max_tokens: int | None = None,
        **kwargs: Any,
    ) -> str:
        """Return generated code as string."""
        ...

Adding providers: Implement the Provider protocol and register in vibesafe.toml:

[provider.anthropic]
kind = "anthropic-native"
model = "claude-3-5-sonnet-20241022"
api_key_env = "ANTHROPIC_API_KEY"

Runtime Flow

Dev mode (env = "dev"):

Import triggers load_active(unit_id)
Read .vibesafe/index.toml for active checkpoint hash
Compute current spec hash H_spec
If H_spec ≠ checkpoint's spec hash:
- Warn: "Spec drift detected, regenerating..."
- Auto-run vibesafe compile --target <unit>
- Load new checkpoint
Return function object

Prod mode (env = "prod" or VIBESAFE_ENV=prod):

Import triggers load_active(unit_id)
Read .vibesafe/index.toml for active checkpoint hash
If no checkpoint: raise VibesafeCheckpointMissing
Load checkpoint metadata from meta.toml
Compute current spec hash H_spec
If H_spec ≠ checkpoint's spec hash: raise VibesafeHashMismatch
Return function object

This enforces:

✅ What you tested is what runs (no silent regeneration)
✅ Drift is caught before deployment
✅ Reproducibility across environments
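
The two flows above differ only in how they react to a missing or stale checkpoint. The sketch below condenses them into one function. It is a simplification: the index here holds the spec hash directly (the text stores it in meta.toml), the exception classes are local stand-ins named after the Error Types table, recompile is a hypothetical callback, and regenerating a missing checkpoint in dev mode is an assumption.

```python
from collections.abc import Callable

Entry = dict[str, str]  # {"active": <checkpoint hash>, "spec_hash": <spec hash>}


class VibesafeCheckpointMissing(Exception):
    """Stand-in for the exception of the same name in the Error Types table."""


class VibesafeHashMismatch(Exception):
    """Stand-in for the exception of the same name in the Error Types table."""


def resolve_checkpoint(env: str, unit_id: str, current_spec_hash: str,
                       index: dict[str, Entry],
                       recompile: Callable[[str], Entry]) -> str:
    """Return the checkpoint hash to load, following the dev/prod rules."""
    entry = index.get(unit_id)
    if env == "prod":
        if entry is None:
            raise VibesafeCheckpointMissing(unit_id)
        if entry["spec_hash"] != current_spec_hash:
            raise VibesafeHashMismatch(unit_id)
        return entry["active"]
    # dev mode: regenerate when the checkpoint is missing or stale
    if entry is None or entry["spec_hash"] != current_spec_hash:
        entry = recompile(unit_id)
        index[unit_id] = entry
    return entry["active"]


def fake_recompile(unit_id: str) -> Entry:
    # Pretend the LLM produced a new checkpoint for the edited spec.
    return {"active": "9c01aa", "spec_hash": "new"}


index = {"examples.math.ops/sum_str": {"active": "2d46f1", "spec_hash": "old"}}
print(resolve_checkpoint("dev", "examples.math.ops/sum_str", "new", index, fake_recompile))  # 9c01aa
try:
    resolve_checkpoint("prod", "examples.math.ops/sum_str", "newer", index, fake_recompile)
except VibesafeHashMismatch as exc:
    print(f"blocked: {exc}")
```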

Why Engineers Care

Real Integration Patterns

1. CI/CD gating:

# .github/workflows/ci.yml
jobs:
  vibesafe-check:
    runs-on: ubuntu-latest
    steps:
      - run: vibesafe diff
        # Fails if any unit has drifted
      - run: vibesafe test
        # Runs all doctests + type/lint gates
      - run: vibesafe save --dry-run
        # Verifies all checkpoints exist

In 6 months of use, this caught 23 unintended spec changes (typos in doctests, accidental signature edits) before merge.

2. Frozen HTTP dependencies:

# Before deploying FastAPI app
vibesafe save --target api.routes --freeze-http-deps
git add requirements.vibesafe.txt .vibesafe/checkpoints/
git commit -m "Lock FastAPI endpoint dependencies"

The meta.toml records:

[deps]
fastapi = "0.115.2"
starlette = "0.41.2"
pydantic = "2.9.1"

Now your containerized deployment uses the exact versions that passed tests, preventing "works on my laptop" bugs.

3. Prompt regression coverage:

Every time you change a spec, the hash changes. This creates a natural test suite for prompt engineering:

# After editing vibesafe/templates/function.j2
vibesafe compile --force  # Regenerate all units
vibesafe test             # Verify all doctests still pass
vibesafe diff             # Review generated code changes

If a prompt change breaks existing specs, doctests fail immediately. This turned prompt iteration from "test manually and hope" to "change, verify, commit."

4. Local agents + vibesafe.toml contract:

The vibesafe.toml file is the single source of truth for:

Which model to use
What temperature/seed settings
Where checkpoints live
Which prompt templates apply

Local AI coding agents (Claude Code, Cursor, GitHub Copilot) can read vibesafe.toml and understand the contract without asking the developer. Example: a PR review agent sees model = "gpt-4o-mini" and knows not to suggest "use GPT-4" (it's explicitly not wanted here).

Examples in Action

The examples/ directory doubles as regression fixtures:

$ tree examples/
examples/
├── math/
│   └── ops.py          # sum_str, fibonacci, is_prime
└── api/
    └── routes.py       # sum_endpoint, hello_endpoint

$ vibesafe test --target examples.math.ops
✓ sum_str     [3 doctests]
✓ fibonacci   [4 doctests]
✓ is_prime    [5 doctests]
[PASS] 3/3 units

These examples serve three purposes:

Documentation: Show real usage patterns
Testing: Verify vibesafe's own codegen pipeline
Fixtures: Golden tests for prompt/model changes

Project Status & Roadmap

Phase 1 (MVP) — ✅ Shipped

Feature	Status	Notes
Python 3.12+ support	✅	Tested on 3.12, 3.13
`@vibesafe` decorator	✅	Function and endpoint generation
`kind` parameter	✅	Supports "function", "http", "cli"
Doctest verification	✅	Auto-extracted from docstrings
Type checking (mypy)	✅	Mandatory gate before save
Linting (ruff)	✅	Enforces style consistency
Hash-locked checkpoints	✅	SHA-256 content addressing
Drift detection	✅	`vibesafe diff` command
OpenAI-compatible providers	✅	Works with OpenAI, proxies, local LLMs
CLI (`scan`, `compile`, `test`, `save`, `status`, `diff`, `check`)	✅	`vibesafe` or `vibe` alias
Dependency freezing	✅	`--freeze-http-deps` flag
Jinja2 prompt templates	✅	Customizable via `vibesafe.toml`
LLM response caching	✅	Keyed by spec hash, speeds up iteration
Subprocess sandbox	✅	Optional isolation for test runs
Claude Code Plugin	✅	Full integration with Claude Code
MCP Server	✅	Model Context Protocol server
GitHub Actions	✅	Automated Claude Code reviews

Current coverage: 150+ checkpointed functions across 3 internal projects, 95% test coverage for vibesafe core.

Phase 2 (In Progress) — See ROADMAP.md

Interactive REPL (vibesafe repl --target <unit>)
- Commands: gen, tighten, diff, save, rollback
- Planned Q2 2025
Property-based testing (Hypothesis integration)
- Extract hypothesis: fences from docstrings
- Auto-generate property tests
Multi-provider support (Anthropic native, Gemini, local inference)
Advanced dependency tracing (hybrid static + runtime)
Web UI dashboard (checkpoint browser, diff viewer)
Sandbox enhancements (network/FS isolation, resource limits)

Open Items

PyPI package release (pip install vibesafe)
Documentation site (Docusaurus on GitHub Pages)
VS Code extension (syntax highlighting for @vibesafe specs)
Performance benchmarks (compilation time, test throughput)
Migration guide (v0.1 → v0.2)

Contributing

Contributions welcome! Please:

Open an issue first for features/bugs
Follow the spec in SPEC.md
Add tests for new functionality
Update TODOS.md if you complete a roadmap item

Development setup:

git clone https://github.com/julep-ai/vibesafe.git
cd vibesafe
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"

# Run tests
pytest -n auto

# Type check
mypy src/vibesafe

# Lint
ruff check src/ tests/ examples/

# Format
ruff format src/ tests/ examples/

Claude Code Integration: This repo includes a full Claude Code plugin with:

MCP server for seamless vibesafe operations
Slash commands (/vibe, /vibe-init, /vibe-mode, /vibe-status)
Automated PR reviews and test failure analysis
Skills for AI-assisted development workflows

See .claude-plugin/ for plugin configuration and .github/workflows/ for CI automation.

Honest Trade-offs

What Vibesafe Does Well

✅ Iteration speed: Dev mode auto-regenerates on import, no manual compile step
✅ Reproducibility: Same spec = same hash = same code
✅ Testability: Doctests are mandatory, enforced at save time
✅ Prod safety: Hash mismatches block execution, preventing drift

What Vibesafe Doesn't Do (Yet)

❌ Complex state machines: Specs are per-function, not multi-step workflows (use orchestration layer)
❌ Dynamic prompt injection: Templates are static Jinja2, not runtime-constructed (by design, for reproducibility)
❌ Multi-language support: Python-only (Rust/TypeScript on roadmap if demand exists)
❌ GUI for non-coders: CLI-first tool, requires Python knowledge

When Not to Use Vibesafe

Exploratory prototyping: If you're not sure what the API should be, write it manually first
Performance-critical code: LLM-generated implementations may not be well optimized (profile before deploying)
Regulatory/compliance code: Review generated code manually; vibesafe ensures reproducibility, not correctness
Sub-second latency requirements: Checkpoint loading adds ~10ms overhead on first import

License

MIT — see LICENSE

Acknowledgments

Built with:

uv — Fast Python package manager
ruff — Fast Python linter
mypy — Static type checker
pytest — Testing framework
Jinja2 — Prompt templating

Inspired by:

Defunctionalization (Reynolds, 1972) — Making implicit control explicit
Content-addressed storage (Git, Nix) — Deterministic builds via hashing
Test-driven development — Specs as executable contracts
Literate programming (Knuth) — Code that explains itself

Get Help

Issues: github.com/julep-ai/vibesafe/issues
Discussions: github.com/julep-ai/vibesafe/discussions
Email: [email protected]
