2 changes: 1 addition & 1 deletion demos/filter_chains/model_listener_filter/Dockerfile
@@ -4,7 +4,7 @@ WORKDIR /app

RUN pip install --no-cache-dir fastapi uvicorn pydantic

COPY content_guard.py .
COPY content_guard.py fake_provider.py output_filter.py ./

EXPOSE 10500

127 changes: 115 additions & 12 deletions demos/filter_chains/model_listener_filter/README.md
@@ -2,12 +2,30 @@

Run content-safety filters on direct LLM requests — no agent layer required.

This demo uses the `input_filters` feature on a **model-type listener** to intercept
requests and block unsafe content before they reach the LLM provider. Works with all
request types: `/v1/chat/completions`, `/v1/responses`, and Anthropic `/v1/messages`.

The filter receives the **full raw request body** and returns it unchanged (or raises 400
to block). No message extraction — the complete JSON payload flows through as-is.
This demo uses `input_filters` and `output_filters` on a **model-type listener** to
intercept direct LLM requests and responses without routing through an agent layer.
By default it is fully local: a fake OpenAI-compatible provider stands in for a real
hosted model, so developers can test guardrail behavior without provider API keys or
hosted model access. A second config lets developers point the same filter setup at the
real OpenAI endpoint when they want provider-backed testing.
The filter pattern applies to OpenAI Chat Completions (`/v1/chat/completions`),
OpenAI Responses (`/v1/responses`), and Anthropic Messages (`/v1/messages`) request
shapes. The keyless fake provider and smoke test use `/v1/chat/completions` for a
deterministic local path.

The input filter receives the full raw request body and either returns it unchanged or
responds with HTTP 400 to block the request. The output filter receives the provider
response and redacts sensitive content before it is returned to the client.
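
The input-filter contract described above can be sketched as a pure function. This is a minimal illustration, not the code in `content_guard.py` (which this diff does not show): the names `BLOCKED_TERMS`, `Blocked`, and `input_filter` are hypothetical, and the blocklist entries are placeholders.

```python
import json

# Hypothetical blocklist; the demo's real rules live in content_guard.py.
BLOCKED_TERMS = ("build a bomb", "steal credentials")


class Blocked(Exception):
    """Signals that the filter service should answer with HTTP 400."""


def input_filter(raw_body: bytes) -> bytes:
    """Return the request body unchanged, or raise Blocked."""
    payload = json.loads(raw_body)
    # Scan the whole serialized payload so the check does not depend on the
    # request shape (chat completions, responses, or messages).
    haystack = json.dumps(payload).lower()
    for term in BLOCKED_TERMS:
        if term in haystack:
            raise Blocked(f"request blocked: matched {term!r}")
    # Allowed requests pass through byte-for-byte.
    return raw_body
```

Scanning the full payload rather than extracting messages mirrors the "full raw request body" behavior described above; a real filter would likely apply more careful matching.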

## Files

- `config.yaml` runs the default keyless path with the local fake provider.
- `config.openai.yaml` runs the same filters against OpenAI.
- `docker-compose.yaml` starts the local demo without requiring provider credentials.
- `docker-compose.openai.yaml` mounts `config.openai.yaml` and requires `OPENAI_API_KEY`
for provider-backed testing.
- `test.sh` runs the Docker smoke test through Plano.
- `test_services.py` runs service-level regression tests without Docker.

## Architecture

Expand All @@ -16,22 +34,82 @@ Client ──► Plano (model listener :12000)
├─ input_filters: content_guard ──► Block / Allow
└─ model_provider: openai/gpt-4o-mini
├─ model_provider: fake-provider (default) or OpenAI (optional)
└─ output_filters: output_redactor ──► Redact / Allow
```

## Quick Start

```bash
# 1. Export your API key
export OPENAI_API_KEY=sk-...

# 2. Start services
# 1. Start services
docker compose up --build

# 3. Run tests (in another terminal)
# 2. Run tests (in another terminal)
bash test.sh
```

The test script verifies three behaviors:

- safe requests reach the local fake provider and return a normal chat-completion response
- unsafe requests are blocked by the input filter before reaching the provider
- sensitive provider output is redacted by the output filter before the client receives it

You can also run the service-level tests without Docker:

```bash
uv run --with pytest --with fastapi --with httpx --with pydantic \
python -m pytest demos/filter_chains/model_listener_filter/test_services.py -q
```

## Validate Locally

From this directory, validate the default keyless compose path:

```bash
docker compose config
```

Validate that the OpenAI path fails early when the API key is missing:

```bash
docker compose -f docker-compose.yaml -f docker-compose.openai.yaml config
```

Expected error:

```text
OPENAI_API_KEY environment variable is required but not set
```

Then confirm the OpenAI compose path renders when a key is provided:

```bash
OPENAI_API_KEY=dummy docker compose -f docker-compose.yaml -f docker-compose.openai.yaml config
```

Run the full local smoke test:

```bash
docker compose down
docker compose up --build -d
bash test.sh
docker compose down
```

## Test With Real OpenAI

The default `config.yaml` uses the local fake provider. To run the same model-listener
input and output filters against OpenAI, use the OpenAI compose override:

```bash
export OPENAI_API_KEY=sk-...
docker compose -f docker-compose.yaml -f docker-compose.openai.yaml up --build
```

The fake-provider service still starts because it is defined in the shared compose file,
but Plano does not route traffic to it when `config.openai.yaml` is mounted.

## Try It

**Allowed request:**
Expand All @@ -58,6 +136,31 @@ curl http://localhost:12000/v1/chat/completions \
}'
```

**Redacted provider response:**

```bash
curl http://localhost:12000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Please return the secret marker"}],
"stream": false
}'
```

The fake provider emits `SECRET_TOKEN`; the output filter redacts it to `[REDACTED]`.
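
The redaction step can be pictured as a small transformation over a non-streaming chat-completion response. This is a sketch under assumptions, not the contents of `output_filter.py` (not shown in this diff); `redact_response`, `SENSITIVE_MARKER`, and `REPLACEMENT` are illustrative names.

```python
import copy
from typing import Any

SENSITIVE_MARKER = "SECRET_TOKEN"  # marker emitted by the fake provider
REPLACEMENT = "[REDACTED]"


def redact_response(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a chat-completion response with the marker replaced."""
    redacted = copy.deepcopy(body)  # leave the caller's object untouched
    for choice in redacted.get("choices", []):
        message = choice.get("message") or {}
        content = message.get("content")
        if isinstance(content, str):
            message["content"] = content.replace(SENSITIVE_MARKER, REPLACEMENT)
    return redacted
```

Applied to the fake provider's reply, the assistant content would read `The local fake provider returned [REDACTED].`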

## Why This Helps Developers

Model-listener filters are guardrails for applications that call Plano as a transparent
LLM gateway. A local, deterministic demo helps developers verify filter wiring before
using real providers:

- config mistakes are caught early instead of silently bypassing guardrails
- teams can test request blocking and response redaction in CI without secrets
- contributors can reproduce filter behavior without external model availability
- application code does not need an extra passthrough agent just to run policy checks

## Tracing

Open [Jaeger UI](http://localhost:16686) to see distributed traces for both allowed and blocked requests.
26 changes: 26 additions & 0 deletions demos/filter_chains/model_listener_filter/config.openai.yaml
@@ -0,0 +1,26 @@
version: v0.3.0

filters:
  - id: content_guard
    url: http://content-guard:10500
    type: http
  - id: output_redactor
    url: http://output-filter:10502
    type: http

model_providers:
  - model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true

listeners:
  - type: model
    name: llm_gateway
    port: 12000
    input_filters:
      - content_guard
    output_filters:
      - output_redactor

tracing:
  random_sampling: 100
8 changes: 7 additions & 1 deletion demos/filter_chains/model_listener_filter/config.yaml
@@ -4,10 +4,14 @@ filters:
- id: content_guard
url: http://content-guard:10500
type: http
- id: output_redactor
url: http://output-filter:10502
type: http

model_providers:
- model: openai/gpt-4o-mini
access_key: $OPENAI_API_KEY
access_key: local-demo-key
base_url: http://fake-provider:10501/v1
default: true

listeners:
Expand All @@ -16,6 +20,8 @@ listeners:
port: 12000
input_filters:
- content_guard
output_filters:
- output_redactor

tracing:
random_sampling: 100
@@ -0,0 +1,6 @@
services:
  plano:
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
    volumes:
      - ./config.openai.yaml:/app/plano_config.yaml
22 changes: 20 additions & 2 deletions demos/filter_chains/model_listener_filter/docker-compose.yaml
@@ -5,17 +5,35 @@ services:
dockerfile: Dockerfile
ports:
- "10500:10500"
fake-provider:
build:
context: .
dockerfile: Dockerfile
command: ["uvicorn", "fake_provider:app", "--host", "0.0.0.0", "--port", "10501"]
ports:
- "10501:10501"
output-filter:
build:
context: .
dockerfile: Dockerfile
command: ["uvicorn", "output_filter:app", "--host", "0.0.0.0", "--port", "10502"]
ports:
- "10502:10502"
plano:
build:
context: ../../../
dockerfile: Dockerfile
ports:
- "12000:12000"
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
- OPENAI_API_KEY=${OPENAI_API_KEY:-}
volumes:
- ./config.yaml:/app/plano_config.yaml
- ${PLANO_CONFIG_FILE:-./config.yaml}:/app/plano_config.yaml
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
depends_on:
- content-guard
- fake-provider
- output-filter
jaeger:
build:
context: ../../shared/jaeger
81 changes: 81 additions & 0 deletions demos/filter_chains/model_listener_filter/fake_provider.py
@@ -0,0 +1,81 @@
"""
OpenAI-compatible local provider for model-listener filter demos.

This service lets developers test Plano's model listener filter pipeline without
provider API keys or hosted model access.
"""

import json
import time
from typing import Any

from fastapi import FastAPI, Request
from fastapi.responses import Response, StreamingResponse

app = FastAPI(title="Local Fake LLM Provider", version="1.0.0")


def latest_user_content(messages: list[dict[str, Any]]) -> str:
    for message in reversed(messages):
        if message.get("role") == "user":
            content = message.get("content", "")
            if isinstance(content, str):
                return content
            if isinstance(content, list):
                return " ".join(
                    part.get("text", "")
                    for part in content
                    if isinstance(part, dict) and part.get("type") == "text"
                )
    return ""


@app.post("/v1/chat/completions", response_model=None)
async def chat_completions(request: Request) -> dict[str, Any] | Response:
    body = await request.json()
    model = body.get("model", "gpt-4o-mini")
    user_content = latest_user_content(body.get("messages", []))
    content = "Hello from the local fake provider."
    if "secret" in user_content.lower():
        content = "The local fake provider returned SECRET_TOKEN."

    if body.get("stream") is True:

        async def generate():
            chunk = {
                "id": "chatcmpl-local-filter-demo",
                "object": "chat.completion.chunk",
                "created": int(time.time()),
                "model": model,
                "choices": [
                    {
                        "index": 0,
                        "delta": {"role": "assistant", "content": content},
                        "finish_reason": None,
                    }
                ],
            }
            yield f"data: {json.dumps(chunk)}\n\n"
            yield "data: [DONE]\n\n"

        return StreamingResponse(generate(), media_type="text/event-stream")

    return {
        "id": "chatcmpl-local-filter-demo",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 2},
    }


@app.get("/health")
async def health() -> dict[str, str]:
    return {"status": "healthy"}
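
A note on the streaming branch above: it emits one `data:` event carrying the whole reply, followed by `data: [DONE]`. A client-side reader for that shape might look like the sketch below; `collect_stream_text` is a hypothetical helper, not part of this PR, and it assumes the full SSE body has already been read into a string.

```python
import json


def collect_stream_text(sse_body: str) -> str:
    """Join the delta content from an OpenAI-style SSE body."""
    parts = []
    # Events are separated by a blank line, as in the generator above.
    for event in sse_body.split("\n\n"):
        event = event.strip()
        if not event.startswith("data: "):
            continue
        data = event[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```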