Admin API inspection: SSE event stream + load stats endpoint #310
Adds `torc admin tail-api`, which streams a structured event for every inbound HTTP request via Server-Sent Events from a new admin endpoint (`GET /admin/api-events/stream`). Useful for debugging traffic against a running server without tailing log files. Each event carries method, path, query, status, latency, span id, and the authenticated user.

With `--include-bodies`, the server also captures request and response bodies, capped at 8 KiB per direction (override via `TORC_API_EVENT_BODY_MAX_BYTES`) and skipped entirely above a 1 MiB hard buffer ceiling. Bodies are off by default so payloads aren't streamed unless requested; SSE response streams are skipped to avoid unbounded buffering, and `Authorization`/`Cookie` headers are never captured. When no admin client is connected the capture middleware short-circuits so the runtime cost on the request hot path is negligible.

To make the `user` field useful even when the server runs without `--auth-file`, the torc CLI sends an advisory `X-Torc-Client-User` header sourced from `TORC_USERNAME`/`USER`/`USERNAME`; the middleware uses it only as a fallback when no real authentication was resolved, never for authorization.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
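The no-subscriber short-circuit described above can be pictured with a small counter-plus-guard sketch. This is an illustration only: `Broadcaster`, `Guard`, `has_subscribers`, and `wants_bodies` are hypothetical names, and the real broadcaster in `src/server/api_event_stream.rs` may be built differently (the diff only shows `subscribe()` and `body_subscriber_guard()`).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical broadcaster state: how many SSE clients are connected,
/// and how many of those asked for bodies.
#[derive(Default)]
pub struct Broadcaster {
    subscribers: AtomicUsize,
    body_subscribers: AtomicUsize,
}

/// RAII guard held by one SSE connection; dropping it undoes the counts.
pub struct Guard {
    bus: Arc<Broadcaster>,
    wants_bodies: bool,
}

impl Broadcaster {
    /// Register a connection and return the guard that represents it.
    pub fn subscribe(self: &Arc<Self>, wants_bodies: bool) -> Guard {
        self.subscribers.fetch_add(1, Ordering::Relaxed);
        if wants_bodies {
            self.body_subscribers.fetch_add(1, Ordering::Relaxed);
        }
        Guard { bus: Arc::clone(self), wants_bodies }
    }

    /// Cheap check the middleware can make before doing any capture work.
    pub fn has_subscribers(&self) -> bool {
        self.subscribers.load(Ordering::Relaxed) > 0
    }

    /// True only while at least one connected client requested bodies.
    pub fn wants_bodies(&self) -> bool {
        self.body_subscribers.load(Ordering::Relaxed) > 0
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.bus.subscribers.fetch_sub(1, Ordering::Relaxed);
        if self.wants_bodies {
            self.bus.body_subscribers.fetch_sub(1, Ordering::Relaxed);
        }
    }
}
```

With this shape, the hot path costs one atomic load when nobody is connected, and body buffering is only attempted while a body-requesting client holds a guard.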
Adds `torc admin api-stats`, which renders request rate, throughput, and 2xx/4xx/5xx status mix from a 1-hour ring of per-second counters maintained by the server. The capture middleware records every request into the ring regardless of whether anyone is connected to `tail-api`, so the snapshot is always up to date.

Bytes are read from the `Content-Length` request and response headers, which keeps the hot path cheap. Chunked / streaming responses (notably the SSE event streams themselves) advertise no length and contribute 0 bytes; the request itself is still counted.

The new endpoint is `GET /admin/api-stats` with optional `window_seconds` and `interval_seconds` query parameters (defaults: 3600 / 60). The CLI accepts `--window` and `--interval` to mirror those, and supports `-f json` for raw output.

Per-request overhead is ~300 ns (two header lookups + one wall-clock read + an uncontended `parking_lot` mutex), and `Instant::now()` is elided on the no-subscriber fast path so we don't pay for it when nobody's listening.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
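A per-second ring of counters like the one described here can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in `src/server/api_stats.rs`: `StatsRing`, `Bucket`, and `Snapshot` are hypothetical names, the caller passes the current epoch second explicitly, and the mutex that would wrap the ring in a server is left out.

```rust
/// Counters for a single wall-clock second (hypothetical layout).
#[derive(Clone, Copy, Default)]
struct Bucket {
    epoch_sec: u64,
    requests: u64,
    bytes_in: u64,
    bytes_out: u64,
    s2xx: u64,
    s4xx: u64,
    s5xx: u64,
}

/// Totals over a requested window.
#[derive(Debug, Default, PartialEq)]
pub struct Snapshot {
    pub requests: u64,
    pub bytes_in: u64,
    pub bytes_out: u64,
    pub s2xx: u64,
    pub s4xx: u64,
    pub s5xx: u64,
}

pub struct StatsRing {
    buckets: Vec<Bucket>,
}

impl StatsRing {
    /// `capacity_secs` is the retention period, e.g. 3600 for one hour.
    pub fn new(capacity_secs: usize) -> Self {
        assert!(capacity_secs > 0);
        Self { buckets: vec![Bucket::default(); capacity_secs] }
    }

    /// Record one request. A missing Content-Length contributes 0 bytes.
    pub fn record(&mut self, now_sec: u64, status: u16, bytes_in: Option<u64>, bytes_out: Option<u64>) {
        let idx = (now_sec % self.buckets.len() as u64) as usize;
        let b = &mut self.buckets[idx];
        if b.epoch_sec != now_sec {
            // The slot still holds an older second; reset it before reuse.
            *b = Bucket { epoch_sec: now_sec, ..Bucket::default() };
        }
        b.requests += 1;
        b.bytes_in += bytes_in.unwrap_or(0);
        b.bytes_out += bytes_out.unwrap_or(0);
        match status {
            200..=299 => b.s2xx += 1,
            400..=499 => b.s4xx += 1,
            500..=599 => b.s5xx += 1,
            _ => {}
        }
    }

    /// Sum every bucket whose second falls in the last `window_secs` seconds.
    pub fn snapshot(&self, now_sec: u64, window_secs: u64) -> Snapshot {
        let oldest = now_sec.saturating_sub(window_secs.saturating_sub(1));
        let mut s = Snapshot::default();
        for b in &self.buckets {
            if b.requests > 0 && b.epoch_sec >= oldest && b.epoch_sec <= now_sec {
                s.requests += b.requests;
                s.bytes_in += b.bytes_in;
                s.bytes_out += b.bytes_out;
                s.s2xx += b.s2xx;
                s.s4xx += b.s4xx;
                s.s5xx += b.s5xx;
            }
        }
        s
    }
}
```

Storing the epoch second in each bucket lets stale slots be detected lazily on write and filtered on read, so no background task is needed to expire old data.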
Pull request overview
Adds live inspection and load observability for the server’s HTTP API by introducing an admin SSE stream of per-request events and an aggregated “busy-ness” stats endpoint, plus CLI support for both (including an advisory client username header to improve labeling when auth is disabled).
Changes:
- Add `GET /admin/api-events/stream` SSE endpoint + request-capture middleware + `torc admin tail-api`.
- Add `GET /admin/api-stats` backed by a 1-second ring buffer + `torc admin api-stats`.
- Add advisory `X-Torc-Client-User` header emission from the CLI for better `user` labeling when no real auth is resolved.
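The advisory-user behavior in the last item can be sketched in two small functions. Everything here is an assumption for illustration: `advisory_user_from` and `event_user` are hypothetical names, and the "first non-empty variable wins" ordering is inferred from the order `TORC_USERNAME`/`USER`/`USERNAME` is listed in the commit message.

```rust
/// Client side: pick the advisory label from the environment.
/// `lookup` is injected so the logic can be exercised without touching
/// the real process environment.
pub fn advisory_user_from<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    ["TORC_USERNAME", "USER", "USERNAME"]
        .iter()
        .filter_map(|key| lookup(*key))
        .map(|value| value.trim().to_string())
        .find(|value| !value.is_empty())
}

/// Server side: a resolved authenticated subject always wins; the advisory
/// header is only a display label and never feeds an authorization decision.
pub fn event_user(authenticated: Option<&str>, advisory_header: Option<&str>) -> Option<String> {
    authenticated.or(advisory_header).map(str::to_string)
}
```

In a real client the lookup would presumably be `|key| std::env::var(key).ok()`.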
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/server/live_state.rs | Adds new shared state for API event broadcasting and API stats ring buffer. |
| src/server/live_router.rs | Wires new admin routes and installs capture middleware; adds endpoint handlers + tests. |
| src/server/api_stats.rs | Implements per-second ring buffer and snapshot aggregation for API load stats. |
| src/server/api_event_stream.rs | Implements broadcast channel + event/body types and body-capture limit helpers. |
| src/server.rs | Exposes new server modules. |
| src/client/sse_client.rs | Sends advisory client-user header on SSE connections. |
| src/client/commands/admin.rs | Adds tail-api and api-stats admin commands (SSE parsing + stats rendering). |
| src/client/apis/configuration.rs | Adds advisory client-user header constant and auto-injection in auth application. |
| docs/src/specialized/admin/server-deployment.md | Documents new live request inspection + load stats features and env var. |

    let bus = state.server.api_event_broadcaster.clone();
    let mut receiver = bus.subscribe();
    let body_guard = if params.include_bodies.unwrap_or(false) {
        Some(bus.body_subscriber_guard())
    } else {
        None
    };
This was addressed in commit 20ef130 via the new `redact_for_subscriber` helper. The broadcaster still sends one event with bodies attached when any subscriber wants them, but each SSE handler clears `request_body`/`response_body` per-connection before serializing if that connection didn't pass `include_bodies=true`. The `redact_for_subscriber_*` unit tests cover both directions.
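The per-connection redaction described in this reply might look like the sketch below. The helper name, its `(&mut event, include_bodies)` call shape, and the two body field names come from the thread; the `ApiEvent` struct itself is a simplified, hypothetical stand-in for the real event type.

```rust
/// Simplified stand-in for the broadcast event (field set is an assumption).
#[derive(Clone, Debug, PartialEq)]
pub struct ApiEvent {
    pub method: String,
    pub path: String,
    pub status: u16,
    pub request_body: Option<String>,
    pub response_body: Option<String>,
}

/// Strip captured bodies from this connection's copy of the event unless
/// the connection itself opted in with `include_bodies=true`.
pub fn redact_for_subscriber(event: &mut ApiEvent, include_bodies: bool) {
    if !include_bodies {
        event.request_body = None;
        event.response_body = None;
    }
}
```

Because each receiver works on its own copy of the event, clearing the fields for one connection leaves the payloads intact for subscribers that did ask for them.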

            let new_req = Request::from_parts(parts, Body::from(bytes));
            (new_req, captured)
        }
        Err(_) => (Request::from_parts(parts, Body::empty()), None),
    }

            let new_resp = Response::from_parts(parts, Body::from(bytes));
            (new_resp, captured)
        }
        Err(_) => (Response::from_parts(parts, Body::empty()), None),
    }

        .fallback(dashboard_fallback)
        .layer(middleware::from_fn_with_state(
            CaptureState {
                bus: state.server.api_event_broadcaster.clone(),
                stats: state.server.api_stats.clone(),
            },
            capture_api_event,
        ))
        .layer(middleware::from_fn_with_state(
            state.auth.clone(),
            inject_request_context,
        ))
        .with_state(state)
This was addressed in commit 20ef130: `record_api_stats` is now the outermost layer (lines 396-401), sitting outside the auth short-circuit. The `api_stats_records_unauthenticated_requests` test verifies that a 401 still lands in the ring as a 4xx.
- Split capture middleware: load-stats accounting moves to an outermost layer that lives outside the auth check, so 401s and other unauthenticated traffic now appear in `/admin/api-stats`. Event broadcasting stays inside auth where the request context is available.
- Per-subscriber body redaction in `admin_api_events_stream`: bodies are captured globally when any subscriber wants them, but a metadata-only subscriber no longer sees payloads from other subscribers.
- Tighten body capture: only collect bodies whose size is advertised up front (`Content-Length` or body size hint). Chunked uploads with no advertised length are passed through untouched, closing an unbounded-buffer DoS gap.
- Tighten the `include_bodies` doc comment and the markdown's body-capture description to match the actual implementation.
- New tests cover all three behaviors.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
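The "advertised size only" rule in the third item can be expressed as a small decision function. This is a sketch, not the middleware's actual code: `CaptureDecision`, `decide`, and `truncate_for_display` are hypothetical names, the constants mirror the 8 KiB display cap and 1 MiB ceiling quoted in the PR, and treating a body of exactly 1 MiB as still capturable is an assumption.

```rust
/// Default per-direction display cap quoted in the PR (8 KiB).
pub const DISPLAY_LIMIT: usize = 8 * 1024;
/// Hard buffer ceiling quoted in the PR (1 MiB).
pub const HARD_CEILING: u64 = 1024 * 1024;

#[derive(Debug, PartialEq)]
pub enum CaptureDecision {
    /// Leave the body stream untouched and capture nothing.
    PassThrough,
    /// Safe to buffer: the size is known up front and under the ceiling.
    Buffer,
}

/// Decide from the advertised length alone, before reading any bytes.
/// `None` models a chunked body with no Content-Length or size hint.
pub fn decide(advertised_len: Option<u64>) -> CaptureDecision {
    match advertised_len {
        Some(n) if n <= HARD_CEILING => CaptureDecision::Buffer,
        _ => CaptureDecision::PassThrough,
    }
}

/// Trim a buffered body to the display cap; the flag reports truncation.
pub fn truncate_for_display(body: &[u8], limit: usize) -> (&[u8], bool) {
    if body.len() > limit {
        (&body[..limit], true)
    } else {
        (body, false)
    }
}
```

Deciding before any read is what closes the unbounded-buffer gap: a body that never states its size is never collected, however large it turns out to be.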

            let new_req = Request::from_parts(parts, Body::from(bytes));
            (new_req, captured)
        }
        Err(_) => (Request::from_parts(parts, Body::empty()), None),
    }
This was addressed in commit 5166867: the request `Err(_)` branch now returns `Err(error_response(StatusCode::BAD_REQUEST, ...))` and the middleware short-circuits with a 400 instead of forwarding an empty body to the handler. `capture_short_circuits_on_request_body_error` exercises this path.

            let new_resp = Response::from_parts(parts, Body::from(bytes));
            (new_resp, captured)
        }
        Err(_) => (Response::from_parts(parts, Body::empty()), None),
Also addressed in commit 5166867: the response `Err(_)` branch now returns `Err(error_response(StatusCode::BAD_GATEWAY, ...))`. The middleware substitutes a fresh 502 response (status, headers, body) rather than truncating the original handler's response to `Body::empty()`.

    };

    let display_limit = body_capture_limit();

    let (request, request_body) = if want_bodies {
        capture_request_body(request, display_limit).await
    } else {
        (request, None)
    };

    Ok(mut event) => {
        redact_for_subscriber(&mut event, include_bodies);
        let data = serde_json::to_string(&event).unwrap_or_default();
        yield Ok::<_, std::convert::Infallible>(format!(
            "event: api\ndata: {}\n\n",
            data
        ));
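For readers unfamiliar with the wire format in the snippet above, the sketch below writes and reads `event:`/`data:` frames terminated by a blank line. It is a simplified illustration rather than the parser in `src/client/commands/admin.rs`: `SseFrame`, `format_frame`, and `parse_frames` are hypothetical names, and only `\n` line endings and the `event`/`data` fields are handled.

```rust
#[derive(Debug, PartialEq)]
pub struct SseFrame {
    pub event: String,
    pub data: String,
}

/// Build one frame in the same shape the handler yields.
pub fn format_frame(event: &str, data: &str) -> String {
    format!("event: {}\ndata: {}\n\n", event, data)
}

/// Parse every complete frame in `buf`; return them plus the unparsed tail,
/// which the caller keeps and prepends to the next network chunk.
pub fn parse_frames(buf: &str) -> (Vec<SseFrame>, &str) {
    let mut frames = Vec::new();
    let mut rest = buf;
    while let Some(end) = rest.find("\n\n") {
        let block = &rest[..end];
        rest = &rest[end + 2..];
        // "message" is the default event type when no `event:` line is sent.
        let mut event = String::from("message");
        let mut data_lines: Vec<&str> = Vec::new();
        for line in block.lines() {
            if let Some(v) = line.strip_prefix("event:") {
                event = v.strip_prefix(' ').unwrap_or(v).to_string();
            } else if let Some(v) = line.strip_prefix("data:") {
                data_lines.push(v.strip_prefix(' ').unwrap_or(v));
            }
            // Comment lines (leading ':') and other fields are ignored here.
        }
        if !data_lines.is_empty() {
            frames.push(SseFrame { event, data: data_lines.join("\n") });
        }
    }
    (frames, rest)
}
```

Returning the tail matters because a network read can end in the middle of a frame; a partial frame is never emitted.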
Previously, if `body.collect().await` failed mid-stream (client disconnect, transport error, etc.), the middleware substituted `Body::empty()` and let the handler fail with a misleading deserialization error. Replace both fallbacks with explicit error responses synthesized in the middleware:

- Request body read error → 400 Bad Request, before the handler runs
- Response body read error → 502 Bad Gateway, replacing the broken response (the handler had returned, but no bytes were on the wire yet)

Adds a regression test that drives a streaming request body which errors during collection and asserts the 400 short-circuit.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
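The direction-to-status mapping in this commit can be reduced to a framework-free sketch. The names `Direction`, `status_for_read_error`, and `collect_or_reject` are hypothetical, plain `u16` status codes stand in for `StatusCode`, and a `Result<Vec<u8>, String>` stands in for the outcome of collecting a body.

```rust
/// Which side of the exchange the body belongs to.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Direction {
    Request,
    Response,
}

/// 400 when the client's upload could not be read, 502 when the handler's
/// response body failed while it was being buffered.
pub fn status_for_read_error(direction: Direction) -> u16 {
    match direction {
        Direction::Request => 400,
        Direction::Response => 502,
    }
}

/// Turn a failed collection into an explicit (status, message) rejection
/// instead of silently substituting an empty body.
pub fn collect_or_reject(
    direction: Direction,
    collected: Result<Vec<u8>, String>,
) -> Result<Vec<u8>, (u16, String)> {
    collected.map_err(|cause| {
        let side = match direction {
            Direction::Request => "request",
            Direction::Response => "response",
        };
        (
            status_for_read_error(direction),
            format!("failed to read {} body: {}", side, cause),
        )
    })
}
```

Returning an error value makes the short-circuit explicit at the call site, so the handler is never invoked with a body the middleware failed to read.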
Summary
- `torc admin tail-api`, an SSE stream that emits a structured event for every inbound HTTP request the server processes (method, path, status, latency, span id, optional bodies). Off-by-default body capture is opt-in via `--include-bodies`, capped at 8 KiB per direction with a 1 MiB hard buffer ceiling, and never captures `Authorization`/`Cookie` headers.
- `torc admin api-stats`, a snapshot of how busy the server is over the last hour (request count, req/s, bytes in/out from `Content-Length`, 2xx/4xx/5xx breakdown), backed by a 1-second-bucket ring buffer in `LiveServerState`. Configurable `--window` and `--interval`.
- `X-Torc-Client-User` header sent by the CLI so the `user` field in events is meaningful even when the server runs without `--auth-file`. The header is trivially spoofable and is never used for authorization, only as a fallback label when no real auth was resolved.

Implementation notes

- […] `/admin/reload-auth`: `GET /admin/api-events/stream` and `GET /admin/api-stats`. They use standard server authentication; no admin-only role.
- […] `parking_lot` mutex). `Instant::now()` is elided on the no-subscriber fast path.
- […] `/workflows/{id}/events/stream`) are skipped by both body capture and byte counting since they are unbounded.
- […] `src/server/api_event_stream.rs` (broadcaster, body cap helpers) and `src/server/api_stats.rs` (ring buffer + snapshot).
- […] `server-deployment.md`, plus the `TORC_API_EVENT_BODY_MAX_BYTES` env var.

Test plan

- `cargo fmt -- --check`
- `cargo clippy --all --all-targets --all-features -- -D warnings`
- `dprint check`
- `cargo test --all-features --lib api_event_stream` (7 tests)
- `cargo test --all-features --lib api_stats` (5 tests)
- `cargo test --all-features --lib live_router` (12 tests, includes 4 new ones for capture middleware + api-stats endpoint)
- `torc-server run`, hit a few endpoints, then `torc admin tail-api` and `torc admin api-stats` from another shell; confirm events stream and stats report sensible numbers
- `--auth-file` set, confirm `user` field shows the authenticated subject (not the advisory header)

🤖 Generated with Claude Code