feat(prd-142): Wave 0 measurement backend (real dashboard metrics)#400

Merged
AutomatosAI merged 7 commits into main from ralph/prd-142-wave0-measurement
May 30, 2026

@AutomatosAI AutomatosAI commented May 30, 2026

Summary

PRD-142 Wave 0 — the "Is it working?" measurement backend. Replaces fake/placeholder analytics with real, workspace-scoped metrics feeding the existing dashboard. Backend + instrumentation only — no new page, no new route, no frontend in this PR.

  • US-001: persist record_error() to a queryable error_events sink (new table + model + best-effort write path that never raises and rolls back cleanly).
  • US-002: GET /api/analytics/errors/by-subsystem: real error-rate-by-subsystem, workspace-scoped, windowed (?window=24h), backed by an index on (workspace_id, created_at).
  • US-003: removed fake agent-metrics placeholders from analytics_engine.py.
  • US-004: GET /api/analytics/widget-engagement: real widget engagement, tenant-isolated via Site→workspace resolution.
  • US-005: GET /api/analytics/activation: platform-level activation (distinct workspaces with ≥1 completed run / total workspaces).
  • Wave 0 Ralph harness (scripts/ralph/): prompt, loop script, story spec.

Scope / deferrals

  • US-006 (per-primitive health tile) — deferred to Wave 3. No honest per-primitive data source exists yet: the heartbeat writer emits only operational findings (agent_health, checklist, …), none mapped to the 8 product primitives. Building now would ship a hollow all-"unknown" tile. Contract + sourcing options recorded for Wave 3.
  • US-007 (frontend tiles) and US-008 (live Railway gate) — not in this PR; hands-on, separate.

Test plan

  • 21/21 Wave 0 unit tests green (mocked DB + FastAPI TestClient, dependency-overridden ctx/db).
  • Endpoints reviewed for tenant isolation (workspace_id scoping) and divide-by-zero safety.
  • Migration reviewed — downgrade present, drops table + index cleanly.
  • Rebased onto main (PR #399 / PRD-140, "fix(prd-140): unblock + harden platform-tool authz gate (org-chart writes)"): zero file overlap, clean merge.
  • Post-merge: Railway applies migration (alembic upgrade heads); confirm error_events table created.
  • Post-merge: hit the 3 endpoints on a deployed env; confirm real (non-placeholder) data renders.

Summary by CodeRabbit

Release Notes

  • New Features

    • Error rate analytics by subsystem with rolling-window filtering
    • Widget engagement metrics with distinct session counting
    • Platform activation rate measurement for compliance tracking
    • Persistent error event logging system supporting dashboard visibility
  • Tests

    • Added comprehensive test coverage for new analytics endpoints and error logging functionality

The dashboard reported a hardcoded 85.0% success rate, 2.5s avg execution
time, and 0 tokens for agents — none backed by a real source. Drop these
placeholders from _get_agent_metrics and get_agent_analytics so the API
returns only real, DB-derived fields. track_agent_execution only logs to
Redis, so there is no persisted per-agent source to derive from yet;
mission-level success rate (real) lives in _get_workflow_metrics.
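
A minimal sketch of the kind of check this cleanup implies, assuming the removed field names listed in the review walkthrough (successRate, avgExecutionTime, totalTokensUsed, recentExecutions). The helper names `workflow_success_rate` and `assert_no_placeholders` are hypothetical and are not taken from analytics_engine.py or its tests.

```python
from typing import Any, Dict

# Field names reported as removed; any of them reappearing means a fake leaked back in.
PLACEHOLDER_FIELDS = {
    "successRate",
    "avgExecutionTime",
    "totalTokensUsed",
    "recentExecutions",
}


def workflow_success_rate(completed: int, total: int) -> float:
    """Success rate as a percentage derived from real counts; 0.0 when there are no runs."""
    if total <= 0:
        return 0.0
    return round(100.0 * completed / total, 1)


def assert_no_placeholders(payload: Dict[str, Any]) -> None:
    """Raise AssertionError if a payload still carries a placeholder field."""
    leaked = PLACEHOLDER_FIELDS & set(payload)
    assert not leaked, f"placeholder fields still present: {sorted(leaked)}"
```

The point of computing from counts is that the value moves with the data: 3 completed runs out of 4 gives 75.0, and an empty table gives 0.0 rather than a constant.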

No consumer reads the removed fields (verified frontend + backend). Adds
tests asserting the fakes are gone and success rate is computed, not 85.0.

Tailored autonomous build harness for the 5 backend stories
(US-001/002/004/005/006): PROMPT_build_prd142.md scopes the run, uses the
prd-142-wave0.json `passes` field as source of truth, enforces TDD, the
reuse map, workspace-scoping + tenant tests, and commits branch-local only
(never push/merge). loop-prd142.sh drives it with usage-limit backoff and
a RALPH_COMPLETE/RALPH_BLOCKED stop. US-007 (frontend) and US-008 (live
gate) are explicitly out of scope for the loop. US-003 marked passes:true.

…s sink

Adds the error_events table (alembic + ORM model + best-effort writer)
that backs Wave 0's "error rate by subsystem" tile. Mirrors the PRD-008-A
widget_event_log pattern: single table, JSONB payload, fire-and-forget
writer that never propagates. record_error's signature and never-raises
contract are unchanged; the automatos.errors logger emit still happens
first so a DB outage cannot blind-spot a failure.
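
The log-first, best-effort write path described above could look roughly like the sketch below. It is an illustration under assumptions, not the code in exception_telemetry.py: the function name `record_error_sketch`, its parameters, and the `session_factory` hook are hypothetical, and a plain dict stands in for the ErrorEvent ORM row.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("automatos.errors")
MAX_MESSAGE_LEN = 500  # truncation limit mentioned in the review walkthrough


def record_error_sketch(
    subsystem: str,
    message: str,
    *,
    workspace_id: Optional[str] = None,
    payload: Optional[Dict[str, Any]] = None,
    session_factory: Optional[Callable[[], Any]] = None,
) -> None:
    """Hypothetical stand-in for record_error; the real signature is not shown here."""
    # 1. Emit to the logger first so a DB outage cannot hide the failure.
    logger.error("subsystem=%s message=%s", subsystem, message)

    if session_factory is None:
        return  # no database configured: the log line is the fallback

    session = None
    try:
        session = session_factory()
        # A dict stands in for the ErrorEvent row in this sketch.
        session.add(
            {
                "workspace_id": workspace_id,
                "subsystem": subsystem,
                "message": message[:MAX_MESSAGE_LEN],
                "payload": payload or {},
            }
        )
        session.commit()
    except Exception:
        # Never propagate: persistence is best-effort, so roll back and move on.
        if session is not None:
            try:
                session.rollback()
            except Exception:
                pass
    finally:
        if session is not None:
            try:
                session.close()
            except Exception:
                pass
```

Guarding the rollback and close calls separately matters here: if the connection is already gone, those calls can raise too, and the caller should still see nothing.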

Story: scripts/ralph/prd-142-wave0.json US-001
PRD: docs/PRDS/PRD-142-WAVE0-MEASUREMENT.md

Add GET /api/analytics/errors/by-subsystem?window=24h to the existing
api/analytics_real.py router (no new router file). Workspace-scoped via
ctx.workspace_id; aggregates the error_events sink (US-001) by subsystem
over a rolling window. Returns {window,total,by_subsystem,generated_at}
with per-row rate=count/total and zero-safety. Indexes from US-001 carry
the (workspace_id, created_at) + GROUP BY subsystem path.
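
To make the window parsing and the zero-safe rate concrete, here is a small sketch. The helper names `parse_window` and `summarize` and the per-row keys (`subsystem`, `count`, `rate`) are assumptions for illustration; only the top-level keys {window, total, by_subsystem, generated_at} and rate=count/total come from the description above. The real endpoint aggregates in SQL rather than from a dict.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

_WINDOW_RE = re.compile(r"^(\d+)([hd])$")


def parse_window(window: str) -> timedelta:
    """Turn strings such as '24h' or '7d' into a timedelta; reject anything else."""
    match = _WINDOW_RE.match(window.strip().lower())
    if match is None:
        raise ValueError(f"unsupported window: {window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if amount == 0:
        raise ValueError("window must be positive")
    return timedelta(hours=amount) if unit == "h" else timedelta(days=amount)


def summarize(counts: Dict[str, int], window: str) -> Dict[str, Any]:
    """Build the response body from per-subsystem counts, guarding the division."""
    total = sum(counts.values())
    rows = [
        {
            "subsystem": name,
            "count": count,
            # total == 0 only happens for an empty window; report 0.0, never divide.
            "rate": count / total if total else 0.0,
        }
        for name, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ]
    return {
        "window": window,
        "total": total,
        "by_subsystem": rows,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```

In this sketch the cutoff for the query would be `datetime.now(timezone.utc) - parse_window(window)`, applied together with the workspace_id filter so the (workspace_id, created_at) index can serve it.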

Story: scripts/ralph/prd-142-wave0.json US-002
PRD: docs/PRDS/PRD-142-WAVE0-MEASUREMENT.md

Adds read-only GET /api/analytics/widget-engagement to existing
api/analytics_real.py router (no new file). Reuses widget_event_log
(PRD-008-A) — workspace scoped via Site.workspace_id → site_id.in_(),
restricted to WIDGET_EVENT_TYPES so idx_widget_event_log_type_created
is eligible alongside the created_at window. Sessions counted as
distinct session_id. Endpoint constructs zero WidgetEventLog rows; the
telemetry writer remains the single source of truth.
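
The two-layer scoping (resolve the workspace's sites first, then restrict events to those site IDs) can be sketched with the standard-library sqlite3 module. Everything here is a stand-in: the table layouts, the `WIDGET_EVENT_TYPES` values, the response keys, and the use of Unix timestamps for created_at are assumptions, and the real endpoint uses SQLAlchemy models.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

# Hypothetical values; the real WIDGET_EVENT_TYPES are not shown in this PR text.
WIDGET_EVENT_TYPES = ("widget_open", "widget_message")


def widget_engagement(
    conn: sqlite3.Connection,
    workspace_id: str,
    window: timedelta = timedelta(days=7),
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Read-only aggregation of widget events for one workspace."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - window).timestamp()

    # Layer 1: resolve the caller's sites. No sites means nothing to report.
    site_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM sites WHERE workspace_id = ?", (workspace_id,)
        )
    ]
    if not site_ids:
        return {"by_event_type": {}, "sessions": 0}

    # Layer 2: restrict events to those sites, known event types, and the window.
    site_marks = ",".join("?" * len(site_ids))
    type_marks = ",".join("?" * len(WIDGET_EVENT_TYPES))
    where = (
        f"site_id IN ({site_marks}) "
        f"AND event_type IN ({type_marks}) "
        "AND created_at >= ?"
    )
    params = [*site_ids, *WIDGET_EVENT_TYPES, cutoff]

    by_type = dict(
        conn.execute(
            f"SELECT event_type, COUNT(*) FROM widget_event_log "
            f"WHERE {where} GROUP BY event_type",
            params,
        ).fetchall()
    )
    sessions = conn.execute(
        f"SELECT COUNT(DISTINCT session_id) FROM widget_event_log WHERE {where}",
        params,
    ).fetchone()[0]
    return {"by_event_type": by_type, "sessions": sessions}
```

The early return is what keeps an empty `IN ()` clause from ever reaching the database, and it means a workspace without sites can never see another tenant's events by accident.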

Story: scripts/ralph/prd-142-wave0.json US-004
PRD: docs/PRDS/PRD-142-WAVE0-MEASUREMENT.md

Adds GET /api/analytics/activation to the existing api/analytics_real.py
router (no new router file). Reuses the OrchestrationRun union the rest
of analytics_real already queries (api/analytics_real.py:69) and the
canonical RunState.COMPLETED.value enum from core/models/orchestration_enums.py
— never the literal "completed" string. Intentionally platform-level
(per US-005 notes): auth gated via get_request_context_hybrid but NO
workspace_id filter on the data. Returns {activated, total_workspaces,
rate, generated_at}; rate=0 when total_workspaces=0 with no divide-by-zero
and no fake fallback.
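
As a worked illustration of the activation math, the sketch below counts distinct workspaces with at least one completed run and divides by the workspace total, returning 0 when there are no workspaces. The local `RunState` enum and its string values are stand-ins for the canonical enum in core/models/orchestration_enums.py, whose actual values are not shown here; the function name `activation` is hypothetical.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, Iterable, Optional, Tuple


class RunState(Enum):
    # Stand-in members; the real enum values are an assumption in this sketch.
    COMPLETED = "completed"
    FAILED = "failed"


def activation(
    runs: Iterable[Tuple[Optional[str], str]],
    total_workspaces: int,
) -> Dict[str, Any]:
    """Platform-level activation from (workspace_id, state) pairs."""
    # Compare against the enum value rather than a string literal.
    activated = len(
        {
            workspace_id
            for workspace_id, state in runs
            if workspace_id is not None and state == RunState.COMPLETED.value
        }
    )
    # Zero workspaces yields rate 0.0 instead of a ZeroDivisionError or a fake default.
    rate = activated / total_workspaces if total_workspaces > 0 else 0.0
    return {
        "activated": activated,
        "total_workspaces": total_workspaces,
        "rate": rate,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```

With two distinct workspaces having completed runs out of four total, the rate is 0.5; repeated completed runs in the same workspace do not inflate the numerator because the set collapses them.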

Story: scripts/ralph/prd-142-wave0.json US-005
PRD: docs/PRDS/PRD-142-WAVE0-MEASUREMENT.md

coderabbitai Bot commented May 30, 2026

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 21f6db7 and 2955cb5.

📒 Files selected for processing (13)
  • orchestrator/alembic/versions/prd142_wave0_error_events.py
  • orchestrator/api/analytics_real.py
  • orchestrator/core/models/error_event.py
  • orchestrator/core/services/analytics_engine.py
  • orchestrator/core/utils/exception_telemetry.py
  • orchestrator/tests/test_activation_endpoint.py
  • orchestrator/tests/test_analytics_engine_real_metrics.py
  • orchestrator/tests/test_error_events_sink.py
  • orchestrator/tests/test_errors_by_subsystem_endpoint.py
  • orchestrator/tests/test_widget_engagement_endpoint.py
  • scripts/ralph/PROMPT_build_prd142.md
  • scripts/ralph/loop-prd142.sh
  • scripts/ralph/prd-142-wave0.json

Walkthrough

This PR implements PRD-142 Wave 0 "Measurement First," adding error event persistence, three analytics endpoints (errors by subsystem, widget engagement, platform activation rate), and cleaning up hardcoded metrics placeholders. All changes include test coverage and Ralph automation tooling to track completion.

Changes

Error Events & Analytics Endpoints

| Layer | File(s) | Summary |
| --- | --- | --- |
| Error Event Model and Persistence Sink | orchestrator/alembic/versions/prd142_wave0_error_events.py, orchestrator/core/models/error_event.py, orchestrator/core/utils/exception_telemetry.py, orchestrator/tests/test_error_events_sink.py | Alembic migration creates error_events table with JSONB payload and compound indexes; SQLAlchemy ORM model defines schema; telemetry sink adds best-effort database persistence to record_error with graceful fallback when DB unavailable, field truncation (500 char message limit), and exception swallowing to prevent masking original errors. Comprehensive tests validate model structure, happy-path persistence, rollback/commit failure recovery, and signature immutability. |
| Errors by Subsystem Endpoint | orchestrator/api/analytics_real.py (error endpoint + window helper), orchestrator/tests/test_errors_by_subsystem_endpoint.py | Window-parsing helper converts 24h/7d query strings to timedeltas. GET /api/analytics/errors/by-subsystem aggregates ErrorEvent rows by subsystem for caller's workspace over rolling window, computes total and per-subsystem rates, returns {total, by_subsystem[], window, generated_at}. Tests verify aggregation math, time-window filtering with cutoff validation, empty-window handling (total=0), and workspace isolation via workspace_id == filter. |
| Widget Engagement Endpoint | orchestrator/api/analytics_real.py (widget endpoint), orchestrator/tests/test_widget_engagement_endpoint.py | GET /api/analytics/widget-engagement resolves workspace site IDs upfront for tenant isolation, aggregates WidgetEventLog by event_type (restricted to WIDGET_EVENT_TYPES) over 7d window, counts distinct sessions, short-circuits to empty payload when workspace has no sites. Tests validate event-type grouping, rolling-window cutoff filtering, workspace/site dual-layer scoping (Site.workspace_id filter and WidgetEventLog.site_id IN restriction), and empty-result behavior. |
| Activation Rate Endpoint | orchestrator/api/analytics_real.py (activation endpoint), orchestrator/tests/test_activation_endpoint.py | GET /api/analytics/activation computes platform-wide activation metric: counts distinct OrchestrationRun.workspace_id filtered to RunState.COMPLETED.value (canonical enum, not other states), divides by total Workspace count (no workspace filter, intentionally platform-scoped), returns {activated, total_workspaces, rate, generated_at}. Tests verify rate computation, zero-denominator handling (rate=0), and completed-state filter validation. |
| Analytics Engine Real Metrics Cleanup | orchestrator/core/services/analytics_engine.py, orchestrator/tests/test_analytics_engine_real_metrics.py | Removes placeholder fields (successRate, avgExecutionTime, totalTokensUsed, recentExecutions) from _get_agent_metrics; replaces placeholder get_agent_analytics response with minimal {agentId, period} payload and updated docstring. Tests assert placeholder fields are gone, hardcoded numeric values (e.g., 85.0) are absent, and workflow success rate is computed from real orchestration/task verification counts instead of constants. |

Ralph Build Automation & Wave 0 Completion

| Layer | File(s) | Summary |
| --- | --- | --- |
| Ralph Build Prompt and Loop Runner | scripts/ralph/PROMPT_build_prd142.md, scripts/ralph/loop-prd142.sh | PRD-142 Wave 0 prompt specifies build constraints (workspace/branch lock), backend story scope (US-001–002, 004–006), 4-phase loop (orient → implement → validate → flip passes → commit), and coding conventions (testing-first, endpoint scoping, migration safety). Bash loop script orchestrates repeated claude invocations with streaming JSON parsing, detects usage/rate-limit errors with reset-aware waiting, applies exponential backoff on other failures, and halts on completion/blocked/abort signals or max-iteration cap. |
| Wave 0 Completion Status Updates | scripts/ralph/prd-142-wave0.json | User stories US-001–US-005 marked passes: true with DONE notes: US-001 documents error_events sink (migration/model/tests); US-002 documents errors-by-subsystem endpoint with window/grouping; US-003 documents placeholder cleanup; US-004 documents widget-engagement endpoint with site scoping; US-005 documents activation endpoint with completed-state filtering and platform-level scope. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AutomatosAI/automatos-ai#384: Introduces initial structured record_error telemetry logging to automatos.errors logger; this PR extends it by adding optional database persistence sink for error events.

Poem

🐰 Errors now recorded in the database deep,
Analytics awake from their metric sleep,
By subsystem, by widget, by activation's gleam—
Ralph builds the dashboards we all dream!
No more placeholders, just truth and light,


@AutomatosAI AutomatosAI merged commit 3335dc4 into main May 30, 2026
2 of 3 checks passed