events: denormalize kind onto event_tags so tag-CTE can pre-filter by kind#24
Conversation
… kind
For NIP-29 group reads (`{kinds: [9,11,12], #h: ['<group>']}`), the
materialized tag CTE filters by tag only — it can't see the kind
predicate. Hot groups whose tag rows are dominated by membership
events (kinds 9000/9021) feed ~95k useless rows into the hash-join
just to throw them away on the kind filter, dominating CPU on the
read path.
Production observation that motivated this: 50k users × ~2 membership
events per join (9021 + 9000) makes the `h='general'` tag set ~98k
rows, of which only ~1.4k are actual chat. Each subscription read
pays for the full 98k. PI top SQL showed two query shapes accounting
for ~65% of all DBLoad on db.m6g.large pegged at 99.7% CPU. Issue #23.
Changes:
- event_tags gets a nullable `kind` column. Fresh schemas have it via
CREATE TABLE; existing schemas pick it up via migration 002.
- saveEventWith writes the kind alongside (event_id, key, value).
Batch size stays at 15000 (60000 params, under 65535 cap) to keep
the kind-39002 SERIALIZABLE critical-section economy intact.
- buildSelectQuery pushes `kind IN (...) OR kind IS NULL` into each
per-tag CTE branch when filter.Kinds is set. The IS NULL branch
keeps reads correct on un-backfilled rows during the production
rollout window. Drop it in a follow-up once backfill is verified.
- Init splits the schema-bootstrap into pre-migrate and post-migrate
phases. The covering index `(key, value, kind, event_id)` lives in
post-migrate so it runs only after migration 002 has added the
column on existing schemas — avoiding the "column does not exist"
failure on the production upgrade path.
Migration scope is intentionally minimal (ADD COLUMN only). The
backfill UPDATE and CREATE INDEX CONCURRENTLY are documented as
manual one-shot ops because the auto-migration runner has a 30s
per-statement deadline and CONCURRENTLY can't survive that without
leaving an INVALID index that IF NOT EXISTS would silently skip.
Tests:
- TestEventStore_SaveEvent_PopulatesTagKind — verifies kind populates
on new inserts.
- TestEventStore_QueryEvents_TagAndKindCTE — saves 3 chat + 5
membership events tagged with the same group; asserts the kind
filter excludes membership at the CTE.
- TestEventStore_Init_UpgradeFromPre002 — drops the kind column and
the migration marker on a freshly-initialized schema, then re-runs
Init() and asserts the column and post-migrate index both come
back. This is the test that catches the pre-migrate/post-migrate
ordering bug — fresh-schema tests can't, because CREATE TABLE on a
fresh schema makes the column from the start.
- TestEventStore_QueryEvents_TagKindNullCompat — clears kind to NULL
on a saved event's tag rows and asserts the OR IS NULL branch keeps
it visible.
Fixes #23.
There was a problem hiding this comment.
Code Review
This pull request optimizes tag-based queries by denormalizing the kind column into the event_tags table and introducing a new covering index. The changes include updates to the schema initialization, event saving logic, and the query builder to improve performance for groups with many membership events. Feedback focuses on improving the database migration safety by using concurrent index creation, removing a now-redundant index, refactoring duplicated kind-conversion logic, and ensuring errors from the SQL builder are properly handled.
There was a problem hiding this comment.
Pull request overview
This PR denormalizes events.kind onto event_tags so the tag-filter MATERIALIZED CTE can pre-filter by kind, significantly reducing work for hot NIP-29 group-history queries dominated by membership-event tag rows (issue #23).
Changes:
- Add nullable
event_tags.kindvia migration 002 and populate it on the write path (saveEventWith). - Push
kind IN (...) OR kind IS NULLinto each per-tag CTE branch whenfilter.Kindsis set, preserving correctness during backfill rollout. - Split
EventStore.Init()into pre-/post-migrate phases and add integration tests covering insert population, upgrade ordering, and NULL-compat behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
zooid/migrations/002_event_tags_kind.sql |
Adds nullable kind column to __event_tags and documents rollout considerations. |
zooid/events.go |
Writes kind into event_tags, pushes kind predicate into tag CTE branches, and adds a post-migrate covering index creation step. |
zooid/events_test.go |
Adds integration tests for tag-kind population, kind/NULL compatibility, and pre-002 upgrade path behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- gemini: hoist `kindInts` conversion out of the CTE block to function scope and reuse it in the outer events filter — was duplicated. - gemini: error-check `subQ.ToSql()` instead of dropping the error. Build-state failures here are programmer errors (malformed squirrel.Select), so panic loudly rather than silently appending an empty CTE part. - Copilot: strengthen TestEventStore_QueryEvents_TagAndKindCTE to assert on the generated SQL shape — the result-level assertion alone would still pass with the kind pushdown removed (the outer events.kind filter does the real work). Now we slice out the CTE text and require it to contain a `kind` predicate, so the optimization itself is guarded against regression. - Copilot: update migration 002 comment to reflect what actually happens — the post-migrate `CREATE INDEX IF NOT EXISTS` in Init() is a no-op on production after the documented dbops one-shot CONCURRENTLY create has run, fast on fresh schemas. The migration file no longer claims the index is "intentionally not applied" by any startup path. Not taken: gemini's suggestion to drop the now-prefix-redundant `(key, value, event_id)` index. The new `(key, value, kind, event_id)` covers it via prefix matching, so dropping is safe in principle, but removing a hot-path index in the same change as adding the new one mixes two reversibility profiles. Will do as a follow-up after the new index is verified in production.
- Copilot: stop running CREATE INDEX in Init() — move it into migration 002. Migration runs once per schema, with the runner's 30s per-statement deadline acting as a fail-fast on any production schema where the index wasn't pre-built via the dbops CONCURRENTLY step. With the prep done, IF NOT EXISTS no-ops the migration's CREATE INDEX. Without prep, the migration fails loudly and the relay refuses to start with an actionable error. The pre-/post-migrate split in Init() goes away with this — there's nothing post-migrate left to run. - Copilot: stop panicking on `subQ.ToSql()` failure in `buildSelectQuery`. The function now returns `(squirrel.SelectBuilder, error)`. Both callers (`queryEventsWith`, `CountEvents`) handle the error and return / log instead of crashing the goroutine. - Copilot: replace the brittle exact-substring slicing in TestEventStore_QueryEvents_TagAndKindCTE with a regex `(?s)MATERIALIZED.*?\\bkind\\b.*?\\)\\s+SELECT\\b`. Both lazy quantifiers anchor to the nearest CTE-close-and-SELECT, so an outer-only `e.kind` filter doesn't satisfy it (there is no `) SELECT` after the outer query's kind). Tolerant of squirrel whitespace/formatting drift.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
zooid/events.go:299
buildSelectQuerynow returns(squirrel.SelectBuilder, error), butzooid/query_performance_test.gostill calls it as if it returned only a builder (e.g., inexplainAnalyze). This will break compilation/tests; update that helper (and any other remaining call sites) to handle the new error return.
func (events *EventStore) buildSelectQuery(filter nostr.Filter) (squirrel.SelectBuilder, error) {
eventsTable := events.Schema.Prefix("events")
eventTagsTable := events.Schema.Prefix("event_tags")
// Collect valid single-letter tag filters and sort for deterministic SQL.
- gemini: hoist `kindInts` conversion out of the CTE block to function scope and reuse it in the outer events filter — was duplicated. - gemini: error-check `subQ.ToSql()` instead of dropping the error. Build-state failures here are programmer errors (malformed squirrel.Select), so panic loudly rather than silently appending an empty CTE part. - Copilot: strengthen TestEventStore_QueryEvents_TagAndKindCTE to assert on the generated SQL shape — the result-level assertion alone would still pass with the kind pushdown removed (the outer events.kind filter does the real work). Now we slice out the CTE text and require it to contain a `kind` predicate, so the optimization itself is guarded against regression. - Copilot: update migration 002 comment to reflect what actually happens — the post-migrate `CREATE INDEX IF NOT EXISTS` in Init() is a no-op on production after the documented dbops one-shot CONCURRENTLY create has run, fast on fresh schemas. The migration file no longer claims the index is "intentionally not applied" by any startup path. Not taken: gemini's suggestion to drop the now-prefix-redundant `(key, value, event_id)` index. The new `(key, value, kind, event_id)` covers it via prefix matching, so dropping is safe in principle, but removing a hot-path index in the same change as adding the new one mixes two reversibility profiles. Will do as a follow-up after the new index is verified in production.
Fixes #23.
Why
Production groupchat relay (`sphere-zooid-relay-eu`) is at 99.7% CPU on db.m6g.large because two NIP-29 group-history queries account for ~65% of all DBLoad. The materialized tag CTE filters by tag only — it can't see the `kind IN (...)` predicate.
Event distribution in production:
So for a hot group like `general`: ~97k events tagged `h='general'` (95k membership + 1.4k chat). A subscription read for `{kinds: [9,11,12], #h: ['general']}` materializes 97k tag rows, hash-joins, throws away 95% on the kind filter, sorts, takes 1000. ~1.1s per query (verified by EXPLAIN ANALYZE).
50k users × ~2 membership events per join is what tipped the balance. Same query was fine at 45k users.
What
The new index creation and the historical-row backfill are intentionally not in the auto-migration. The runner enforces a 30s per-statement deadline; `CREATE INDEX CONCURRENTLY` cannot run inside a transaction and leaves an INVALID index on timeout that `IF NOT EXISTS` would silently skip on retry. So those run as one-shot ops via the dbops task — see Rollout below.
Once the CTE returns 1.4k rows instead of 97k for hot groups, expected per-query cost drops from ~1.1s to ~10–50ms.
Verification
`go test -timeout 300s ./...` passes; `gofmt -l -s .` clean. Four new integration tests:
Rollout (production schema with 438k existing tag rows)
Run these from the dbops task before merging this PR, so the column and index exist before the new code path lands:
```sql
-- 1. Add column (instant, metadata-only on PG 11+).
ALTER TABLE sphere_relay__event_tags ADD COLUMN IF NOT EXISTS kind INTEGER;
-- 2. Backfill historical rows. Idempotent — re-runnable if interrupted.
UPDATE sphere_relay__event_tags et
SET kind = e.kind
FROM sphere_relay__events e
WHERE et.event_id = e.id AND et.kind IS NULL;
-- 3. Build the new covering index (does NOT block writes).
CREATE INDEX CONCURRENTLY IF NOT EXISTS sphere_relay__idx_event_tags_key_value_kind_event_id
ON sphere_relay__event_tags(key, value, kind, event_id);
```
After merge, `Init()` will see the column and index already exist (IF NOT EXISTS is idempotent), the migration marker will be set on first run, and the new CTE shape goes live with full plan support.
Follow-ups (not in this PR)