feat(2.4.0): community ad patterns + 49-tag vocabulary + authoritative sponsor seed #224

Merged
ttlequals0 merged 16 commits into main from feat/community-patterns-tagging
May 15, 2026

@ttlequals0
Owner

Summary

  • Community ad patterns. Patterns can be shared via patterns/community/ in the repo; instances opt in to a cron-driven auto-pull, and a new "Submit to community" button on each local pattern opens a prefilled GitHub PR.
  • 49-tag vocabulary. Sponsors and podcasts now carry tags; community patterns only enter the matching loop when sponsor/podcast tags overlap (or the sponsor has universal, or either side is empty). Local + imported patterns bypass the tag check.
  • Authoritative sponsor seed. Schema migration loads 255 sponsors from src/seed_data/sponsors_final.csv, preserving FKs (UPDATE-by-name, soft-delete orphans).
  • Reviewer-trim auto-rewrite. When a reviewer narrows an ad's bounds beyond the configurable threshold, the local pattern's text is re-extracted from the new transcript window. Community patterns are never auto-rewritten.
  • GitHub Action validates each community PR with the same gates the in-app export uses + a three-tier dedupe (95%+ duplicate, 75-95% variant, <75% distinct).

See CHANGELOG.md for the full per-section breakdown.
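
The tag-eligibility rule from the summary can be sketched as a small predicate. This is a minimal illustration, not the matcher's actual code: the function name, the `"universal"` literal, and the `technology` / `business` tag names are assumptions made for the example.

```python
from typing import Iterable

# Assumed literal for the special tag; the real UNIVERSAL_TAG value may differ.
UNIVERSAL_TAG = "universal"


def is_pattern_eligible(source: str,
                        sponsor_tags: Iterable[str],
                        podcast_tags: Iterable[str]) -> bool:
    """Return True when a pattern may enter the matching loop."""
    # Local and imported patterns bypass the tag check entirely.
    if source != "community":
        return True
    sponsor, podcast = set(sponsor_tags), set(podcast_tags)
    # Either side empty: nothing to filter on, so the pattern is allowed.
    if not sponsor or not podcast:
        return True
    # A universal sponsor matches every podcast.
    if UNIVERSAL_TAG in sponsor:
        return True
    # Otherwise at least one tag must be shared.
    return bool(sponsor & podcast)
```

Under these assumptions, a community pattern whose sponsor is tagged only `technology` is filtered out for a podcast tagged only `kids_family`, while the same pattern stored as `local` still matches.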

Test plan

  • pytest tests/ — 1159 passed, 4 skipped (146 new tests added)
  • npm run lint && npm run build — green
  • /simplify reviewer pass — PATTERN_SOURCES constants centralized, bulk-op handler extracted, indexes promoted to SCHEMA_SQL, source filter pushed into SQL, N+1 batched in apply_manifest, set_podcast_tags short-circuits when unchanged, time helpers reused
  • CI green (CodeQL, pip-audit, npm-audit, python-tests, frontend-build)
  • Manual: spin up dev server, verify itunes:category parsing populates podcasts.tags, verify Submit-to-community opens the prefilled PR URL, verify protect-from-sync auto-engages when a community pattern is edited
  • /code-review pass on this PR

Out of scope (deferred to a future iteration)

Per plan section 16: auto-apply variant merges in CI; per-source sync schedules; auto-cleanup of unused patterns; non-English stopword lists; LLM-suggested tags; submitter identity beyond submitted_app_version; auto-regeneration of index.json on merge.

…e sponsor seed

Adds crowdsourced ad pattern sharing for MinusPod. Patterns can be shared
via patterns/community/ in the repo, auto-pulled on a configurable cron,
and filtered against each podcast by a shared tag vocabulary so a
pattern for "Squarespace" never enters the matching loop on a podcast
tagged only kids_family. Reviewer-time bound adjustments now feed back
into pattern text on a configurable threshold. A GitHub Action validates
community PRs against the same gates and a three-tier dedupe before
merge.
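
The three-tier dedupe (95%+ duplicate, 75-95% variant, <75% distinct) could look roughly like the sketch below. The thresholds come from the PR summary; the similarity metric is not stated here, so `difflib.SequenceMatcher` is used purely as a stand-in, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

DUPLICATE_THRESHOLD = 0.95  # 95%+ is treated as a duplicate
VARIANT_THRESHOLD = 0.75    # 75-95% is treated as a variant


def similarity(a: str, b: str) -> float:
    # Stand-in metric (assumption): the validator's real scoring is not shown.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(score: float) -> str:
    """Map a similarity score in [0, 1] onto the three tiers."""
    if score >= DUPLICATE_THRESHOLD:
        return "duplicate"
    if score >= VARIANT_THRESHOLD:
        return "variant"
    return "distinct"


def dedupe_tier(candidate: str, existing: list[str]) -> tuple[str, float]:
    """Classify a candidate against its best match among existing texts."""
    best = max((similarity(candidate, e) for e in existing), default=0.0)
    return classify(best), best
```

Where the boundaries fall exactly (for example, whether a score of 0.95 counts as duplicate or variant) is a guess in this sketch.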

Why now: today patterns are local-only and grow per-instance; users with
low pattern counts get poor coverage. The schema, matcher, and reviewer
paths already existed but assumed single-instance ownership. This makes
patterns shareable without sacrificing the per-user customizations the
reviewer flow already accumulates.

Schema (migration is additive + idempotent):
- ad_patterns: source, community_id, version, submitted_app_version, protected_from_sync
- known_sponsors: tags (JSON array)
- podcasts: tags, user_tags
- episodes: tags
- New indexes idx_patterns_source + idx_patterns_community_id
- Sponsor reseed from src/seed_data/sponsors_final.csv (255 entries, authoritative)
  Preserves FKs by UPDATE-by-name; soft-deletes orphans (is_active=0)
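
A minimal sketch of the reseed strategy, assuming a `known_sponsors` table with `id`, `name`, `tags`, and `is_active` columns (only `tags` and `is_active` are named in this PR; the rest of the layout and the sample rows are illustrative). Updating by name keeps each row's primary key, so rows that reference it stay valid, and sponsors missing from the seed are deactivated rather than deleted.

```python
import json
import sqlite3


def reseed_sponsors(conn, seed):
    """Apply an authoritative seed of (name, tags) pairs."""
    seed = list(seed)
    for name, tags in seed:
        # UPDATE-by-name keeps the existing id, so foreign keys still resolve.
        cur = conn.execute(
            "UPDATE known_sponsors SET tags = ?, is_active = 1 WHERE name = ?",
            (json.dumps(tags), name),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO known_sponsors (name, tags, is_active) VALUES (?, ?, 1)",
                (name, json.dumps(tags)),
            )
    names = [name for name, _ in seed]
    if names:
        # Soft-delete orphans: rows absent from the seed are kept but disabled.
        placeholders = ",".join("?" * len(names))
        conn.execute(
            f"UPDATE known_sponsors SET is_active = 0 WHERE name NOT IN ({placeholders})",
            names,
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE known_sponsors ("
    "id INTEGER PRIMARY KEY, name TEXT UNIQUE, tags TEXT, is_active INTEGER DEFAULT 1)"
)
conn.execute("INSERT INTO known_sponsors (id, name) VALUES (7, 'Squarespace'), (8, 'Defunct Co')")
reseed_sponsors(conn, [("Squarespace", ["technology"]), ("Kayak", ["travel"])])
```

Running the function a second time with the same seed leaves the table unchanged, which is the idempotency property the migration relies on.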

Backend:
- TextPatternMatcher: tag-eligibility filter applied only to community patterns
- PatternService.rewrite_pattern_from_bounds + import_community_pattern
- community_export.py: quality gates, PII strip, sponsor classification,
  prefilled GitHub PR URL with 7KB fallback
- community_sync.py: cron-driven manifest fetch + apply (INSERT/UPDATE/DELETE
  respecting protected_from_sync)
- tools/community_pattern_validator.py: CLI + library for CI validation
- RSS itunes:category parsing wired into refresh path
- New /api/v1/ endpoints for bulk ops, submit-to-community, protect/unprotect,
  feed tags, sponsor tags, reviewer settings, community sync, sync status

Frontend:
- PatternsPage: source filter, Import/Export header, Submit-to-community
  + Protect-from-sync row actions, community badge, last-synced indicator
- TagChips, CommunityBadge, PatternImportDialog components
- AdReviewerSection + CommunityPatternsSection settings panels

CI / repo:
- .github/workflows/validate-community-patterns.yml runs the validator and
  posts a Markdown comment on each community PR
- .github/labeler.yml auto-labels community PRs with `pattern`
- patterns/community/ with README, empty index.json, examples/

Tests: +146 new tests (unit + integration). Whole suite is green (1159 passed,
4 skipped). /simplify pass applied: PATTERN_SOURCES constants centralized,
bulk-op handler extracted, indexes promoted to SCHEMA_SQL, source filter
pushed into SQL, N+1 batched in apply_manifest, set_podcast_tags
short-circuits when unchanged, time helpers reused.

Closes the work tracked in IMPLEMENTATION_PLAN.md (delivered via
pastebin sloth-fox-spider).
Comment thread src/api/patterns.py Fixed
Comment thread src/api/patterns.py Fixed
@ttlequals0
Owner Author

Code review

Found 1 issue:

  1. community_sync.py:105 — comment says "Stamp version from manifest entry (overrides any in data)" but the code uses dict.setdefault('version', manifest_version), which does NOT override an existing value. If a manifest entry's inner data dict carries an outdated version, the manifest's top-level version is silently ignored, causing the version-gate in pattern_service.import_community_pattern to compare the wrong number.

for community_id, data, manifest_version in valid_entries:
    existing = existing_by_cid.get(community_id)
    # Stamp version from manifest entry (overrides any in data)
    data_with_version = dict(data)
    data_with_version['community_id'] = community_id
    data_with_version.setdefault('version', manifest_version)
    try:
        if existing is None:
            pattern_service.import_community_pattern(data_with_version)

Suggest assigning unconditionally (data_with_version['version'] = manifest_version) so behavior matches the documented contract.
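
The difference between the two behaviors is easy to reproduce in isolation. The helper names and the sample dict below are made up for the demonstration; only the `setdefault` versus plain-assignment contrast comes from the review.

```python
def stamp_with_setdefault(data, community_id, manifest_version):
    stamped = dict(data)
    stamped["community_id"] = community_id
    # setdefault only writes when the key is missing, so a stale value survives.
    stamped.setdefault("version", manifest_version)
    return stamped


def stamp_with_assignment(data, community_id, manifest_version):
    stamped = dict(data)
    stamped["community_id"] = community_id
    # Plain assignment always wins, matching the "overrides" comment.
    stamped["version"] = manifest_version
    return stamped


# Inner data carrying an outdated version (illustrative values).
stale = {"sponsor": "Example", "version": 1}
kept = stamp_with_setdefault(stale, "abc", 3)["version"]        # stale value kept
overridden = stamp_with_assignment(stale, "abc", 3)["version"]  # manifest value wins
```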

Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

ttlequals0 added 12 commits May 14, 2026 18:24
- community_sync.apply_manifest: assign `version` from the manifest entry
  unconditionally instead of `setdefault`. The comment promised
  override semantics; the code had `setdefault`, which silently kept a
  stale `version` carried inside the inner `data` dict and broke the
  version-gate in `pattern_service.import_community_pattern`.
- schema._create_new_tables_only: bring inline CREATE TABLE blocks for
  `ad_patterns` (source, community_id, version, submitted_app_version,
  protected_from_sync) and `known_sponsors` (tags) back in sync with
  SCHEMA_SQL. End state was already correct via the ALTER TABLE
  migrations, but the "must match SCHEMA_SQL exactly" comment was no
  longer accurate; future readers would have been misled.
- schema._run_schema_migrations: defer `_reseed_known_sponsors` to AFTER
  `_migrate_sponsor_fk` + the Zyn cleanups. On a v2.1.x -> 2.4.0 jump
  the prior ordering tagged a case-variant row before the FK migration
  deduped case-variants, which could discard the freshly-tagged row.
  The reseed now operates on the canonical post-FK-migration state.

No new tests required - the existing migration and sync tests already
cover idempotency, version-gating, and FK preservation, and all 1159
tests still pass.
CodeQL flagged two `py/reflective-xss` high-severity alerts on the new
bulk-delete / bulk-disable endpoints because user-supplied `ids` and
`expected_count` were reflected back in the JSON response without
type-coercion. Both responses are emitted via `jsonify` (so the
real-world XSS risk is bounded), but accepting arbitrary types from
input is also bad input validation — the call should hard-reject
non-integer ids and non-integer expected_counts.

In _resolve_bulk_target:
- `expected_count` is now cast to int up-front; non-int payloads return
  400 instead of being f-stringed into the error response.
- User-supplied `ids` are cast element-wise to int; non-int contents
  return 400 instead of being passed to bulk_delete / bulk_disable
  (which would then have failed downstream with an opaque SQL error).

No test changes — the new validation tightens accepted input shape
without changing legitimate-caller behavior.
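
A rough sketch of the coercion described above, assuming a `BadRequest` exception that the route layer turns into a 400 (the exception, the helper names, and the payload shape are all hypothetical). Rejecting booleans and floats goes slightly beyond a bare `int()` cast and is an extra assumption of this sketch.

```python
class BadRequest(ValueError):
    """Stand-in for whatever the API layer maps to HTTP 400."""


def coerce_int(value, field):
    # bool is a subclass of int and int(3.7) truncates silently,
    # so both are rejected explicitly here (sketch-level choice).
    if isinstance(value, (bool, float)):
        raise BadRequest(f"{field} must be an integer")
    try:
        return int(value)
    except (TypeError, ValueError):
        # The message names the field only; the raw input is never echoed back.
        raise BadRequest(f"{field} must be an integer") from None


def coerce_bulk_payload(payload):
    """Return (ids, expected_count) as ints or raise BadRequest."""
    ids_raw = payload.get("ids")
    if not isinstance(ids_raw, list):
        raise BadRequest("ids must be a list")
    ids = [coerce_int(v, "ids") for v in ids_raw]
    expected = coerce_int(payload.get("expected_count"), "expected_count")
    return ids, expected
```

Because the error text is built from the field name rather than the submitted value, nothing user-controlled is reflected in the response, which is the property the CodeQL alert was about.
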
…rate workflow

Restructures the patterns/ directory to match the documentation plan:
docs at the patterns/ root, pattern JSON files (and the manifest) under
patterns/community/. patterns/community/README.md is gone, replaced by
the technical reference at patterns/README.md.

New files:
- patterns/CONTRIBUTING.md — submitter-facing PR explainer (stripped
  fields, PII rules, quality gates, what the validator does)
- patterns/README.md — technical reference (sync mechanics, manifest +
  pattern file formats, tag vocabulary, reviewer workflow, ops)
- patterns/vocabulary.json — machine-readable copy of the 49-tag
  vocabulary; generated from src/seed_data/tag_vocabulary.csv plus the
  hardcoded UNIVERSAL_TAG
- .github/workflows/regenerate-manifest.yml — on push to main that
  touches patterns/community/**.json, rebuilds index.json and commits
  it back. Concurrency-gated so back-to-back PR merges don't race.
- src/tools/generate_manifest.py — companion module the workflow runs;
  also invocable locally via `python -m src.tools.generate_manifest`.

Updates:
- root README.md — replace the existing community section with the
  user-facing "Community Patterns (Optional)" section (opt-in framing,
  what you get / control / share, links to deeper docs). TOC entry
  retargeted.
- .github/workflows/validate-community-patterns.yml — invoke validator
  via `python -m src.tools.community_pattern_validator` instead of
  setting PYTHONPATH=src, matching the new generate_manifest call shape.
- src/tools/community_pattern_validator.py — Markdown comment now links
  back to the Quality checks / Dedupe / sponsor-add sections of the
  new docs, so submitters self-serve on what failed and why.

Stale references in patterns/README.md were fixed during the
/humanizer pass: the sponsor seed source is now
`src/seed_data/sponsors_final.csv` (loaded by the 2.4.0 migration), not
the deprecated `SEED_SPONSORS` constant; the vocabulary source is
`src/seed_data/tag_vocabulary.csv` (read by `src/utils/community_tags.py`),
not the never-introduced `VALID_TAGS` Python constant.

All 1138 tests pass (the 21 errors in test_api.py are a pre-existing local
permission issue against /app, unrelated to this PR; the CI run on prior
commits has these tests passing).
Audited patterns/CONTRIBUTING.md, patterns/README.md, root README's
Community Patterns section, and CHANGELOG.md 2.4.0 entry against every
rule in the /humanizer skill (Wikipedia "Signs of AI writing"). Edits:

- 13 curly quotes/apostrophes replaced with straight in the two patterns/
  docs (rule 15)
- CONTRIBUTING.md Sponsor-validation section: three-bullet bold-label
  em-dash list converted to a prose paragraph (rules 13, 14)
- patterns/README.md Pattern file Fields list: ten em dashes replaced
  with ' - ' separators, matching the README Experiments-section style
  already used elsewhere in the repo (rule 13)
- patterns/README.md Tag categories: ' — ' inside the section preamble
  replaced with '; '; the Special-tag em dash replaced with a
  parenthetical (rule 13)
- patterns/README.md Podcast tagging: three-bullet bold-label em-dash
  list converted to prose (rules 13, 14)
- patterns/README.md API list: two em dashes replaced with ' - '
- root README "What you control" list: four em dashes replaced with
  ' - ', matching the project's existing labeled-list style

Final em-dash count in all four touched docs: 0. Curly-quote count: 0.
No emojis, no AI-vocabulary words (delve/leverage/comprehensive/robust/
etc.), no inflated symbolism, no negative parallelisms. Content
unchanged; only stylistic AI-tell cleanup.
The backend already exposed GET / PUT /feeds/{slug}/tags but no UI
surfaced them, so user-added tags couldn't be set without curl. Adds
the missing pieces:

- GET /api/v1/tags/vocabulary - new endpoint returning the canonical
  49-tag vocabulary plus per-tag descriptions, grouped into
  podcast_genres + sponsor_industries + special_tags. The frontend
  picker needs the descriptions for tooltips and the grouping for the
  <optgroup>s in the add-tag dropdown.

- frontend/src/components/FeedTagsEditor.tsx - new component. Shows
  effective tags grouped by source (RSS / episode / user), each layer
  rendered as TagChips. User-added tags have an X button to remove
  them. An "+ Add tag" button opens a grouped <select> of the
  remaining vocabulary tags; selection auto-saves via
  setFeedUserTags.

- frontend/src/api/community.ts - getTagVocabulary() + TagVocabulary
  type.

- frontend/src/pages/FeedDetail.tsx - mounts the editor as a card
  directly above the Episodes section.

No tests added for the React component (project doesn't have a Jest /
RTL suite); the existing GET/PUT feed-tag API has integration coverage
in test_community_pattern_flow.py.
A side-by-side audit against IMPLEMENTATION_PLAN.md section 10 found
two UI surfaces that were partially built:

1. Submit-to-community / Protect-from-sync row actions existed in the
   mobile card layout but were missing from the desktop table. Added a
   ninth "Actions" column with the per-row buttons (Submit for local,
   Protect/Unprotect for community). Rebalanced colgroup widths so the
   table doesn't blow past 100%.

2. The Export control was a plain <a> that downloaded the whole local
   pattern DB; the plan called for "Opens a modal with a multi-select
   pattern list" (section 10, line 291). Added PatternExportDialog with
   a checkbox per row, Select-all toggle, optional include-disabled /
   include-corrections flags, and download via the existing
   /patterns/export endpoint. Extended that endpoint to accept
   ?ids=1,2,3 for selecting a subset.

The dialog initializes its selection from the currently-filtered
patterns prop. To stay clean of the React Compiler's
preserve-manual-memoization rule, the outer component remounts the
inner implementation on each open so useState's initializer re-syncs;
no useEffect prop->state writes.

Both buttons in the desktop table use the existing handlers
(handleSubmitToCommunity / handleToggleProtect) shared with the mobile
card path, so behavior is identical across breakpoints.
Findings from a /simplify audit of commits 82ff17e..8aa59b7 (the
changes made after the initial 2.4.0 simplify pass):

- Tag-vocabulary CSV was being parsed in three places (utils helper,
  one-shot vocabulary.json generator, and the new /tags/vocabulary
  endpoint). Extracted `utils.community_tags.vocabulary_payload()` and
  cached it with `@lru_cache(maxsize=1)`. The endpoint and the
  patterns/vocabulary.json regenerator now share one source of truth;
  vocabulary cannot drift between them.

- Endpoint discoverability: `/tags/vocabulary` was on sponsors.py
  because that's where the other tag-CRUD routes lived, but the
  endpoint has no sponsor coupling — moved to a new src/api/tags.py
  and registered it in the Blueprint import line. A frontend dev
  hunting "where is the tag vocab endpoint" will now find it on the
  first grep.

- Stringly-typed pattern sources: the TS side had `'local' |
  'community' | 'imported'` literals at ~9 sites. Added
  `PATTERN_SOURCES` const + `PatternSource` type in api/patterns.ts
  (mirrors the existing PATTERN_SOURCES frozenset in
  utils/community_tags.py). Existing call sites still typecheck
  unchanged.

- Vocabulary React Query staleTime: was 1h, but the vocabulary ships
  with the app image and can't change at runtime. Bumped to Infinity
  + gcTime: Infinity so we fetch once per page load, never refetch on
  focus.

- tools/ sys.path bootstrap: created src/tools/__init__.py so the
  workflow-style `python -m src.tools.X` path explicitly wires src/
  onto sys.path through the package init. The per-script
  sys.path.insert lines remain as a defensive fallback for direct
  `python script.py` invocation; they short-circuit when __init__.py
  has already done the work.

- Duplicate WHY-comment in schema.py (one at the position the reseed
  call _used to_ live in the migration block, one at the call site
  after Zyn cleanups). Kept the call-site comment, trimmed the upstream
  one to a single-line pointer.

- Dropped two narrating WHAT-comments (FeedDetail's
  "Feed-level tag editor" + PatternExportDialog's remount-trick
  re-explainer). The identifiers and JSX already say it.
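
The single-source-of-truth caching from the first finding can be sketched with `functools.lru_cache`. The inline CSV stands in for `src/seed_data/tag_vocabulary.csv`; its column names, rows, and the shape of the returned dict are assumptions for illustration.

```python
import csv
import io
from functools import lru_cache

# Stand-in for the real vocabulary CSV; columns and rows are assumed.
_VOCAB_CSV = """tag,category,description
technology,podcast_genres,Shows about technology
kids_family,podcast_genres,Shows for children and families
"""


@lru_cache(maxsize=1)
def vocabulary_payload():
    """Parse the vocabulary once and reuse the result for every caller."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(_VOCAB_CSV)):
        grouped.setdefault(row["category"], []).append(
            {"tag": row["tag"], "description": row["description"]}
        )
    return grouped
```

One caveat of this approach: every caller receives the same dict object, so callers should treat it as read-only or copy it before mutating.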

All 1138 backend tests still pass; frontend lint + build clean.
2.4.0 already shipped (Docker tag pushed, prod deployed). Subsequent
commits 82ff17e..097a8e6 added material runtime changes that should
not have ridden the same tag:

- FeedTagsEditor + GET /tags/vocabulary endpoint (c0c0680)
- Multi-select PatternExportDialog + /patterns/export ?ids= filter (8aa59b7)
- Per-row Submit / Protect buttons in the desktop pattern table (8aa59b7)
- Vocabulary caching + endpoint relocation (097a8e6)
- TS PATTERN_SOURCES const (097a8e6)
- community_sync version-stamp fix, CodeQL bulk-op coercion,
  SCHEMA_SQL drift fix, sponsor-reseed reorder (82ff17e, 4969436)

Bumping to 2.4.1 so the registry tag matches the running code. CHANGELOG
entry summarizes what's new in 2.4.1 vs 2.4.0 across Added / Changed /
Fixed. openapi.yaml info.version also bumped.

Going forward: every rebuild that ships new app code gets a fresh
versioned tag.
External audit (pastebin otter-wasp-eel) flagged 5 gaps after I merged
the v2.4.1 bump. All 5 fixed:

1. **Reviewer-trim now actually trims** instead of full-replacing the
   template. `rewrite_pattern_from_bounds` takes original + new bounds,
   computes the head/tail transcript slices, and splices them out of
   the existing template only when they appear at its start/end. The
   prior "Operation 2 full replace" risk (fitting the template to one
   episode's transcription) is gone. intro_variants / outro_variants
   get the same prefix/suffix trim treatment so they stay aligned.
   Returns False when neither head nor tail trim matches the template
   — defensive bail-out for misaligned input.

2. **Labeler workflow added.** `.github/labeler.yml` had the path-glob
   for `pattern` but no workflow invoked actions/labeler.
   `.github/workflows/labeler.yml` now wires it under
   `pull_request_target`.

3. **`vocabulary_version` is now read by the sync job.** sync_now
   compares `manifest['vocabulary_version']` against the app's value
   and emits a log warning + `summary['vocabulary_warning']` on
   mismatch. Informational only — vocabulary still ships with the app
   image. Bad / non-numeric values get a clean log warning instead of
   crashing the sync.

4. **`GITHUB_REPO` is now a single constant.** `community_export.py`
   and `community_sync.py` were hand-rolling separate copies of the
   upstream identity. New `GITHUB_REPO` + `COMMUNITY_MANIFEST_URL` in
   `utils/community_tags.py`; both call sites import from there.

5. **`extract_transcript_segment` no longer imported from api/.**
   `pattern_service.py` now imports `extract_text_in_range` from
   `utils.text` directly — the previous `from api import ...` was the
   wrong dependency direction (service layer leaning on api layer).
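
The informational version check from item 3 might look like the sketch below. The function name, the logger name, the warning text, and the treatment of a missing key are assumptions; the narrow `(TypeError, ValueError)` guard on the int cast mirrors what the commit describes.

```python
import logging

logger = logging.getLogger("community_sync")

# App-side vocabulary version; the concrete number is assumed here.
VOCABULARY_VERSION = 1


def check_vocabulary_version(manifest: dict, summary: dict) -> None:
    """Warn on a vocabulary mismatch without ever failing the sync."""
    raw = manifest.get("vocabulary_version")
    if raw is None:
        return  # Assumption: a manifest without the field is simply skipped.
    try:
        remote = int(raw)
    except (TypeError, ValueError):
        logger.warning("Ignoring non-numeric vocabulary_version %r", raw)
        return
    if remote != VOCABULARY_VERSION:
        message = f"manifest vocabulary_version {remote} != app {VOCABULARY_VERSION}"
        logger.warning(message)
        summary["vocabulary_warning"] = message
```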

A /simplify pass over the gap fixes then surfaced five more cleanups:

- `VOCABULARY_VERSION` + `MANIFEST_VERSION` were defined in
  `tools/generate_manifest.py` (a build-time CLI) and the runtime
  `community_sync.py` was importing them across that wrong-direction
  boundary. Moved both to `utils/community_tags.py` where the
  vocabulary lives. Both `generate_manifest.py` and `community_sync.py`
  now import from there. `vocabulary_payload()` also uses the constant
  instead of a hardcoded literal `1`.
- The broad `except Exception` around the version check is now scoped
  to `(TypeError, ValueError)` on the int cast.
- Dropped `MANIFEST_URL = COMMUNITY_MANIFEST_URL` (pure indirection
  alias, no external consumer).
- Dropped local `import json as _json` inside
  `rewrite_pattern_from_bounds` — module-level `import json` already
  in scope.
- Extracted `_splice_prefix` / `_splice_suffix` module helpers in
  `pattern_service.py`. Template trim and the two variant-array
  rewrites all call them — duplicated whitespace+case-insensitive
  splice logic became one tested pair.

1139 backend tests pass (one more than before from the new trim test);
frontend lint + build clean.
Four fixes pulled from Grafana/Loki logs on services03 while debugging a
14-minute pass-1 run on cordkillers-only-audio episode 98743f657890.

1. **Fingerprint slow-fallback timeout** — the full-file fpcalc call
   failed with "Invalid data found when processing input" (audio file
   has bytes fpcalc rejects but Whisper tolerates). The fallback then
   inherited the full 600-second timeout for per-window scanning,
   which uses the same fpcalc binary and produced 0 matches over 10
   minutes. New `FALLBACK_SLOW_TIMEOUT = 90` caps the fallback. Stage
   1 now skips fast on broken audio instead of stalling.

2. **`processing_timeouts._resolve` broken DB import** — imported
   `get_database` from the `database` package, but it actually lives
   in `api`. The ImportError was swallowed by the try/except, so
   user-configured `processing_soft_timeout_seconds` /
   `processing_hard_timeout_seconds` settings were silently shadowed
   by env-var / default fallbacks. Repaired the import path.

3. **`community_sync` 404 noise** — every 15-min background tick was
   logging a WARNING because the manifest URL points at main but
   `patterns/community/index.json` only exists on this feature
   branch. Downgraded the 404 case to INFO with an explanatory
   message; non-404 fetch failures still WARN.

4. **`set_podcast_tags` skips episode aggregation** when the incoming
   RSS tags are already a subset of the row's current union and
   user_tags isn't being touched. This is the dominant case on the
   refresh hot path (a cordkillers refresh with 338 episodes ran 338
   JSON parses every 15 min for nothing).

All 1139 tests still green. None of these required schema changes.
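
The short-circuit in item 4 reduces to a subset test. A minimal sketch, assuming `None` is the convention for "user_tags not being touched" (the function name and that convention are not taken from the codebase):

```python
from typing import Iterable, Optional


def can_skip_episode_aggregation(incoming_rss_tags: Iterable[str],
                                 current_union: Iterable[str],
                                 new_user_tags: Optional[Iterable[str]] = None) -> bool:
    """True when a refresh cannot change the podcast's effective tags."""
    # Any write to user_tags forces a full recompute.
    if new_user_tags is not None:
        return False
    # Incoming RSS tags already covered by the stored union: nothing to do.
    return set(incoming_rss_tags) <= set(current_union)
```

When this returns True, the per-episode JSON parsing can be skipped for that refresh.
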
…tern validator, 12 seed patterns

- Move per-pattern Submit-to-community into the Export dialog as a destination radio. Multi-select drives N prefilled-PR tabs in one click; community patterns are auto-excluded.
- New DELETE /api/v1/community-patterns/all endpoint plus a Settings button to wipe every source=community pattern (and its audio_fingerprints rows). Requires confirm:true.
- PR validator now rejects multi-sponsor blocks. Mirrors the export-side foreign-sponsor check; both now share find_foreign_sponsors() in community_export.py.
- Seed patterns/community/ with 12 curated patterns from a real instance export (Capital One, Carvana x2, Instacart, Kayak, Mint Mobile, Monday.com, Progressive, SimpliSafe, Squarespace, ThreatLocker, Zyn).
- Popup-blocker fix: open N blank tabs synchronously at click, redirect each once its submit returns.
- Misc: PATTERN_SOURCE_* named consts, downloadBlob reuse, ASCII docstrings, ['communitySync'] cache invalidation on purge, 4 new validator unit tests.
@github-actions

Community pattern validation

Rejected (12)

See Quality checks and Dedupe for what each gate enforces.

  • patterns/community/capital-one-44026a0f.json (sponsor: Capital One)
    • duplicates 44026a0f-9d7a-4f39-815e-5b12575b8107 (score=1.00)
  • patterns/community/carvana-a934bf9a.json (sponsor: Carvana)
    • duplicates a934bf9a-ebcc-4d64-b734-b82a6de7f25a (score=1.00)
  • patterns/community/carvana-f41cf617.json (sponsor: Carvana)
    • duplicates f41cf617-6ad0-4df5-a07a-a4daeb449697 (score=1.00)
  • patterns/community/instacart-b77d02bc.json (sponsor: Instacart)
    • duplicates b77d02bc-5ccb-465d-8ecc-c0f3c53efa3d (score=1.00)
  • patterns/community/kayak-3f2f65ff.json (sponsor: Kayak)
    • duplicates 3f2f65ff-c654-42ca-a724-d6dce1f45881 (score=1.00)
  • patterns/community/mint-mobile-17b2b4b8.json (sponsor: Mint Mobile)
    • duplicates 17b2b4b8-660e-4ab8-96c0-c51a0dc3163d (score=1.00)
  • patterns/community/monday-com-9e83a5f6.json (sponsor: Monday.com)
    • duplicates 9e83a5f6-097b-4378-8077-b9ccec49b389 (score=1.00)
  • patterns/community/progressive-1c07273d.json (sponsor: Progressive)
    • duplicates 1c07273d-4e1c-41fd-93dd-0455904ada87 (score=1.00)
  • patterns/community/simplisafe-c114bd7f.json (sponsor: SimpliSafe)
    • duplicates c114bd7f-39af-4100-826d-e9ed1346d76f (score=1.00)
  • patterns/community/squarespace-b052ed12.json (sponsor: Squarespace)
    • duplicates b052ed12-557d-4cdb-b24a-28d547dbd9bd (score=1.00)
  • patterns/community/threatlocker-6b1b16df.json (sponsor: ThreatLocker)
    • duplicates 6b1b16df-70f4-4cd2-9e3c-18d2559b3195 (score=1.00)
  • patterns/community/zyn-3c348177.json (sponsor: Zyn)
    • duplicates 3c348177-fdf5-4f4f-a3de-25f4e65bf591 (score=1.00)

See patterns/CONTRIBUTING.md for the full submission guide.

CI checks out the PR branch, so files added by the PR are already on disk
in patterns/community/ at run time. The validator's _load_existing_patterns()
sweep was picking them up, then dedupe() compared each new doc against the
identical-on-disk copy and rejected with score=1.00. The 12 seed patterns
in 2.4.4 hit this on first push.

Build pr_community_ids by reading each --pr-files arg and excluding any
existing row whose community_id matches before validate_doc() runs.
Regression test added.
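
The fix amounts to filtering the on-disk set by the PR's own `community_id` values before dedupe runs. A minimal sketch with a hypothetical helper name and simplified document dicts:

```python
def exclude_pr_patterns(existing, pr_docs):
    """Drop on-disk patterns that the PR itself adds or updates."""
    pr_community_ids = {
        doc["community_id"] for doc in pr_docs if doc.get("community_id")
    }
    return [row for row in existing if row.get("community_id") not in pr_community_ids]


# A CI checkout already contains the PR's files, so the naive sweep sees both.
on_disk = [
    {"community_id": "44026a0f-9d7a-4f39-815e-5b12575b8107", "sponsor": "Capital One"},
    {"community_id": "3c348177-fdf5-4f4f-a3de-25f4e65bf591", "sponsor": "Zyn"},
]
pr_docs = [
    {"community_id": "44026a0f-9d7a-4f39-815e-5b12575b8107", "sponsor": "Capital One"},
]
comparable = exclude_pr_patterns(on_disk, pr_docs)  # only the Zyn row remains
```

With the PR's own copy excluded, a new document is compared only against patterns that were already present, so it no longer matches itself at score=1.00.
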
@github-actions

Community pattern validation

Passed (12)

  • patterns/community/capital-one-44026a0f.json (sponsor: Capital One)
  • patterns/community/carvana-a934bf9a.json (sponsor: Carvana)
  • patterns/community/carvana-f41cf617.json (sponsor: Carvana)
  • patterns/community/instacart-b77d02bc.json (sponsor: Instacart)
  • patterns/community/kayak-3f2f65ff.json (sponsor: Kayak)
  • patterns/community/mint-mobile-17b2b4b8.json (sponsor: Mint Mobile)
  • patterns/community/monday-com-9e83a5f6.json (sponsor: Monday.com)
  • patterns/community/progressive-1c07273d.json (sponsor: Progressive)
  • patterns/community/simplisafe-c114bd7f.json (sponsor: SimpliSafe)
  • patterns/community/squarespace-b052ed12.json (sponsor: Squarespace)
  • patterns/community/threatlocker-6b1b16df.json (sponsor: ThreatLocker)
  • patterns/community/zyn-3c348177.json (sponsor: Zyn)

Validation passed. Ready for review.

See patterns/CONTRIBUTING.md for the full submission guide.

…[object Object] fix

The 2.4.4 submit-to-community flow opened one prefilled PR tab per pattern.
At scale that broke: out of 215 selected, only 8 tabs survived the popup
blocker, 20 forced JSON downloads, 187 returned 400s rendered as
'[object Object]' in the post-submit alert. None reached GitHub.

This release replaces the per-tab fan-out with a single bundle download
and adds two new endpoints to drive it:

- POST /api/v1/patterns/preview-export: dry-runs the quality gates and
  returns {ready, rejected[{id, sponsor, reasons}], counts}.
- POST /api/v1/patterns/submit-bundle: returns one downloadable JSON file
  containing every pattern that passed (format: minuspod-community-submission).
  The contributor commits it into their fork and opens one PR.

The PR-side validator and manifest builder both handle the new bundle
format natively (one validation per entry, one manifest entry per
contained pattern), so maintainers don't have to split files on merge.
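
Handling both file shapes through one iterator could look like this. The `format` and `patterns` key names are assumptions about the bundle layout (only the format string `minuspod-community-submission` appears in this PR), and the body is a sketch rather than the shared `iter_bundle_patterns` helper itself.

```python
BUNDLE_FORMAT = "minuspod-community-submission"


def iter_bundle_patterns(doc):
    """Yield each pattern dict from a bundle file or a single-pattern file."""
    if isinstance(doc, dict) and doc.get("format") == BUNDLE_FORMAT:
        # Assumed layout: bundled patterns live under a "patterns" list.
        yield from doc.get("patterns", [])
    else:
        # A plain pattern file is treated as a bundle of one.
        yield doc


single = {"community_id": "a", "sponsor": "Kayak"}
bundle = {"format": BUNDLE_FORMAT, "patterns": [single, {"community_id": "b", "sponsor": "Zyn"}]}
sponsors = [p["sponsor"] for p in iter_bundle_patterns(bundle)]
```

A validator or manifest builder that loops over this iterator runs once per contained pattern without needing to know which shape it was given.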

Other fixes:
- /community-patterns/sync returns 200 {status: no_manifest_yet} on
  upstream 404 instead of 502 (e.g. patterns feature still on a branch).
- apiRequest's extractErrorMessage helper normalizes {error: {message,
  reasons}} bodies to a string so failures no longer render as
  [object Object] anywhere in the UI.

Simplify pass folded BUNDLE_FORMAT + iter_bundle_patterns into
utils/community_tags.py (was duplicated 3 places + 2 string literals),
added Database.get_ad_patterns_by_ids() batch helper used by build_bundle,
collapsed downloadCommunityBundle to trigger its own browser download, and
dropped the redundant stage state in PatternExportDialog (now derived).
@github-actions

Community pattern validation

Passed (12)

  • patterns/community/capital-one-44026a0f.json (sponsor: Capital One)
  • patterns/community/carvana-a934bf9a.json (sponsor: Carvana)
  • patterns/community/carvana-f41cf617.json (sponsor: Carvana)
  • patterns/community/instacart-b77d02bc.json (sponsor: Instacart)
  • patterns/community/kayak-3f2f65ff.json (sponsor: Kayak)
  • patterns/community/mint-mobile-17b2b4b8.json (sponsor: Mint Mobile)
  • patterns/community/monday-com-9e83a5f6.json (sponsor: Monday.com)
  • patterns/community/progressive-1c07273d.json (sponsor: Progressive)
  • patterns/community/simplisafe-c114bd7f.json (sponsor: SimpliSafe)
  • patterns/community/squarespace-b052ed12.json (sponsor: Squarespace)
  • patterns/community/threatlocker-6b1b16df.json (sponsor: ThreatLocker)
  • patterns/community/zyn-3c348177.json (sponsor: Zyn)

Validation passed. Ready for review.

See patterns/CONTRIBUTING.md for the full submission guide.

…nsive parse

text_pattern_matcher passed json.dumps([intro]) into create_ad_pattern,
which json.dumps'd it again. Every auto-created pattern's variants got
stored as a JSON-string of a JSON-string. The 2.4.5 community bundle
pipeline parsed that as a string and exploded it character-by-character
(user's first bundle had intro_variants of length 196 starting with
['[', '"', 'E', 'm', ...]).

Fixes:
- Pass plain lists from text_pattern_matcher (root cause).
- _safe_parse_variants in community_export retries the decode when the
  first parse returns a string, so bundles built from existing broken
  DBs still produce clean output.
- One-shot _repair_double_encoded_variants migration in schema.py;
  re-encodes affected rows on next container start. Stamped via
  variant_reencode_revision setting, idempotent.
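
The double-encoding bug and the defensive parse can be reproduced in a few lines. The function below is a sketch in the spirit of `_safe_parse_variants`, not its actual body; returning an empty list for undecodable input is an assumption.

```python
import json


def safe_parse_variants(raw):
    """Decode a variants value to a list, tolerating one extra JSON layer."""
    value = raw
    for _ in range(2):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return []
    return value if isinstance(value, list) else []


intro = "Example intro"
# What the bug stored: a JSON string wrapped in another JSON string.
double_encoded = json.dumps(json.dumps([intro]))
# A single decode yields a str; iterating it splits into characters.
exploded = list(json.loads(double_encoded))
repaired = safe_parse_variants(double_encoded)
```

Here `exploded` begins with `'['` and `'"'`, the same character-by-character shape seen in the broken bundle, while `repaired` is the original one-element list.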

Also: dialog CLI snippet now suggests `gh pr create --fill --label pattern`
so the label gets requested directly. The labeler workflow still adds it
automatically on path match; this is belt-and-suspenders.

Tests: +2 in test_community_export (double-encoded repair + idempotent
on clean rows). 1084 unit tests pass.
@github-actions

Community pattern validation

Passed (12)

  • patterns/community/capital-one-44026a0f.json (sponsor: Capital One)
  • patterns/community/carvana-a934bf9a.json (sponsor: Carvana)
  • patterns/community/carvana-f41cf617.json (sponsor: Carvana)
  • patterns/community/instacart-b77d02bc.json (sponsor: Instacart)
  • patterns/community/kayak-3f2f65ff.json (sponsor: Kayak)
  • patterns/community/mint-mobile-17b2b4b8.json (sponsor: Mint Mobile)
  • patterns/community/monday-com-9e83a5f6.json (sponsor: Monday.com)
  • patterns/community/progressive-1c07273d.json (sponsor: Progressive)
  • patterns/community/simplisafe-c114bd7f.json (sponsor: SimpliSafe)
  • patterns/community/squarespace-b052ed12.json (sponsor: Squarespace)
  • patterns/community/threatlocker-6b1b16df.json (sponsor: ThreatLocker)
  • patterns/community/zyn-3c348177.json (sponsor: Zyn)

Validation passed. Ready for review.

See patterns/CONTRIBUTING.md for the full submission guide.

ttlequals0 merged commit 25abc35 into main on May 15, 2026
9 checks passed
ttlequals0 deleted the feat/community-patterns-tagging branch on May 15, 2026 19:22

2 participants