feat(publisher): additive inject lane for live solve-time/score experiments #269

Open
wallscaler wants to merge 3 commits into main from feat/inject-lane

@wallscaler

Contributor

What

A continuous, additive, env-gated challenge-injection lane that runs alongside the native refill loop. Native refill is untouched, so the board never empties even if this lane is off or stalls — best of both worlds. Default OFF (CATHEDRAL_INJECT_ENABLED); with the flag unset the publisher is byte-identical to today.
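A minimal sketch of the env gate described above. Only the variable name CATHEDRAL_INJECT_ENABLED comes from this PR; the helper name inject_enabled and the set of accepted truthy spellings are assumptions for illustration, not the code in inject.py.

```python
import os

# Assumed truthy spellings; the real lane may accept a different set.
_TRUTHY = {"1", "true", "yes", "on"}


def inject_enabled(env=None):
    """Return True only when the flag is explicitly set to a truthy value."""
    env = os.environ if env is None else env
    value = env.get("CATHEDRAL_INJECT_ENABLED", "")
    return value.strip().lower() in _TRUTHY
```

With the flag unset, inject_enabled() is False and the lane would never start, which is the default-OFF behaviour the description promises.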

Purpose: measure, on the live board, how unpredictable / harder SAT instances affect solve time and scores — without risking board availability.

Injected challenges

  • Isolated: distinct family_id (default gentest). Native refill counts/retires only synthetic_boolean_v1, so injected challenges never eat native slots and native never retires them.
  • Served identically: cnf_source='local', so they're served, HMAC-fetch-gated, witness-verified on submit, and scored exactly like native ones (one signed solve per (challenge, hotkey)).
  • Identifiable: challenge_id embeds the family label and still parses to the correct tier (sat-t{N}-random-3sat-gentest-<seed-hex>). Solve times can be isolated by filtering ids on -gentest-.
  • Unpredictable seed: secrets.randbits(63) (OS entropy) instead of the predictable sha256(utc_hour:tier:seq), with per-tier method/shape overrides to dial difficulty.
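The identifiability property can be illustrated with a small parser. This is a sketch under stated assumptions: the helper names tier_from_id and is_injected are hypothetical, and the regex only relies on the sat-t{N}- prefix and the -gentest- label shown in the id format above. The fallback to tier 1 mirrors the default the PR mentions for tier_from_challenge_id.

```python
import re

# Matches only the leading "sat-t{N}-" part of a challenge id.
_TIER_RE = re.compile(r"^sat-t(\d+)-")


def tier_from_id(challenge_id, default=1):
    """Parse the tier from the id prefix; fall back to `default` on failure."""
    match = _TIER_RE.match(challenge_id)
    return int(match.group(1)) if match else default


def is_injected(challenge_id, family="gentest"):
    """True when the id carries the injected family label."""
    return f"-{family}-" in challenge_id
```

A solve-time report could then split rows with is_injected(row_id) without touching any other column.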

⚠️ Moves real income

No zero-value mode — a solved injected challenge pays real tier weight. Enabling it on live shifts scores (that's the signal), so: keep targets small, announce, run bounded. Purely additive and reversible (unset flag → injected challenges age out).

Files

  • scaffold/publisher/inject.py: the lane (mirrors refill's env-gated pattern)
  • scaffold/publisher/app.py: seed_challenge gains optional family_id; lane wired into lifespan startup/shutdown
  • measure_inject.py: read-only solve-time / solver-spread report, inject vs native
  • inject_verify.py: gate proving isolation / identifiability / serve-parity
  • INJECT.md: env knobs, income caveat, difficulty ladder, run/measure/stop

Verification

  • python inject_verify.py → INJECT VERIFY PASS (isolation, identifiability, serve parity, non-interference)
  • Smoke-tested inject_once_async: 3 minted into gentest, native count untouched, all cnf_source='local', OS-entropy seeds.

🤖 Generated with Claude Code

wallscaler and others added 3 commits June 19, 2026 16:04
…iments

Adds a continuous, env-gated challenge-injection lane that runs ALONGSIDE the
native refill loop (default OFF). Native refill is untouched, so the board never
empties even if this lane is off or stalls — best of both worlds.

Injected challenges:
  * isolated under a distinct family_id (default 'gentest') so native refill
    never counts/retires them and they never eat native slots;
  * served/gated/verified/scored identically (cnf_source='local');
  * identifiable — challenge_id embeds the family label and still parses to the
    correct tier (sat-t{N}-random-3sat-gentest-<seed-hex>);
  * seeded from secrets.randbits(63) (OS entropy) instead of the predictable
    sha256(utc_hour:tier:seq), with per-tier method/shape overrides — so the
    experiment can isolate the effect of unpredictable / harder instances on
    live solve time and scores.

Note: there is no zero-value mode — solved injected challenges pay real tier
weight, so this moves live income. Keep targets small, announce, run bounded.

New:
  * scaffold/publisher/inject.py — the lane (mirrors refill's env-gated pattern)
  * measure_inject.py — read-only solve-time/solver-spread report, inject vs native
  * inject_verify.py — gate proving isolation/identifiability/serve-parity (PASS)
  * INJECT.md — what it does, env knobs, the income caveat, how to run/measure/stop
Changed:
  * scaffold/publisher/app.py — seed_challenge gains optional family_id; lane
    wired into lifespan startup/shutdown (env-gated, default off)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Paste-a-rung-and-measure guide: rung 0 isolates the seed (same shape as native),
rung 1 forces ajm at m/n≈4.26 to isolate hardness, rung 2+ scales n at a fixed
ratio. Notes the stop condition (solved% → 0 = overshoot) and the non-monotonic
threshold-3SAT caveat (P0 spike).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Review caught a blocker: inject_cid embedded the raw seed (sat-...-<seed:016x>),
and the planted model derives from random.Random(seed) — so anyone could read the
seed off the public board and reconstruct the answer with zero solving, defeating
the whole experiment.
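The leak can be reproduced in a few lines. This is an illustrative sketch only: planted_model is a hypothetical stand-in for however the generator derives the planted assignment from random.Random(seed), and the example seed is made up.

```python
import random


def planted_model(seed, n_vars):
    """Hypothetical stand-in: derive a planted assignment from the seed."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n_vars)]


secret_seed = 0x1234ABCD
# Pre-fix id shape: the raw seed is published as the hex suffix.
leaky_id = f"sat-t2-random-3sat-gentest-{secret_seed:016x}"

# An observer reads the suffix off the public id and recovers the seed.
recovered_seed = int(leaky_id.rsplit("-", 1)[1], 16)
same_answer = planted_model(recovered_seed, 20) == planted_model(secret_seed, 20)
```

Because random.Random is deterministic for a given seed, same_answer holds without any solving.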

Fixes:
* inject_cid suffix is now sha256(tier:family:seed)[:16] — a one-way hash. The
  seed is never published, stored, or logged (dropped seed_hex from the mint log).
  The served CNF is the only artifact and is unreproducible from public fields.
* family_is_safe(): the lane refuses the native family (synthetic_boolean_v1) and
  any non-slug family; inject_once_async fails closed (returns []) on a bad family
  so it can never collide with native counting/retirement on real emissions.
* collision pre-check moved off the event loop (asyncio.to_thread) — postgres
  getconn() blocks; matches every other DB call in the loop.
* inject_verify.py: new §0 SEED SECRECY proves the public id can't regenerate the
  CNF or the planted answer; §5 FAMILY GUARD; tier switched to 2 so the tier-parse
  check is meaningful (tier_from_challenge_id defaults to 1 on failure).
* INJECT.md: documents the opaque id, seed-secrecy, and the family guard.
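A sketch of the off-loop pre-check pattern named in the fixes above. The function names and the in-memory set are assumptions standing in for the real blocking postgres lookup; only asyncio.to_thread is taken from the commit message.

```python
import asyncio
import time


def id_exists_blocking(challenge_id, known_ids):
    """Stand-in for a blocking DB lookup (pooled connection + SELECT)."""
    time.sleep(0.01)  # simulate blocking I/O
    return challenge_id in known_ids


async def id_is_free(challenge_id, known_ids):
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop keeps serving other tasks in the meantime.
    exists = await asyncio.to_thread(id_exists_blocking, challenge_id, known_ids)
    return not exists
```

Calling id_exists_blocking directly inside a coroutine would stall every other task for the duration of the lookup, which is the problem the fix avoids.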

Note (verified, not a live bug): the native live board does NOT have this leak —
its challenge_id seed and served CNF are decoupled by the background pre-gen queue
(served CNF is seeded by time.monotonic(), unrelated to the id's mint_seed).
Confirmed by regenerating live CNFs from the public id suffix → sha256 mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@wallscaler

Contributor Author

Review findings addressed (commit 3426ecb)

1. Blocker — seed leaked in the challenge_id. Confirmed and fixed. The id suffix is now sha256(tier:family:seed)[:16] (one-way), not the raw seed. The seed is never published, stored, or logged. inject_verify.py §0 SEED SECRECY now proves the public board fields cannot regenerate the CNF or the planted answer:

[PASS] seed hex does not appear anywhere in the public id
[PASS] id suffix does not decode to the real seed
[PASS] CNF regenerated from the public id does NOT match the served CNF
[PASS] planted answer derived from the public id does NOT solve the served CNF
[PASS] the real planted answer (secret) does solve the served CNF
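A minimal sketch of the opaque id, assuming the hash input is the literal string "tier:family:seed" encoded as UTF-8. The exact serialization in inject.py is not shown in this PR, so treat the body of inject_cid below as an illustration rather than the shipped code.

```python
import hashlib
import secrets


def inject_cid(tier, family, seed):
    """Build an id whose suffix is a truncated one-way hash of the seed."""
    payload = f"{tier}:{family}:{seed}".encode()
    digest = hashlib.sha256(payload).hexdigest()
    return f"sat-t{tier}-random-3sat-{family}-{digest[:16]}"


seed = secrets.randbits(63)  # OS entropy; never published, stored, or logged
cid = inject_cid(2, "gentest", seed)
suffix = cid.rsplit("-", 1)[1]
```

The suffix is still deterministic for a given (tier, family, seed), so collisions can be checked, but int(suffix, 16) no longer yields the seed.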

2. Family isolation depends on one env choice. Fixed. family_is_safe() refuses the native family (synthetic_boolean_v1) and any non-slug value; inject_once_async fails closed (returns []). §5 FAMILY GUARD covers it.
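The guard can be sketched as follows. The name family_is_safe and the native family value come from this PR; the slug rule (lowercase letters, digits, underscores) and the mint_batch helper are assumptions made for the example.

```python
import re

NATIVE_FAMILY = "synthetic_boolean_v1"
# Assumed slug rule: lowercase alphanumerics and underscores only.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_]*")


def family_is_safe(family):
    """Reject the native family and anything that is not a plain slug."""
    return (
        isinstance(family, str)
        and family != NATIVE_FAMILY
        and _SLUG_RE.fullmatch(family) is not None
    )


def mint_batch(family, count):
    """Fail closed: an unsafe family yields no challenges at all."""
    if not family_is_safe(family):
        return []
    return [f"{family}-{i}" for i in range(count)]
```

Returning an empty list instead of raising keeps a misconfigured env value from either crashing the publisher or minting into the native family.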

3. Not difficulty calibration. Agreed — it's an additive paid experiment lane, not a calibrator. It does not estimate hardness pre-publish or use a reference solver-time band. INJECT.md frames it as an experiment with the income caveat; a calibrated difficulty ladder remains the separate open question.


On the native board (investigated, not a live bug)

The same seed→id math in this repo's refill.py is structurally leaky (int(suffix,16) == seed). But I verified the live board is not exploitable: I pulled real live ids and regenerated the CNFs from int(suffix,16) → sha256 mismatch on both tiers. Reason: live serves CNFs from the background pre-gen queue (seeded by time.monotonic()), which is decoupled from the id's mint_seed. So the public id's seed has nothing to do with the served CNF. Worth a separate hardening pass (make native ids opaque too, for defense-in-depth), but no live emergency.

inject_verify.py → INJECT VERIFY PASS (all 22 checks). The lane remains default-OFF.
