Small cleanup followups #47

Merged
anderdc merged 23 commits into test from refactor/post-v1-cleanup-followups
Apr 16, 2026

Conversation

@LandynDev
Collaborator

No description provided.

The axon handler is the only path into the pending_confirms queue and
already verifies tx sender against the reserved source address before
enqueueing. The second check in the forward-loop re-verify was dead
code — a confirmed on-chain tx can't swap senders out from under us.

Also drop the 2-tuple back-compat in SwapFulfiller.load_sent_cache.
The old shape never shipped to anyone who could still have a cache
file, and carrying the fallback clutters the restore path.

Split ChainProvider.verify_transaction into a private abstract
_fetch_matching_tx (chain-specific RPC/scan) and a concrete
verify_transaction on the base that layers two optional post-checks
every caller used to open-code: require_confirmed and expected_sender.
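The split described above can be sketched as a template method. The names ChainProvider, _fetch_matching_tx, verify_transaction, require_confirmed, and expected_sender come from the commit; the Tx shape and the None-on-rejection convention are illustrative assumptions:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tx:
    # Minimal stand-in for a provider's transaction record (illustrative).
    sender: str
    confirmed: bool


class ChainProvider(ABC):
    @abstractmethod
    def _fetch_matching_tx(self, tx_hash: str) -> Optional[Tx]:
        """Chain-specific RPC/scan; returns None when no matching tx exists."""

    def verify_transaction(
        self,
        tx_hash: str,
        require_confirmed: bool = False,
        expected_sender: Optional[str] = None,
    ) -> Optional[Tx]:
        tx = self._fetch_matching_tx(tx_hash)
        if tx is None:
            return None
        # Optional post-checks every caller used to open-code; the
        # rejection logs in the real base would live here, in one place.
        if require_confirmed and not tx.confirmed:
            return None
        if expected_sender is not None and tx.sender != expected_sender:
            return None
        return tx
```

Callers then pick the post-checks they need and test only for None, which is exactly the "3-rung ladder into one None check" collapse the commit describes.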

Callers collapse a 3-rung ladder into one None check:

  miner/fulfillment.verify_user_sent_funds — passes both params; the
  source tx must be confirmed and sent by the reserved user address.

  validator/axon_handlers.handle_swap_confirm — passes expected_sender
  only (unconfirmed txs still need to reach the queue branch). Keeps
  the inline confirmed check for the queue-vs-initiate decision.

SwapVerifier.verify_tx keeps its inline sender check: it uses strict
semantics (empty sender fails) while the base uses lenient (empty
sender allowed, matching the other two callers' pre-refactor behavior).
Promoting one to the other would be a security-posture change — out
of scope for a logic-preserving refactor.

Rejection logs move to the base so the defense is observable in one
place instead of duplicated at every call site. No functional change.

Exception and None branches both did the same retry dance — bump the
counter, resolve on limit, persist otherwise. Normalize exceptions to
None at the top of the loop and extract the counter into a small
_bump_null_retry helper so the two branches become one.
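The normalize-then-bump shape can be sketched as follows; the retry limit, the resolved/persisted bookkeeping, and the SwapTracker fields are illustrative assumptions, while _bump_null_retry is the name from the commit:

```python
def fetch_or_none(fetch):
    """Normalize provider exceptions to None at the top of the loop,
    so the exception branch and the None branch become one."""
    try:
        return fetch()
    except Exception:
        return None


class SwapTracker:
    """Sketch of the retry bookkeeping (limit and persistence illustrative)."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.null_retries: dict[str, int] = {}
        self.resolved: set[str] = set()

    def _bump_null_retry(self, swap_id: str) -> None:
        count = self.null_retries.get(swap_id, 0) + 1
        self.null_retries[swap_id] = count
        if count >= self.limit:
            self.resolved.add(swap_id)  # resolve on limit
        # otherwise the counter stays persisted and we retry next tick
```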

The base verify_transaction check was lenient: an empty/unparseable
sender from the provider (failed vin resolution, malformed extrinsic)
passed through without rejection. A crafted or broken tx could then
evade the user-snipes-miner defense because we couldn't prove who sent
it. Tighten to strict — if we can't identify the sender, we don't
approve the fund flow.

Miner fulfillment and axon confirm paths adopt the stricter check
automatically through the shared base. SwapVerifier.verify_tx now
delegates None/recipient/amount/sender to the base and keeps only the
rate-limited confirmations debug log (validator-polling specific),
extracted into a private helper. The defense lives in exactly one
place shared by miner, axon, and validator verification flows.
Commitment polling fires every 15 blocks (20x/hour); the prune DELETE
was running on every tick for no reason. Pruning is bounded-growth
hygiene, not correctness — the retention cutoff is 2x the scoring
window, so a once-per-scoring-round cadence is more than enough.

Moves prune_aged_rate_events into run_scoring_pass so it runs ~1x/hour
instead of ~20x/hour. Zero semantic change. Test updated to cover the
new contract: commitment polling does not prune, scoring pass does.

Three tied changes that together move rate sampling from coarse-grained
periodic to per-block precision:

1. read_miner_commitments now uses substrate query_map over the
   Commitments.CommitmentOf double map — one RPC round-trip per call
   regardless of miner count, instead of N separate queries in a
   for-loop. Turns ~5000 RPC/hour into ~300 RPC/hour at ~256 miners.

2. poll_commitments loses its 15-block interval gate. Cost is low enough
   now that sampling every forward step (~12s) is the natural cadence,
   giving the crown-time series ~1 block accuracy.

3. RATE_UPDATE_MIN_INTERVAL_BLOCKS throttle is gone. The validator no
   longer drops rate events that land within 75 blocks of the previous
   one — shorter inter-rate gaps become real crown intervals instead of
   getting collapsed. Miners can now post rate updates as often as the
   chain-level commitment cooldown allows and every change is attributed.
   Only dedupe kept: same-rate observations on the same direction.

Obsolete COMMITMENT_POLL_INTERVAL_BLOCKS, RATE_UPDATE_MIN_INTERVAL_BLOCKS,
and the orphaned last_commitment_poll_block field are all removed.
Tests that exercised the throttle are replaced with single-block-gap
and same-rate-dedupe coverage.
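The query_map shape in point 1 can be sketched as below. The call signature loosely follows the substrate-interface client's query_map; the substrate handle is injected and the decode of each commitment payload into a MinerPair is elided, so treat everything except the Commitments.CommitmentOf storage name and the error-swallowing contract as an assumption:

```python
def read_miner_commitments(substrate, netuid, registered_hotkeys):
    """One query_map round-trip regardless of miner count (sketch)."""
    try:
        pairs = substrate.query_map("Commitments", "CommitmentOf", params=[netuid])
    except (ConnectionError, TimeoutError):
        return []  # never raise into the forward loop
    commitments = []
    for hotkey, commitment in pairs:
        if hotkey not in registered_hotkeys:
            continue  # dereg'd hotkey whose commitment is still in storage
        commitments.append((hotkey, commitment))
    return commitments
```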

Crown-time reward replay now looks back 4 hours instead of 12. Event
retention auto-derives to 8h (2 * window). Shorter window makes the
incentive more responsive to recent behavior — a miner climbing with
a better rate sees it reflected in weights faster, and timed-out
swaps in the last few hours carry proportionally more weight in the
success-rate penalty.

SCORING_INTERVAL_STEPS (how often scoring runs) stays at 300 steps
(~1h) — independent from window size. Each scoring pass covers the
last 4h, EMA in update_scores smooths across passes.

swap_outcomes used to be all-time. A miner with a bad streak a year
ago would carry that penalty forever, and the table grew unbounded.

Add a rolling 30-day credibility window:

- get_all_time_success_rates → get_success_rates_since(block) with a
  WHERE resolved_block >= ? filter; calculate_miner_rewards passes
  self.block - CREDIBILITY_WINDOW_BLOCKS.
- prune_swap_outcomes_older_than runs from the scoring pass alongside
  the rate-events prune, bounding table growth.
- New idx_swap_outcomes_resolved_block index makes both the read and
  the prune O(log n) instead of scanning the table.
- Docstring on the store module updated: swap_outcomes is no longer
  all-time, it's a rolling credibility ledger.

Miners with no outcomes in the last 30 days still default to 1.0
(optimistic). Miners with bad history can rehabilitate by completing
swaps over the next month.
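The windowed read and prune can be sketched against sqlite. The table and column names (swap_outcomes, resolved_block) and the function names come from the commit; the exact schema, the success column, and the window constant value are assumptions:

```python
import sqlite3

CREDIBILITY_WINDOW_BLOCKS = 216_000  # ~30 days at 12 s/block (illustrative)


def get_success_rates_since(conn, since_block):
    """Per-miner success rate over outcomes resolved at/after since_block.
    Miners with no rows in the window default to 1.0 in the caller."""
    rows = conn.execute(
        """SELECT miner_hotkey,
                  AVG(CASE WHEN success THEN 1.0 ELSE 0.0 END)
             FROM swap_outcomes
            WHERE resolved_block >= ?
            GROUP BY miner_hotkey""",
        (since_block,),
    ).fetchall()
    return dict(rows)


def prune_swap_outcomes_older_than(conn, cutoff_block):
    # Bounds table growth; idx_swap_outcomes_resolved_block makes both
    # this and the windowed read O(log n) instead of full scans.
    conn.execute("DELETE FROM swap_outcomes WHERE resolved_block < ?", (cutoff_block,))
```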

Functions taking validator, event_watcher, commitment, swap, and other
class instances had no type annotations, so IDEs showed them as Any
and editor navigation broke. Annotate the highest-traffic ones:

- axon_handlers.py: validator (Validator), commitment (MinerPair),
  synapse (bt.Synapse) on all handler + helper functions.
- forward.py: event_watcher (ContractEventWatcher) on
  replay_crown_time_window, item (PendingConfirm) on
  try_extend_reservation.
- dendrite_lite.py: contract_client (AllwaysContractClient), synapse
  (bt.Synapse), axons (List[bt.AxonInfo]).
- utils/rate.py: swap (Swap) on expected_swap_amounts.

Validator and event_watcher use string forward refs + TYPE_CHECKING
to avoid circular imports. Zero runtime change.
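The forward-ref pattern looks like this; the module path in the guarded import and the handler body are illustrative, while the string-annotation plus TYPE_CHECKING technique is the one the commit uses:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by the type checker, so the circular import between the
    # handler module and Validator never happens at runtime.
    from allways.validator import Validator  # module path is illustrative


def handle_swap_confirm(validator: "Validator", payload: dict) -> dict:
    # The quoted annotation is just a string at runtime; IDEs and type
    # checkers still resolve it to the real class.
    return {"hotkey": payload.get("hotkey")}
```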

The crown-time replay used to credit any active miner that held the best
rate, even while that miner was mid-swap and couldn't actually serve
new users. A solo best-rate miner could never leak credit because the
only eligibility gate was the active flag.

Teach ContractEventWatcher to track open-swap counts per miner:

- SwapInitiated increments the count (+1 BusyEvent)
- SwapCompleted / SwapTimedOut decrements the count (-1 BusyEvent)
- count > 0 means the miner is busy and should be excluded from crown

get_busy_events_in_range and get_busy_miners_at feed the crown-time
replay (next commit). initialize now also queries
contract_client.get_active_swaps() to seed busy state at cold start so
a swap at watcher startup isn't treated as idle until their next
terminal event.

Pruning is conservative — open intervals are kept even when their
start events age past the retention cutoff so a long-running swap
doesn't get silently forgotten mid-flight.

Only touches the watcher and its unit tests; scoring replay wiring
follows in a separate commit.
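The counting rule can be sketched as a small tracker; the class and method names here are illustrative, while the increment/decrement semantics and the cold-start seeding follow the commit:

```python
from collections import defaultdict


class BusyTracker:
    """Open-swap counts per miner (sketch of the watcher bookkeeping)."""

    def __init__(self):
        self.open_swaps = defaultdict(int)

    def seed(self, active_swaps):
        # Cold start: count swaps already in flight so a miner serving a
        # swap at startup isn't treated as idle.
        for hotkey in active_swaps:
            self.open_swaps[hotkey] += 1

    def on_swap_initiated(self, hotkey):
        self.open_swaps[hotkey] += 1  # +1 busy event

    def on_swap_terminal(self, hotkey):
        # SwapCompleted and SwapTimedOut both close an interval; floor at
        # zero in case a terminal event arrives for an unseen initiate.
        self.open_swaps[hotkey] = max(0, self.open_swaps[hotkey] - 1)

    def is_busy(self, hotkey):
        return self.open_swaps[hotkey] > 0
```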

Thread busy-interval exclusion through replay_crown_time_window and
crown_holders_at_instant. A miner with an open swap at a given block
no longer earns crown time for that interval — the tied best rate
among idle eligible miners wins instead, flowing credit to the next-
best runner-up. When no eligible miner is idle the interval recycles.

replay_crown_time_window now merges a third event stream (busy starts
and ends from the watcher) alongside rates and collateral, seeds the
running busy count from watcher.get_busy_miners_at(window_start) for
pre-window swaps that straddle the window edge, and derives the busy
set from the running count on each interval. Ordering at coincident
blocks is busy (0) < collateral (1) < rate (2) so that a reservation
at the same block as a rate change takes effect first.

crown_holders_at_instant gains an optional busy: Set[str] filter.

Test coverage:
- crown_holders_at_instant: busy-excluded and all-busy cases
- replay: runner-up earns during busy interval, solo miner goes busy
  and pool recycles, pre-window initiate reconstructed correctly

read_miner_commitments had no direct unit test — the refactor from
N-RPC loop to query_map went in untested except via E2E suites.
Add four cases mocking substrate.query_map:

- Parses every registered miner's commitment into a MinerPair
- Drops dereg'd hotkeys whose commitment is still in storage
- Regression guard: exactly one query_map call, zero single-query calls
- ConnectionError/TimeoutError during query_map returns an empty list
  instead of raising into the forward loop

The merged replay stream was a List[Tuple[int, int, str, str, float]]
with stringly-typed 'kind' dispatch — unreadable. Replace with:

- EventKind IntEnum whose values ARE the coincident-block sort order
  (BUSY=0 < COLLATERAL=1 < RATE=2) so the ordering invariant is baked
  into the type instead of hidden in a magic-number tuple field.
- ReplayEvent dataclass with block/hotkey/kind/value + a sort_key
  property. Polymorphic value documented per kind.
- _reconstruct_window_start_state and _merge_replay_events extracted
  as small helpers with narrow signatures.
- replay_crown_time_window's body becomes three numbered steps, each
  a single call. The inner loop reads: credit previous span, apply
  this event, advance cursor.

Behavior is identical — same ordering invariants, same crown
attribution math. Every existing test in test_scoring_v1.py passes
unchanged (264 total).
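The enum-as-sort-order idea can be sketched as below. EventKind, ReplayEvent, the field names, and the BUSY < COLLATERAL < RATE ordering come from the commit; the sort_key tuple shape is an assumption:

```python
from dataclasses import dataclass
from enum import IntEnum


class EventKind(IntEnum):
    # The enum values ARE the coincident-block sort order: a reservation
    # (busy) at the same block as a rate change takes effect first.
    BUSY = 0
    COLLATERAL = 1
    RATE = 2


@dataclass
class ReplayEvent:
    block: int
    hotkey: str
    kind: EventKind
    value: float  # polymorphic per kind (delta, amount, or rate)

    @property
    def sort_key(self):
        return (self.block, int(self.kind))
```

Sorting the merged stream by sort_key then replaces the old magic-number tuple field, and the ordering invariant lives in the type.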

The previous form computed the right answer via a dict comprehension
plus max() but didn't narrate the fallback behavior: when the best-rate
miner is disqualified, credit walks down to the next-best rate that
has a qualified miner. The rule should read like the rule.

- Bucket hotkeys by rate
- Walk best rate → worst
- First bucket with any qualifying miner wins the crown
- Ties at that rate split credit (caller splits interval duration)

Pure readability refactor — same candidates, same tie behavior.
264 existing tests pass unchanged, including the busy-best-rate-loses
case that already covered the fallthrough semantics.
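The bucket-and-walk rule above can be sketched directly; the function name and the assumption that a higher rate is better are illustrative, while the fallthrough and tie semantics follow the commit:

```python
from collections import defaultdict
from typing import Dict, List, Set


def crown_holders(rates: Dict[str, float], qualified: Set[str]) -> List[str]:
    """Walk best rate to worst; first bucket with any qualifying miner
    wins the crown, and ties at that rate split the credit."""
    buckets: Dict[float, List[str]] = defaultdict(list)
    for hotkey, rate in rates.items():
        buckets[rate].append(hotkey)  # bucket hotkeys by rate
    for rate in sorted(buckets, reverse=True):  # best -> worst (higher is better here)
        winners = [h for h in buckets[rate] if h in qualified]
        if winners:
            return winners  # caller splits the interval duration among ties
    return []  # no qualifying miner at any rate: interval recycles
```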

Rename every underscore-prefixed helper introduced on this branch to
plain names. The convention signalled "private" to nobody who wasn't
already reading the file, and it clutters call sites with pseudo-
access-modifier noise the codebase doesn't use elsewhere.

Renames:
- ChainProvider._fetch_matching_tx    -> fetch_matching_tx
- BitcoinProvider / SubtensorProvider concrete overrides follow.
- SwapVerifier._log_confs_progress    -> log_confs_progress
- swap_tracker._bump_null_retry       -> bump_null_retry
- forward._reconstruct_window_start_state -> reconstruct_window_start_state
- forward._merge_replay_events        -> merge_replay_events

Behavior unchanged, 264 tests green.

forward.py was 738 lines mixing five concerns: the per-step forward loop,
commitment polling, pending-confirm drain, fulfillment/timeout logic, and
the entire crown-time scoring algorithm. The scoring code is self-
contained and significantly different in texture, so move it out:

- New allways/validator/scoring.py holds every crown-time symbol:
  run_scoring_pass, prune_aged_rate_events, prune_stale_swap_outcomes,
  calculate_miner_rewards, success_rate, EventKind, ReplayEvent,
  reconstruct_window_start_state, merge_replay_events,
  replay_crown_time_window, crown_holders_at_instant.

- forward.py keeps the actual forward step and its direct helpers
  (commitment poll, pending reservations, fulfillment confirm/extend/
  timeout). Organized under ─── section comments that match the step
  numbers in the forward() body.

- forward() rewritten with blank-line separated numbered phases, each
  phase prefaced with a one-line inline comment saying what it does
  and why the ordering matters. Reads as a checklist instead of a
  dense call graph.

Tests updated: scoring symbols import from scoring now instead of
forward. 264 tests pass unchanged.

``state[2]`` and ``(to_tx_hash, to_tx_block, False)`` were unreadable —
positional tuple access with a magic-number index and no names for
three distinct fields. Wrap it in a SentSwap dataclass with named
attributes (to_tx_hash, to_tx_block, marked_fulfilled) and update:

- load_sent_cache / save_sent_cache use named construction + named
  attribute serialization instead of index juggling
- process_swap renames the local from ``state`` to ``sent`` so it reads
  like English: ``if sent and sent.marked_fulfilled``, not
  ``if state is not None and state[2]``
- The three-state lifecycle (no record / sent but not marked /
  marked) is documented in the process_swap docstring so a reader can
  match each branch in the body to the named state it handles
- marked_fulfilled flip is now an in-place mutation on the cached
  dataclass instead of replacing the tuple

Behavior unchanged, 264 tests green.
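The dataclass shape can be sketched as below; the field names come from the commit, while the serialization helpers are an illustrative stand-in for whatever load_sent_cache / save_sent_cache actually write:

```python
from dataclasses import asdict, dataclass


@dataclass
class SentSwap:
    """Named replacement for the (to_tx_hash, to_tx_block, marked_fulfilled)
    tuple, covering the three-state lifecycle: no record / sent but not
    marked / marked."""
    to_tx_hash: str
    to_tx_block: int
    marked_fulfilled: bool = False


def serialize(sent: SentSwap) -> dict:
    return asdict(sent)  # named attributes instead of index juggling


def deserialize(raw: dict) -> SentSwap:
    return SentSwap(**raw)
```

With a named local like ``sent``, the guard reads ``if sent and sent.marked_fulfilled`` instead of ``if state is not None and state[2]``.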

Two paired changes that fix the post-restart scoring gap and drop
redundant history buffering:

1. Event watcher initialize() now rewinds the cursor to
   ``max(0, head - SCORING_WINDOW_BLOCKS)`` instead of starting at head.
   The existing bounded sync_to catches up at 50 blocks per forward step,
   so a full window of 1200 blocks backfills over ~24 forward steps
   (~5 min). The first scoring pass runs at step 300 (~1h), long after
   catch-up — no degraded window after a restart. Module docstring was
   already promising this behavior; code now actually does it.

2. EVENT_RETENTION_BLOCKS drops from 2x scoring window to 1x. The 2x
   buffer existed so get_latest_*_before(window_start) could always
   find state from BEFORE the window opens. Replace the buffer with a
   smarter prune: always preserve the latest row per (hotkey, direction)
   regardless of age, so a miner who posts once and never updates still
   has an anchor row for state reconstruction.

Prune fixes:
- state_store.prune_events_older_than: SQL now keeps MAX(id) per
  (hotkey, from_chain, to_chain) group; only rows below cutoff AND
  not-the-latest get deleted.
- event_watcher.prune_old_collateral_events: cutoff halved, latest
  event per hotkey preserved in both the flat list and the per-hotkey
  index.
- Busy event prune unchanged (still only drops closed intervals).

Tests updated for new cursor math and the preserve-latest prune
semantics. 267 pass (264 old + 3 new prune-edge tests).
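The preserve-latest prune SQL can be sketched against sqlite. The (hotkey, from_chain, to_chain) grouping and the keep-MAX(id) rule come from the commit; the rate_events table name and schema are assumptions:

```python
import sqlite3


def prune_events_older_than(conn, cutoff_block):
    """Delete aged rows, but always keep the newest row per
    (hotkey, from_chain, to_chain) group as a state-reconstruction anchor."""
    conn.execute(
        """DELETE FROM rate_events
            WHERE block < ?
              AND id NOT IN (SELECT MAX(id)
                               FROM rate_events
                              GROUP BY hotkey, from_chain, to_chain)""",
        (cutoff_block,),
    )
```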

Both constants were redundant aliases for SCORING_WINDOW_BLOCKS:

- EVENT_RETENTION_BLOCKS was already `= SCORING_WINDOW_BLOCKS`. The prune
  helper now references the window directly.
- SCORING_INTERVAL_STEPS was 300 (~1h), pruning the same 4h window 4×
  per cycle. Aligned to the window so scoring runs once per window, and
  forward.py reads the constant directly instead of a step-count alias.
- purge_expired_pending → purge_expired_pending_confirms (the table is
  named pending_confirms; bare "pending" left ambiguous what was expiring).
- run_scoring_pass → score_and_reward_miners.
- prune_aged_rate_events → prune_rate_events.
- prune_stale_swap_outcomes → prune_swap_outcomes.

Reorder the scoring pass so reward calculation runs first and pruning
happens after — pruning is hygiene, never block the math.

Drop the `if len(miner_uids) > 0 and len(rewards) > 0` guard in
score_and_reward_miners. update_scores already short-circuits on empty
rewards/uids in neurons/base/validator.py.
… tracker

When the source tx is already confirmed we're about to vote_initiate this
same tick — don't waste an extrinsic on an extend that's about to be
superseded. Move try_extend_reservation into the not-confirmed branch.

Inline EXTEND_THRESHOLD_BLOCKS inside SwapTracker.get_near_timeout_fulfilled
instead of taking it as a parameter — every caller passed the same constant.

The tracker's voted_ids only covers terminal confirm/timeout votes, so
each forward step was re-sending vote_extend_timeout / vote_extend_reservation
until the contract returned AlreadyVoted — wasted gas on every tick during
an extension window.

Two new caches:

- SwapTracker.extend_timeout_voted_at: swap_id → timeout_block we voted under.
  is_extend_timeout_voted self-clears once the contract bumps the swap past
  the voted value, so the next round opens cleanly.
- Validator.extend_reservation_voted_at: (miner_hotkey, from_tx_hash) →
  reserved_until at vote time. Same self-clearing rule. Stale entries are
  dropped at the top of initialize_pending_user_reservations when their
  pending_confirm is gone.

AlreadyVoted responses now feed back into the cache instead of just being
swallowed, so a desync recovers on the next observation rather than burning
extrinsics.
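The self-clearing rule shared by both caches can be sketched as below; the class name and internal dict are illustrative, while is_extend_timeout_voted, the voted-under value, and the AlreadyVoted feedback path follow the commit:

```python
class ExtendTimeoutVotes:
    """Remember the timeout_block we voted under; once the contract bumps
    the swap past it, the entry clears itself so the next round opens
    cleanly (sketch)."""

    def __init__(self):
        self.voted_at: dict[str, int] = {}

    def record_vote(self, swap_id: str, timeout_block: int) -> None:
        # Also called when the contract answers AlreadyVoted, so a
        # desynced cache recovers on the next observation.
        self.voted_at[swap_id] = timeout_block

    def is_extend_timeout_voted(self, swap_id: str, current_timeout: int) -> bool:
        voted_under = self.voted_at.get(swap_id)
        if voted_under is None:
            return False
        if current_timeout > voted_under:
            del self.voted_at[swap_id]  # contract moved past our vote
            return False
        return True
```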

Code reads like a book — the function and section headers carry the
narrative, so block comments restating what the next line does just add
noise. Scrub the numbered phase comments off forward(), the long
docstrings on helpers whose names already describe them, and the heavy
narrative on classes. Keep comments only where the WHY is non-obvious
(WAL/busy_timeout ordering, EventKind tie-breaking, prune-anchor
preservation, why a delta floors at zero).

@anderdc merged commit c1c512f into test Apr 16, 2026
2 checks passed