
chore(defaults): refresh RPC/LCD endpoints (2026-05-02)#28

Merged
Sentinel-Bluebuilder merged 1 commit into master from
feat/refresh-rpc-endpoints-2026-04-30
May 2, 2026


@Sentinel-Bluebuilder
Owner

Why

rpc.sentinel.co:443 was reporting catching_up: false while serving ABCI state ~22k blocks behind tip. Consumers querying balances against a known funded address (sent1uav3z70yynp4jnt39c6pg3d6ujw78m52v2h7gs, expected 10000000000 udvpn) saw 0. This shipped through to the Plan Manager wallet UI as "0 P2P balance" for funded operators, and any other SDK consumer relying on the default RPC list was equally broken.

The root cause is a class of failure: /status lies — a Tendermint node can report catching_up: false and still serve stale ABCI state. The only reliable health check is to query a known funded address and verify the balance matches an expected value.
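
The distinction between the sync flag and actual state can be sketched as a small classifier. This is an illustrative sketch, not the audit script's code: the `probe` object shape and the `classifyEndpoint` name are assumptions, and the expected balance is the figure quoted above.

```javascript
// Hypothetical probe shape (not taken from the SDK):
//   catchingUp: the sync flag reported by /status
//   balance:    udvpn balance (string) returned for the known funded address
function classifyEndpoint(probe, expectedBalance) {
  if (probe.catchingUp) return 'syncing';
  // /status claims the node is synced, so the returned state is the only
  // remaining evidence. A mismatch here is the "/status lies" case.
  if (BigInt(probe.balance) !== BigInt(expectedBalance)) return 'stale-state';
  return 'healthy';
}

const EXPECTED_UDVPN = '10000000000'; // expected balance quoted in this PR

const staleNode = { catchingUp: false, balance: '0' };
const goodNode = { catchingUp: false, balance: '10000000000' };

console.log(classifyEndpoint(staleNode, EXPECTED_UDVPN)); // stale-state
console.log(classifyEndpoint(goodNode, EXPECTED_UDVPN)); // healthy
```

A check that stops at `catchingUp` would have marked both nodes above as healthy.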

What

  • defaults.js — replace RPC_ENDPOINTS (5 → 13) and LCD_ENDPOINTS (4 → 6) with audited, latency-sorted lists. rpc.sentinel.co:443 and lcd.sentinel.co are kept last as "stale fallback" so consumers don't lose them entirely, but every other endpoint precedes them. LAST_VERIFIED bumped to 2026-05-02T00:00:00Z.
  • tools/audit-rpc-endpoints.mjs (new) — end-to-end audit script: connect → /status → ABCI bank balance → latency. Imports RPC_ENDPOINTS from defaults.js so the SDK list is the canonical candidate source; outputs paste-ready entries. Run before each release.
  • ai-path/FAILURES.md — Quick Rule 43 + CHAIN section C17 entry documenting the failure mode and the audit-script prevention rule (per the SDK PR-workflow rule: every code fix ships its doc update in the same diff).
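
The audit flow described above (connect → /status → balance → latency, then sort) might look roughly like the following. This is a sketch under assumptions, not the contents of tools/audit-rpc-endpoints.mjs: `probe` is a hypothetical caller-supplied function that connects, reads /status, and queries the bank balance, and the `now` parameter exists only so the timing can be stubbed.

```javascript
// Sketch only. `probe(url)` is a hypothetical async function that resolves to
// { catchingUp, balance } or throws if the endpoint cannot be reached.
async function auditEndpoints(endpoints, probe, expectedBalance, now = () => performance.now()) {
  const results = [];
  for (const url of endpoints) {
    const start = now();
    try {
      const { catchingUp, balance } = await probe(url);
      const ms = Math.round(now() - start);
      // Healthy means synced according to /status AND serving the right balance.
      const healthy = !catchingUp && String(balance) === String(expectedBalance);
      results.push({ url, ms, healthy });
    } catch (err) {
      // Unreachable endpoints are recorded but never counted as healthy.
      results.push({ url, ms: null, healthy: false });
    }
  }
  // Keep healthy endpoints only, fastest first.
  return results.filter((r) => r.healthy).sort((a, b) => a.ms - b.ms);
}

// Usage (with a real probe supplied by the caller):
//   auditEndpoints(RPC_ENDPOINTS.map((e) => e.url), probe, '10000000000')
//     .then((list) => console.log(list));
```

Endpoints are probed sequentially here so that one slow candidate does not distort another's latency measurement; a parallel run would be faster but noisier.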

Audit results (2026-05-02)

12 / 22 candidates healthy, sorted by latency:

| Latency (ms) | Endpoint |
| --- | --- |
| 125 | rpc-sentinel.busurnode.com |
| 459 | sentinel-rpc.publicnode.com |
| 470 | rpc.trinitystake.io |
| 643 | rpc.sentinel.validatus.com |
| 666 | sentinel-rpc.polkachu.com |
| 920 | rpc.dvpn.roomit.xyz |
| 923 | rpc.sentinel.quokkastake.io |
| 962 | rpc.sentinel.suchnode.net |
| 1035 | rpc-sentinel.chainvibes.com |
| 2323 | rpc.sentineldao.com |
| 2380 | rpc.mathnodes.com |
| 3935 | rpc.sentinel.chaintools.tech |

Failed (excluded): rpc.sentinel.co (stale ABCI), Notional, BadgerBite, ValidatorNode, Stakewolle, Decloud Nodes Lab, dvpn.me, ro.mathnodes, Noncompliant, Quasar.

Consumer-app trace (Plan Manager)

Desktop/plans (Plan Manager) wallet UI returned 0 P2P for the configured operator wallet. Trace landed in lib/chain.js's createRpcQueryClientWithFallback() → first endpoint = rpc.sentinel.co:443 → balance 0 returned. Plan Manager has applied a temporary local mutation (removeRpcEndpoint + addRpcEndpoint) which will be reverted once this PR ships, so the SDK's defaults are the only list of record.

Test plan

  • node tools/audit-rpc-endpoints.mjs — 12/22 healthy, deduped against SDK list, paste-ready output stamped today's date
  • node -e "import('./defaults.js').then(d => console.log(d.RPC_ENDPOINTS[0]))" — first entry is busurnode (125ms)
  • Plan Manager smoke: with the temporary local patch removed, balance query against funded address returns 10000000000 (10000.00 P2P)
  • Reviewer sanity: re-run node tools/audit-rpc-endpoints.mjs on a fresh checkout to confirm list still passes
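
For reference, the smoke-test figure (10000000000 udvpn displayed as 10000.00 P2P) implies six decimal places. A minimal conversion sketch follows; the `formatUdvpn` name is hypothetical and the function truncates rather than rounds.

```javascript
// Assumes 1 P2P = 1,000,000 udvpn, consistent with the
// 10000000000 udvpn = 10000.00 P2P figure in the test plan.
function formatUdvpn(udvpn, fractionDigits = 2) {
  const MICRO = 1000000n;
  const amount = BigInt(udvpn); // BigInt avoids float precision loss on large balances
  const whole = amount / MICRO;
  const frac = (amount % MICRO).toString().padStart(6, '0').slice(0, fractionDigits);
  return fractionDigits > 0 ? `${whole}.${frac}` : `${whole}`;
}

console.log(formatUdvpn('10000000000')); // 10000.00
```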

The bundled `RPC_ENDPOINTS` was 5 entries verified 2026-03-08, with
`rpc.sentinel.co` first. As of 2026-05-02 that node has been ~22k blocks
behind tip while still reporting `catching_up: false` -- so consumers
calling `createRpcQueryClient()` without an explicit URL got a node that
appeared healthy via /status but served stale ABCI state on every read.
Concrete failure: a Plan Manager wallet endpoint returning 0 P2P for an
address that holds 10,000 P2P, traced through the SDK fallback to
rpc.sentinel.co.

Audited 22 candidate endpoints (existing list + cosmos chain-registry +
suchnode) end-to-end: connect, /status sync flag, AND ABCI bank balance
query against a known funded address. 12 are healthy and serving correct
state. New ordering is by measured latency.

Changes:
- `defaults.js`: replace RPC_ENDPOINTS (5 -> 13) and LCD_ENDPOINTS (4 -> 6)
  with the audit results. Stamped verified=2026-05-02. `rpc.sentinel.co`
  and `lcd.sentinel.co` kept LAST as fallback (some integrators hardcode
  them); flagged in name as "stale fallback".
- `tools/audit-rpc-endpoints.mjs`: new script that any contributor or CI
  job can run before a release. Tests connect + sync + balance correctness
  (not just /status -- that's exactly the check that lied). Outputs a
  paste-ready RPC_ENDPOINTS block.
- `ai-path/FAILURES.md`: new entry C17 + Quick Rule 43 documenting the
  "/status lies" failure mode so future builders verify ABCI correctness,
  not just sync flag.

No API changes. Existing `addRpcEndpoint`/`removeRpcEndpoint`/`setEndpoints`/
`optimizeEndpoints` continue to work as before. Apps that override
endpoints at runtime are unaffected.

Tested:
- `node tools/audit-rpc-endpoints.mjs` -> 12/22 healthy, list matches
  what's now in defaults.js
- `import('./defaults.js')` smoke test: DEFAULT_RPC resolves to
  rpc-sentinel.busurnode.com (fastest verified, 125ms)
- Patched Plan Manager (the consumer that hit the bug) confirmed wallet
  endpoint now returns correct 10,000 P2P balance after the SDK list is
  reordered via addRpcEndpoint() at startup.
Sentinel-Bluebuilder added a commit to Sentinel-Bluebuilder/sentinel-node-tester that referenced this pull request May 2, 2026
…026-05-02)

rpc.sentinel.co was reporting `catching_up: false` while serving ABCI state
~22k blocks behind tip. Audit-script run against a known funded address
(sent1uav3z70yynp4jnt39c6pg3d6ujw78m52v2h7gs) returned `0 udvpn` from
sentinel.co while every other endpoint returned the correct balance — the
exact failure mode that breaks balance/feegrant flows on the tester.

  - core/constants.js: replace 4-entry RPC_ENDPOINTS with 13 audited entries
    (busurnode first, sentinel.co kept last as stale-fallback). Default RPC
    flips to busurnode (~125ms). LCD list refreshed similarly.
  - core/tkd-bridge.js: defer to RPC_ENDPOINTS in constants.js so future
    refreshes only land in one place.
  - server.js: replace two hardcoded `rpc.sentinel.co:443` SigningStargate
    connects (initial balance + periodic refresh) with a connectWithRpcFailover
    helper that walks RPC_ENDPOINTS in order.
  - .env.example, README.md: flip RPC default to busurnode and document why.
  - scripts/audit-rpc-endpoints.mjs: end-to-end audit (connect + /status +
    ABCI bank balance + latency). Run before each release.
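
A failover helper of the kind described for server.js could be sketched as below. Only the name `connectWithRpcFailover` comes from the commit message; the signature, the injected `connect` function, and the return shape are assumptions made so the example runs without a chain library.

```javascript
// Sketch only. `connect(url)` is injected by the caller; in the tester it
// would presumably wrap the signing-client connect call.
async function connectWithRpcFailover(endpoints, connect) {
  const errors = [];
  for (const url of endpoints) {
    try {
      const client = await connect(url);
      return { url, client }; // first endpoint that connects wins
    } catch (err) {
      errors.push(`${url}: ${err && err.message ? err.message : String(err)}`);
    }
  }
  throw new Error(`All RPC endpoints failed:\n${errors.join('\n')}`);
}
```

Failover of this shape only reacts to connection errors. A node that connects cleanly but serves stale state is returned like any other, which is why the ordering of the audited list matters.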

Smoke-tested: 12/13 endpoints healthy, deltas within 1 block. Mirrors the
parallel SDK PR Sentinel-Bluebuilder/blue-js-sdk#28.
@Sentinel-Bluebuilder Sentinel-Bluebuilder merged commit fbcf44c into master May 2, 2026
2 checks passed
@Sentinel-Bluebuilder Sentinel-Bluebuilder deleted the feat/refresh-rpc-endpoints-2026-04-30 branch May 2, 2026 20:23
Sentinel-Bluebuilder added a commit that referenced this pull request May 2, 2026
…-02) (#29)

The 2026-05-02 follow-up to PR #28 surfaced two issues with the audit:

1. The previous script hardcoded EXPECTED_UDVPN as a constant. The audited
   wallet had a small outflow between PR authoring and merge, so re-running
   the script declared 0/22 healthy when in fact 12 endpoints all agreed on
   the new (correct) balance and only rpc.sentinel.co was actually broken.

2. defaults.js still listed rpc.sentinel.co / lcd.sentinel.co as a "stale
   fallback" entry. Today's audit confirms both are still ~22k blocks behind
   tip and return 0 for funded addresses while reporting catching_up=false.
   Keeping them in the array meant a tryWithFallback() call could still hit
   them after every other endpoint failed and silently return wrong data.

Changes:

- tools/audit-rpc-endpoints.mjs — replace hardcoded EXPECTED_UDVPN with a
  consensus check (modal balance across responding candidates, within 50
  blocks of tip). Survives any future outflow from the audit address. Also
  audits LCD endpoints in the same run; the previous script only covered RPC.
- defaults.js — drop rpc.sentinel.co and lcd.sentinel.co. Re-sort RPC list
  by today's measured latency. Expand LCD list 6 -> 9 entries (Roomit,
  ChainTools, ChainVibes, Validatus all newly verified).
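
One way to read the consensus check is sketched below. It is an interpretation, not the script's code: the probe shape, the `consensusAudit` name, and the choice to take the highest reported height as "tip" are all assumptions.

```javascript
// Hypothetical probe shape: { url, height, balance } where balance is a string.
function consensusAudit(probes, maxLag = 50) {
  if (probes.length === 0) return { tip: null, modalBalance: null, healthy: [] };

  // Treat the highest reported height as the chain tip.
  const tip = Math.max(...probes.map((p) => p.height));
  const nearTip = probes.filter((p) => tip - p.height <= maxLag);

  // Count how many near-tip endpoints report each balance.
  const counts = new Map();
  for (const p of nearTip) counts.set(p.balance, (counts.get(p.balance) || 0) + 1);

  // Modal balance; on a tie the first balance seen wins.
  let modalBalance = null;
  let best = 0;
  for (const [balance, n] of counts) {
    if (n > best) {
      best = n;
      modalBalance = balance;
    }
  }

  const healthy = nearTip.filter((p) => p.balance === modalBalance).map((p) => p.url);
  return { tip, modalBalance, healthy };
}

const sample = [
  { url: 'a', height: 1000, balance: '90' },
  { url: 'b', height: 998, balance: '90' },
  { url: 'c', height: 1000, balance: '0' }, // near tip but disagrees
  { url: 'd', height: 900, balance: '90' }, // agrees but too far behind
];
console.log(consensusAudit(sample).healthy); // [ 'a', 'b' ]
```

Because the expected value is derived from the responders themselves, an outflow from the audit address changes the mode for every healthy endpoint at once and no longer fails the whole run.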

Audit results (2026-05-02, consensus mode):
- RPC: 12/22 healthy, 12/13 responding agree on balance
- LCD: 9/12 healthy, 9/10 responding agree on balance

Run: node tools/audit-rpc-endpoints.mjs

Test plan
- node -e "import('./defaults.js').then(d => console.log(d.DEFAULT_RPC, d.DEFAULT_LCD))" -> Busurnode for both
- Audit script exits 0 with both RPC and LCD tier-1 lists populated

Co-authored-by: Sentinel-Autonomybuilder <[email protected]>
@Sentinel-Bluebuilder Sentinel-Bluebuilder mentioned this pull request May 2, 2026
Sentinel-Bluebuilder added a commit that referenced this pull request May 2, 2026
Patch release shipping the consensus-audited RPC + LCD endpoint defaults
from #28 and #29 to npm consumers.

- package.json: 2.7.0 -> 2.7.1
- defaults.js: SDK_VERSION 2.4.0 -> 2.7.1 (was lagging package.json)

No API surface changes. Defaults-only update -- consumers calling
createRpcQueryClientWithFallback() / SentinelClient with no overrides will
now skip the stale rpc.sentinel.co / lcd.sentinel.co endpoints that were
returning balance=0 for funded addresses.

Co-authored-by: Sentinel-Autonomybuilder <[email protected]>