feat: socket heartbeat lifecycle tracking, stale connection cleanup, and reconnect contract #115
Open
davedumto wants to merge 3 commits into TevaLabs:main from
… reconnect contract (TevaLabs#95)
- Export PING_INTERVAL (25s) and PING_TIMEOUT (10s) as named constants; these are now passed directly to the Socket.IO server constructor so the transport layer enforces the same values the application advertises
- Add ConnectionRecord interface and connectionRegistry Map (socketId → record) to track every live connection with connectedAt and lastSeenAt timestamps; exported so tests and monitoring can inspect state directly
- Add checkStaleConnections(io, thresholdMs) utility that scans the registry and force-disconnects sockets idle beyond the threshold; orphan entries (socket already gone, disconnect event never fired) are silently deleted
- Start a 30s periodic stale-check interval inside initializeSocket with .unref() so it does not block process exit
- Emit server:hello on every connection advertising pingInterval, pingTimeout, authenticated, and userId. This defines the reconnection contract: clients should reconnect if no ping arrives within pingInterval + pingTimeout ms; on reconnect the server treats the socket as fresh and rooms must be explicitly re-joined
- Update lastSeenAt via socket.onAny() (application events) and engine-level packet 'pong' events (heartbeat replies) for accurate idle tracking
- Remove from registry on disconnect to keep the map clean
- Add 9 new tests in "Heartbeat and reconnect (Issue TevaLabs#95)" suite: server:hello shape for unauth/auth, registry populate on connect, registry cleanup on disconnect, lastSeenAt update on event, stale detection and force-disconnect, phantom entry cleanup, room rejoin required after reconnect, rapid connect/disconnect registry integrity
- All 26 socket tests pass with zero regressions
…or in auth
scheduler.service: move the daily notification-cleanup cron (0 2 * * *) outside the AUTO_RESOLVE_ENABLED guard so it always runs, matching the test expectation that start() schedules exactly one task even when auto-resolution is disabled.
auth.routes: split the previously merged !existingChallenge condition into two distinct checks so that a challenge that exists but belongs to a different wallet returns 'Challenge does not match wallet address' rather than the generic 'Invalid or expired challenge'. Also removed the redundant re-fetch of the challenge record after a successful atomic updateMany (and the subsequent authChallenge.update linkage call): neither the schema nor the tests require that write, and it was causing the connect happy-path test to fail because the prisma mock has no update method on authChallenge.
…tions
- auth.routes: replace updateMany-first challenge consumption with findUnique-first lookup; add explicit isUsed/expired checks; restore authChallenge.update to mark challenge used and link userId after successful authentication
- validate.middleware: return error: 'Validation Error' as the stable error key; the Zod message moves to the message field
- auth.schema: use Zod v4 error param (replaces v3 required_error / invalid_type_error) so missing fields produce the expected message rather than the generic 'Expected string, received undefined'
Closes #95
Context
Socket.IO was connecting and authenticating clients but had no visibility into connection health. There was no application-level tracking of when a socket was last active, no stale connection cleanup, and no defined contract for how clients should handle reconnection. A long-lived silent connection would consume server resources without ever being reclaimed.
What Changed
src/socket.ts
Heartbeat constants (exported)
PING_INTERVAL (25 s) and PING_TIMEOUT (10 s) are exported as named constants. Both values are passed directly to the Socket.IO constructor so the transport layer enforces the same contract the application advertises.
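As a rough illustration of the constants described above, the sketch below builds the options object that would be handed to the Socket.IO server. The values come from this PR's description; the name heartbeatOptions and the commented-out wiring (including httpServer) are assumptions made for the example, not code from the diff.

```typescript
// Values as stated in the PR description (exported from src/socket.ts there).
const PING_INTERVAL = 25000; // ms between server-initiated pings
const PING_TIMEOUT = 10000; // ms the server waits for a pong before closing

// pingInterval and pingTimeout are standard Socket.IO server options.
// `heartbeatOptions` is a hypothetical name used only in this sketch.
const heartbeatOptions = {
  pingInterval: PING_INTERVAL,
  pingTimeout: PING_TIMEOUT,
};

// Assumed wiring, not executed here (`httpServer` is hypothetical):
//   import { Server } from "socket.io";
//   const io = new Server(httpServer, heartbeatOptions);
```

Sharing one pair of constants between the constructor and the advertised payload avoids the two drifting apart.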
ConnectionRecord & connectionRegistry
ConnectionRecord tracks userId, walletAddress, connectedAt, and lastSeenAt for every live socket. connectionRegistry: Map<socketId, ConnectionRecord> is exported so external monitoring and tests can inspect state without going through Socket.IO internals.
lastSeenAt is refreshed on two axes:
- socket.onAny(): fires on every incoming application-level event
- socket.conn.on('packet', …): fires on engine-level pong responses (heartbeat replies)

checkStaleConnections(io, staleThresholdMs?)
- Scans the registry and force-disconnects sockets idle longer than staleThresholdMs (default: PING_INTERVAL + PING_TIMEOUT + 5 s ≈ 40 s).
- Stale sockets are closed with socket.disconnect(true); the disconnect event fires and cleans up the registry entry automatically.
- Orphan entries (socket already gone, disconnect never fired) are deleted from the registry without throwing.
- Runs every 30 s from initializeSocket; the interval is .unref()'d so it does not block process exit in tests or graceful shutdown.

server:hello event: reconnection contract
Emitted to every socket immediately after connection:
    {
      "socketId": "...",
      "pingInterval": 25000,
      "pingTimeout": 10000,
      "authenticated": true,
      "userId": "..."
    }
This defines the client-side reconnection contract:
- Clients should reconnect if no ping arrives within pingInterval + pingTimeout ms.
- On reconnect the server treats the socket as fresh, and rooms must be explicitly re-joined.
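A client might act on that contract roughly as sketched below. The ServerHello shape mirrors the payload shown above; the helper names (reconnectDeadlineMs, shouldReconnect, rejoinRooms) and the join:round payload shape are assumptions for illustration, since the PR only states that join:round must be called again after a reconnect.

```typescript
// Shape of the server:hello payload as shown in the PR description.
interface ServerHello {
  socketId: string;
  pingInterval: number;
  pingTimeout: number;
  authenticated: boolean;
  userId?: string; // absent for unauthenticated sockets
}

// Hypothetical helper: how long the client may go without a ping
// before it should treat the connection as dead.
function reconnectDeadlineMs(hello: ServerHello): number {
  return hello.pingInterval + hello.pingTimeout;
}

// Hypothetical helper: true once the silence exceeds the advertised window.
function shouldReconnect(hello: ServerHello, lastPingAt: number, now: number): boolean {
  return now - lastPingAt > reconnectDeadlineMs(hello);
}

// The server treats a reconnected socket as fresh, so the client replays
// its room joins. The `{ roundId }` payload is an assumed shape.
function rejoinRooms(
  emit: (event: string, payload: { roundId: string }) => void,
  roundIds: readonly string[],
): number {
  let sent = 0;
  for (const roundId of roundIds) {
    emit("join:round", { roundId });
    sent += 1;
  }
  return sent;
}
```

In a real client, emit would typically be bound to the socket's own emit method and lastPingAt updated from the client library's ping notification; both are left out here to keep the sketch free of dependencies.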
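Disconnect cleanup
The disconnect handler now deletes the socket's entry from connectionRegistry, keeping the map accurate at all times.

src/tests/socket.spec.ts
9 new tests added in a "Heartbeat and reconnect (Issue #95)" describe block. All 17 existing tests continue to pass: 26 tests total, all green.
- server:hello shape, unauthenticated: userId absent
- server:hello shape, authenticated: userId and authenticated: true present
- Registry populated on connect: record carries userId and timestamps
- Registry cleanup on disconnect: entry removed on the disconnect event
- lastSeenAt updated on event
- Stale detection and force-disconnect: setting lastSeenAt = 0 then calling checkStaleConnections triggers client disconnect
- Phantom entry cleanup
- Room rejoin required after reconnect: join:round must be called again
- Rapid connect/disconnect registry integrity

Test Run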
Definition of Done
server:hello) and testable
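To show why the stale check is testable in isolation, here is a minimal sketch of the registry and checkStaleConnections behaviour described under What Changed. It is not the PR's code: the field types, the SocketServerLike stand-in (modelled on the io.sockets.sockets map in Socket.IO v4), the extra now parameter, and the numeric return value are all assumptions made for the example.

```typescript
// Fields named in the PR description; the types are assumptions.
interface ConnectionRecord {
  userId?: string;
  walletAddress?: string;
  connectedAt: number; // epoch ms
  lastSeenAt: number; // epoch ms
}

// Minimal structural stand-ins for the parts of Socket.IO used here.
interface DisconnectableSocket {
  disconnect(close?: boolean): unknown;
}
interface SocketServerLike {
  sockets: { sockets: Map<string, DisconnectableSocket> };
}

const PING_INTERVAL = 25000;
const PING_TIMEOUT = 10000;
// Default threshold from the description: interval + timeout + 5 s.
const DEFAULT_STALE_THRESHOLD_MS = PING_INTERVAL + PING_TIMEOUT + 5000;

const connectionRegistry = new Map<string, ConnectionRecord>();

// Returns the number of live sockets that were force-disconnected
// (the return value and the `now` parameter are additions for testing).
function checkStaleConnections(
  io: SocketServerLike,
  staleThresholdMs: number = DEFAULT_STALE_THRESHOLD_MS,
  now: number = Date.now(),
): number {
  let disconnected = 0;
  connectionRegistry.forEach((record, socketId) => {
    if (now - record.lastSeenAt <= staleThresholdMs) return; // still fresh
    const socket = io.sockets.sockets.get(socketId);
    if (socket) {
      // The disconnect handler is expected to remove the registry entry.
      socket.disconnect(true);
      disconnected += 1;
    } else {
      // Orphan: the socket is gone but disconnect never fired.
      connectionRegistry.delete(socketId);
    }
  });
  return disconnected;
}
```

A test can pass a fake io object whose disconnect method records the call, set lastSeenAt to 0 on one record, and assert that only that socket is closed while orphan entries disappear without an exception.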