[Bug] errcode -14 session expiry enters 60-min death loop — outbound messages blocked, no auto-recovery #155

@Ccccc-del

Describe the bug

When the WeChat iLink server returns errcode: -14 (session expired) after extended inactivity, the plugin enters a 60-minute pause loop that it cannot recover from on its own:

  1. monitor.ts:114-126: pauseSession(accountId) → 60-minute global lock
  2. session-guard.ts:43-50: assertSessionActive() blocks ALL outbound traffic (including cron/message-tool sends), even though outbound REST APIs (sendMessage, sendTyping) are independent of the long-poll session
  3. After 60 minutes, the monitor retries getUpdates without calling notifyStart first → gets -14 again → another 60-minute pause → infinite loop

The only escape is the OpenClaw core channelHealthMonitor (default: 5-min check, 30-min stale-socket threshold → restart channel → notifyStart). That is a fallback rather than a proper fix: messages can still be lost for up to 30 minutes.

Steps to reproduce

  1. Let the WeChat channel sit idle for several hours (overnight)
  2. Have a cron job send a message via the channel
  3. Observe: cron executes successfully but message never arrives in WeChat
  4. Check logs: getUpdates: session expired (errcode=-14), pausing all requests for 60 min

Expected behavior

  • -14 should trigger notifyStart to rebuild the server session, then retry getUpdates within seconds
  • Outbound REST API calls (sendMessage, sendTyping) should not be blocked by the long-poll session state
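
To illustrate the second point, here is a minimal sketch of a guard that only blocks long-poll requests. It assumes, as described above, that outbound REST calls do not depend on the long-poll session. The RequestKind type, the extra kind parameter, and the explicit pauseMs/now arguments are hypothetical and are not taken from the plugin's actual session-guard.ts.

```typescript
// Hypothetical sketch: RequestKind and the per-kind check are assumptions,
// not the plugin's real session-guard.ts. Only the names pauseSession,
// assertSessionActive and pauseUntilMap come from this issue.
type RequestKind = "poll" | "outbound";

// accountId -> epoch milliseconds until which long-polling is paused
const pauseUntilMap = new Map<string, number>();

function pauseSession(
  accountId: string,
  pauseMs: number,
  now: number = Date.now(),
): void {
  pauseUntilMap.set(accountId, now + pauseMs);
}

function assertSessionActive(
  accountId: string,
  kind: RequestKind,
  now: number = Date.now(),
): void {
  // Outbound sends (sendMessage, sendTyping) pass through regardless of pause state.
  if (kind === "outbound") return;

  const until = pauseUntilMap.get(accountId);
  if (until !== undefined && now < until) {
    throw new Error(
      `long-poll session paused for ${accountId} until ${new Date(until).toISOString()}`,
    );
  }
}
```

With a guard shaped like this, a cron-triggered send would call assertSessionActive(accountId, "outbound") and proceed even while getUpdates is paused. Whether the server actually accepts outbound calls after a -14 would still need to be verified.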

Proposed fix

Two changes:

1. src/api/session-guard.ts — add clearSessionPause(accountId):

export function clearSessionPause(accountId: string): void {
  pauseUntilMap.delete(accountId);
}

2. src/monitor/monitor.ts — fix the -14 branch:

// Before
if (isSessionExpired) {
  pauseSession(accountId);
  await sleep(pauseMs, abortSignal);
  continue;
}

// After
if (isSessionExpired) {
  pauseSession(accountId);
  try {
    await notifyStart({ baseUrl, token, timeoutMs: 10_000 });
    clearSessionPause(accountId);
    log('session recovered via notifyStart');
  } catch (err) {
    errLog(`notifyStart failed during recovery: ${String(err)}`);
    // Fall back to a short retry; the 60-min pause is still the last resort
  }
  await sleep(5_000, abortSignal);
  continue;
}
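
If a single notifyStart attempt turns out to be too fragile, one possible variation is to retry it a few times with a growing delay before giving up. The sketch below is only an illustration: recoverSession, its options, and the backoff values are hypothetical, and notifyStart and sleep are injected so the example runs on its own rather than depending on the plugin's real signatures.

```typescript
// Hypothetical helper: recoverSession, RecoveryOptions and the backoff values
// are illustrative assumptions, not part of the plugin.
interface RecoveryOptions {
  notifyStart: () => Promise<void>; // injected; the real call takes baseUrl/token
  sleep: (ms: number) => Promise<void>; // injected so callers can pass an abortable sleep
  maxAttempts?: number;
  baseDelayMs?: number;
}

async function recoverSession(opts: RecoveryOptions): Promise<boolean> {
  const maxAttempts = opts.maxAttempts ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 5_000;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await opts.notifyStart();
      return true; // caller can now clearSessionPause(accountId) and resume polling
    } catch {
      // Wait before the next attempt, doubling the delay each time.
      if (attempt < maxAttempts) {
        await opts.sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  return false; // caller keeps the long pause as the last resort
}
```

In the -14 branch this would replace the single try/catch: on true, clear the pause and continue; on false, fall back to the existing sleep(pauseMs, abortSignal).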

Environment

  • Plugin: @tencent-weixin/openclaw-weixin v2.4.1
  • OpenClaw: 2026.5.7
  • OS: Windows 11
