Skip to content

fix: guard auth-profiles.json writes with gateway-compatible file lock (#758)#759

Open
octo-patch wants to merge 1 commit intoValueCell-ai:mainfrom
octo-patch:fix/issue-758-auth-profiles-file-lock
Open

fix: guard auth-profiles.json writes with gateway-compatible file lock (#758)#759
octo-patch wants to merge 1 commit intoValueCell-ai:mainfrom
octo-patch:fix/issue-758-auth-profiles-file-lock

Conversation

@octo-patch
Copy link
Copy Markdown
Contributor

Problem

Resolves #758.

When a user saves a second (or later) provider, the API key intermittently disappears and an error toast appears.

Root cause: The OpenClaw gateway guards every write to auth-profiles.json with a .lock sidecar file via withFileLock() / acquireFileLock() (in node_modules/openclaw/dist/file-lock-B4wypLkV.js). Inside that lock it does a fresh READ → MODIFY → WRITE. ClawX's writeAuthProfiles() was doing a plain fs.writeFile() without acquiring the same lock. This created a race:

  1. Gateway holds the lock, reads a stale copy of auth-profiles (before ClawX's write)
  2. ClawX writes the new provider key (no lock)
  3. Gateway releases the lock and writes its stale copy → new key overwritten

The first provider always works because the gateway is not yet running when it is saved. Subsequent providers race with a live gateway process.

Solution

Add withAuthProfilesLock(agentId, fn) that implements the same .lock sidecar file protocol as the gateway:

Detail Value
Lock path auth-profiles.json.lock
Acquire fs.open(lockPath, 'wx') — atomic, throws EEXIST on collision
Lock body JSON { pid, createdAt }
Stale detection PID not alive or lock older than 30 s
Release fs.rm(lockPath, { force: true }) in finally

All four functions that perform a read-modify-write on auth-profiles — saveProviderKeyToOpenClaw, saveOAuthTokenToOpenClaw, removeProviderKeyFromOpenClaw, removeProviderFromOpenClaw — now wrap their per-agent loop body with this lock.

Tests

Three new tests in tests/unit/openclaw-auth.test.ts:

  • Lock cleanup: .lock sidecar file is removed after a successful write
  • Stale lock recovery: a lock file referencing a dead PID is auto-cleared and the operation proceeds
  • Concurrent serialisation: two simultaneous saveProviderKeyToOpenClaw calls on the same agent both commit their keys (neither overwrites the other)

ValueCell-ai#758)

ClawX's writeAuthProfiles() was doing a plain fs.writeFile() while the
OpenClaw gateway uses a .lock sidecar file (acquireFileLock / withFileLock
from file-lock-B4wypLkV.js) for exclusive access to the same file.  This
created a race where the gateway's locked READ-MODIFY-WRITE could overwrite
a provider key that ClawX had just saved (or vice-versa), causing the second
and subsequent providers to silently disappear from auth-profiles.json.

Add withAuthProfilesLock() that mirrors the gateway's protocol:
- Lock path: auth-profiles.json.lock (sidecar file)
- Acquire: fs.open(lockPath, 'wx') -- atomic, throws EEXIST on collision
- Lock body: JSON { pid, createdAt } for stale detection
- Stale: PID no longer alive OR lock older than 30 s -> remove and retry
- Release: fs.rm(lockPath, { force: true }) in a finally block

Wrap the read-modify-write loops in saveProviderKeyToOpenClaw,
saveOAuthTokenToOpenClaw, removeProviderKeyFromOpenClaw, and
removeProviderFromOpenClaw with this lock so every Electron-side
auth-profiles write participates in the same mutual-exclusion protocol
as the gateway.

Add three tests: lock sidecar is cleaned up on success, stale lock
(dead PID) is auto-cleared, and concurrent saves within the same process
are serialised so neither key is lost.
Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96dae9f3e0

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +266 to +267
} catch {
return true; // unreadable / malformed lock → treat as stale
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Don't delete lock files on transient parse/read failures

When open(lockPath, 'wx') succeeds, the lock file exists before its JSON body is fully written; a concurrent writer that gets EEXIST can read an empty/partial file, hit this catch, mark it stale, and rm a still-live lock. On POSIX this unlink can succeed even while the first process still holds the fd, allowing both processes into the critical section and reintroducing lost updates to auth-profiles.json under contention.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: 配置自定义模型提供商时,\.openclaw\agents\main\agent文件夹下面的auth-profiles没有同步添加信息

1 participant