Commit b3faeff

JohnMcLear and claude authored
fix(tests): retry rmdir to clear Windows EBUSY flake in updater-integration (#7728)
* fix(tests): retry rmdir to clear Windows EBUSY flake in updater-integration

The Windows backend-test job has been intermittently red on `crash-loop guard: bootCount=3 forces immediate rollback` (and other cases in `updater-integration.ts`) with:

    Error: EBUSY: resource busy or locked, rmdir 'C:\Users\RUNNER~1\AppData\Local\Temp\updater-it-...'

Each `it()` builds a temp git repo via `execSync('git ...')` and cleans up in a `try…finally` with `fs.rm(dir, {recursive: true, force: true})`. On Windows, git child processes can briefly hold file handles after exit (NTFS lazy-release / antivirus scan / pack-file handles), so the first rmdir attempt hits EBUSY. `fs.rm`'s default `maxRetries` is 0, so there is no recovery and the test errors out.

Hoist the cleanup to a single `cleanupTmp()` helper that passes `maxRetries: 10, retryDelay: 100` (a built-in `fs.rm` capability since Node 14.14). On Linux/macOS this is a no-op: there's nothing to retry. On Windows it absorbs the transient lock.

No production code touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): poll for rollback terminal state instead of 250ms sleep

Windows CI failure on this branch surfaced a *second* flake in the same file (`crash-loop guard: bootCount=3 forces immediate rollback`):

    TypeError: Cannot read properties of undefined (reading 'execution')
        at updater-integration.ts:230

`checkPendingVerification` kicks off `performRollback` as fire-and-forget (`void performRollback(s, deps).catch(...)`), and the test waits a flat 250 ms before asserting `states.at(-1)!.execution.status === 'rolled-back'`. On Linux 250 ms is plenty. On Windows, git checkout + spawned-process bookkeeping regularly push past that, so `saveState` hasn't fired yet and `states` is empty.

This race was previously masked: the test's `finally` ran `fs.rm`, which threw EBUSY against handles still held by the in-flight rollback, and JS's "finally-throws-override-try-throws" semantics meant mocha reported the EBUSY rather than the underlying TypeError. The retry-rm patch on this branch unmasked it.

Replace the flat sleep with condition-based polling (25 ms tick, 10 s ceiling) for a terminal state (`rolled-back` | `rollback-failed`). The existing `assert.equal(... 'rolled-back')` still runs, giving a clean diff if rollback landed on the failure side instead.

Linux runtime drops 329 ms → 104 ms because the poll exits as soon as the state lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
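The masking effect described in the message can be reproduced in isolation. The following is a minimal sketch, not code from this commit: `failingAssertion`, `failingCleanup`, and `errorThatEscapes` are hypothetical names standing in for the test body, the `fs.rm` cleanup, and the test runner's view of the failure.

```typescript
// Hypothetical stand-in for the assertion that failed inside the try block.
const failingAssertion = (): never => {
  throw new TypeError("Cannot read properties of undefined (reading 'execution')");
};

// Hypothetical stand-in for the cleanup that failed inside the finally block.
const failingCleanup = (): never => {
  throw new Error('EBUSY: resource busy or locked, rmdir <tmp dir>');
};

// Runs body, then cleanup in a finally, and returns whichever error escapes.
const errorThatEscapes = (body: () => void, cleanup: () => void): unknown => {
  try {
    try {
      body();
    } finally {
      // A throw here replaces any error already propagating from body().
      cleanup();
    }
  } catch (err) {
    return err;
  }
  return undefined;
};

// Cleanup throws: only the EBUSY error is visible, the TypeError is lost.
const masked = errorThatEscapes(failingAssertion, failingCleanup);

// Cleanup succeeds: the original TypeError surfaces.
const unmasked = errorThatEscapes(failingAssertion, () => {});

console.log((masked as Error).message);
console.log((unmasked as Error).name);
```

Once the cleanup stops throwing, the error from the `try` body is what the runner reports, which is consistent with the second flake only becoming visible after the retry patch.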
1 parent c7100e0 commit b3faeff

1 file changed

Lines changed: 22 additions & 7 deletions


src/tests/backend/specs/updater-integration.ts

@@ -12,6 +12,12 @@ import {EMPTY_STATE, UpdateState} from '../../../node/updater/types';
 const sh = (cmd: string, opts: any = {}) =>
   execSync(cmd, {stdio: 'pipe', ...opts}).toString().trim();
 
+// On Windows, git's child processes can briefly hold file handles after exit
+// (NTFS lazy-release / antivirus / pack files), so an immediate rmdir on the
+// temp repo hits EBUSY. fs.rm's built-in retry clears the flake.
+const cleanupTmp = (dir: string) =>
+  fs.rm(dir, {recursive: true, force: true, maxRetries: 10, retryDelay: 100});
+
 const buildTmpRepo = async (): Promise<{dir: string; v1Sha: string; v2Sha: string}> => {
   const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'updater-it-'));
   sh('git init -b main', {cwd: dir});
@@ -94,7 +100,7 @@ describe(__filename, function () {
       // The fromSha recorded in state matches the v0.0.1 SHA.
       assert.equal((states.at(-1)!.execution as {fromSha: string}).fromSha, v1Sha);
     } finally {
-      await fs.rm(dir, {recursive: true, force: true});
+      await cleanupTmp(dir);
     }
   });
 
@@ -140,7 +146,7 @@ describe(__filename, function () {
       const lock = await fs.readFile(path.join(dir, 'pnpm-lock.yaml'), 'utf8');
       assert.match(lock, /lockfileVersion: x/);
     } finally {
-      await fs.rm(dir, {recursive: true, force: true});
+      await cleanupTmp(dir);
     }
   });
 
@@ -182,7 +188,7 @@ describe(__filename, function () {
       assert.equal(states.at(-1)!.execution.status, 'rolled-back');
       assert.equal(sh('git rev-parse HEAD', {cwd: dir}), v1Sha);
     } finally {
-      await fs.rm(dir, {recursive: true, force: true});
+      await cleanupTmp(dir);
     }
   });
 
@@ -219,13 +225,22 @@ describe(__filename, function () {
         rollbackHealthCheckSeconds: 60,
       });
       assert.equal(r.armed, false);
-      // Wait for the fire-and-forget rollback to finish.
-      await new Promise((resolve) => setTimeout(resolve, 250));
+      // Poll for the fire-and-forget rollback to land in its terminal state.
+      // A flat sleep here was racy on Windows (git checkout + spawned-process
+      // bookkeeping can push past several hundred ms).
+      const deadline = Date.now() + 10_000;
+      while (
+        states.at(-1)?.execution.status !== 'rolled-back' &&
+        states.at(-1)?.execution.status !== 'rollback-failed' &&
+        Date.now() < deadline
+      ) {
+        await new Promise((resolve) => setTimeout(resolve, 25));
+      }
       assert.equal(states.at(-1)!.execution.status, 'rolled-back');
       assert.equal(sh('git rev-parse HEAD', {cwd: dir}), v1Sha);
       assert.equal(exitedWith, 75);
     } finally {
-      await fs.rm(dir, {recursive: true, force: true});
+      await cleanupTmp(dir);
     }
   });
 
@@ -259,7 +274,7 @@ describe(__filename, function () {
       assert.equal(states.at(-1)!.lastResult!.outcome, 'rollback-failed');
       assert.equal(exitedWith, 75);
     } finally {
-      await fs.rm(dir, {recursive: true, force: true});
+      await cleanupTmp(dir);
     }
   });
 });
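The inline polling loop in the crash-loop test could also be factored into a reusable helper if other tests need the same wait. This is only a sketch under that assumption: `waitFor`, `sleep`, `isTerminal`, and the `Status` type are hypothetical names that do not appear in the commit, and the `states` array here is a local stand-in for the one the test collects via `saveState`.

```typescript
// Hypothetical helper, not part of the commit: resolve after ms milliseconds.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: poll done() every tickMs until it returns true or the
// deadline passes. Resolves to the final value of done(), so a timeout yields false.
const waitFor = async (
  done: () => boolean,
  {timeoutMs = 10_000, tickMs = 25}: {timeoutMs?: number; tickMs?: number} = {},
): Promise<boolean> => {
  const deadline = Date.now() + timeoutMs;
  while (!done() && Date.now() < deadline) {
    await sleep(tickMs);
  }
  return done();
};

// Local stand-in for the states recorded by the test's saveState stub.
type Status = 'rolling-back' | 'rolled-back' | 'rollback-failed';
const states: {execution: {status: Status}}[] = [];

// Simulate a fire-and-forget rollback that lands after a short delay.
setTimeout(() => states.push({execution: {status: 'rolled-back'}}), 60);

// True once the most recent state is one of the two terminal statuses.
const isTerminal = (): boolean => {
  const status = states[states.length - 1]?.execution.status;
  return status === 'rolled-back' || status === 'rollback-failed';
};
```

A test would then `await waitFor(isTerminal)` and keep the existing `assert.equal(... 'rolled-back')` afterwards, so a timeout or a `rollback-failed` outcome still fails with a readable diff rather than being swallowed by the wait.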
