Skip to content

fix(audit batch 3): 3 P1 from post-merge repository audit (SIG-A1, SOCK-01, TST-07)#534

Merged
rsalcara merged 1 commit into
developfrom
fix/audit-batch-3-p1
Jun 10, 2026
Merged

fix(audit batch 3): 3 P1 from post-merge repository audit (SIG-A1, SOCK-01, TST-07)#534
rsalcara merged 1 commit into
developfrom
fix/audit-batch-3-p1

Conversation

@rsalcara

@rsalcara rsalcara commented Jun 10, 2026

Copy link
Copy Markdown
Owner

Summary

Three P1 bugs surfaced by the 8-agent re-audit of `develop @ 1e8cd36` (after #529 #532 #533 landed). Each verified against the actual code + spec before fixing.

SIG-A1 — `extractIdentityFromPkmsg` reads the wrong protobuf field

libsignal.ts:222 was checking `fieldNumber === 5` for the identity key. Per libsignal spec (WhisperTextProtocol.proto):

```
field 1: preKeyId (uint32, varint, wireType 0)
field 2: baseKey (bytes 33, wireType 2)
field 3: identityKey (bytes 33, wireType 2) ← what we wanted
field 4: message (bytes, wireType 2)
field 5: registrationId (uint32, varint, wireType 0)
field 6: signedPreKeyId (uint32, varint, wireType 0)
```

Field 5 is registrationId — a varint, not a 33-byte key. Combined with the wire-type filter (`wireType === 2`) the function NEVER matched anything and ALWAYS returned `undefined`. Direct consequence: identity-change detection in `decryptMessage`, the `identity.changed` event, and reinstall-triggered session cleanup were all dead code in production.

Fix: `fieldNumber === 3` + rewrite the docblock with exact spec field assignments.

SOCK-01 — `query()` returns `undefined` on timeout

`waitForMessage` swallows the Boom timedOut and returns `undefined`. `query()` re-exported that contract, leaving callers unable to distinguish timeout from a legit empty response. Concrete bugs already in the codebase:

Caller Broken behaviour
`uploadPreKeys` Logs "uploaded successfully" and resets `lastUploadTime` on timeout — silently exhausting server-side pre-keys
`appPatch` Bumps `app-state-sync-version` on timeout — drifts LTHash from server
`fetchPrivacySettings` Destructures `undefined` → opaque TypeError far from source
`assertSessions` Sets `didFetchNewSession = true` without injecting a single session

`query()` now converts the `undefined` sentinel into a typed Boom (`statusCode: DisconnectReason.timedOut`, `data: {msgId, timeoutMs}`). `waitForMessage` keeps its current shape — only `query()`'s contract tightens, which is the entrypoint the audit consumers use.

TST-07 — H4 `it.failing` tests are dead trackers

`signal.delete-migrate-race.test.ts:62,94` exercise a local `addTransactionCapability` mock — NOT the real `transactWith({ records })` path the H4 fix uses (libsignal.ts:777). Same pattern as H8/H9 placeholders we already demoted. Demoted to `it.skip` with comment; the H4 fix remains live.

SC-01 — yarn.lock checksums (NOT addressed in this PR)

Audit flagged 45 entries (`@esbuild/` + `@unrs/resolver-binding-`) without `checksum:` lines. These are optional native deps for OS/arch combinations yarn can't fetch on this Windows machine. Fix is to run `yarn install` on a Linux runner / CI so the lockfile gets the missing checksums and `--immutable` installs verify integrity properly. Tracked as CI-side follow-up.

Test plan

  • `tsc --noEmit -p tsconfig.json` — exit 0
  • `eslint` on all touched files — exit 0
  • Touched test suites pass; lone failing suite (`signal-typed-backend`) is pre-existing `better-sqlite3` binding issue on Windows
  • Smoke: reinstall WhatsApp on a paired device, send a message → verify `identity.changed` event fires (was silent before SIG-A1)
  • Smoke: trigger a server timeout (kill server mid-query) → verify caller sees Boom timedOut instead of an opaque TypeError

Customizations untouched

Carousel / list / button / poll / view-once / biz quality_control paths intact. SIG-A1 only changes a field number that was wrong; SOCK-01 tightens an existing contract without expanding it.

🤖 Generated with Claude Code


Summary by cubic

Fixes three P1 issues from the audit: restores identity-change detection by parsing the correct Signal field, surfaces query timeouts as typed errors, and skips two placeholder tests that don’t reflect production behavior.

  • Bug Fixes

    • SIG-A1: extractIdentityFromPkmsg now reads identityKey from field 3 (was 5). Restores identity-change detection and related cleanup paths.
    • SOCK-01: query() now throws a Boom timeout (statusCode: DisconnectReason.timedOut, data: { msgId, timeoutMs }``) instead of returning undefined.
    • TST-07: Demoted two H4 it.failing tests to it.skip with a note; they target a mock path, not the real transactWith({ records }).
  • Migration

    • If you relied on query() returning undefined for timeouts, catch the Boom error (DisconnectReason.timedOut) instead.

Written for commit 15e97b0. Summary will update on new commits.

Review in cubic

8-agent re-audit of develop @ 1e8cd36 (after #529 #532 #533 landed)
surfaced these 3 critical bugs. Each verified against the actual code
and spec before fixing.

## P1 — SIG-A1 — `extractIdentityFromPkmsg` reads wrong protobuf field

libsignal.ts:222 was checking `fieldNumber === 5` for the identity key
in a PreKeySignalMessage. Per the libsignal spec
(WhisperTextProtocol.proto, `PreKeySignalMessage`):

  field 1: preKeyId       (uint32, varint, wireType 0)
  field 2: baseKey        (bytes 33, wireType 2)
  field 3: identityKey    (bytes 33, wireType 2)  ← what we wanted
  field 4: message        (bytes, wireType 2)
  field 5: registrationId (uint32, varint, wireType 0)
  field 6: signedPreKeyId (uint32, varint, wireType 0)

Field 5 is registrationId — a varint, not a 33-byte key. Combined
with the wire-type filter (`wireType === 2`) the function NEVER
matched anything and ALWAYS returned `undefined`. Direct consequence:
identity-change detection in `decryptMessage`, the `identity.changed`
event, and the reinstall-triggered session cleanup were all dead
code in production. Anyone reinstalling WhatsApp could decrypt with
the new identity key without us noticing.

Fix: change to `fieldNumber === 3` and rewrite the docblock with the
exact field assignments from the spec so this doesn't regress.

## P1 — SOCK-01 — `query()` returned `undefined` on timeout

socket.ts: `waitForMessage` swallows the Boom timedOut and returns
`undefined`. `query()` re-exported that contract to its callers, who
mostly couldn't distinguish a timeout from a legitimate empty
response. Concrete damage already in the codebase:

- `uploadPreKeys` logged "uploaded pre-keys successfully" and reset
  `lastUploadTime` on timeout — silently exhausting the server-side
  pre-key pool;
- `appPatch` bumped `app-state-sync-version` on timeout — drifting
  LTHash from the server's;
- `fetchPrivacySettings` destructured `undefined` → opaque TypeError
  far from the source;
- `assertSessions` set `didFetchNewSession = true` without injecting
  a single session.

`query()` now converts the `undefined` sentinel into a typed Boom
(`statusCode: DisconnectReason.timedOut`, `data: {msgId, timeoutMs}`).
Callers that catch this can retry; callers that don't surface a
clear failure instead of a downstream mystery.

`waitForMessage` keeps its current shape — only `query()`'s contract
tightens, which is the entrypoint the audit-flagged consumers use.

## P1 — TST-07 — H4 `it.failing` tests are dead trackers

`signal.delete-migrate-race.test.ts:62,94` were `it.failing` suites
authored before the H4 fix shipped (Stage 2, upstream WhiskeySockets#2572 — now at
`libsignal.ts:777` and friends). They exercise a local
`addTransactionCapability` mock with synthetic addresses, NOT the
real `transactWith({ records: [{type:'session', id}] })` path the
production fix uses. Like the H8/H9 placeholders we already
demoted, they keep "failing as expected" no matter how good the
production code gets, signalling a phantom open issue forever.

Demoted to `it.skip` with a comment explaining the gap. The H4 fix
remains live in production.

## SC-01 — yarn.lock checksums for 45 platform binaries

The audit flagged 45 entries (`@esbuild/*` + `@unrs/resolver-binding-*`)
without `checksum:` lines. Verified: `awk` count returns the missing
entries. These are optional native deps for OS/arch combinations that
yarn can't fetch on this Windows machine — so a local `yarn install`
won't populate them. The fix is to run `yarn install` on a Linux
runner (or via CI) so the lockfile gets the missing checksums and
`--immutable` installs verify integrity properly. Not addressed in
this PR; tracked as a CI-side follow-up.

## Validation

- `tsc --noEmit -p tsconfig.json` — exit 0
- `eslint` on all touched files — exit 0
- Touched test suites pass; the lone failing suite locally is
  `signal-typed-backend` (better-sqlite3 native bindings on Windows
  toolchain, pre-existing)

## Customizations untouched

Carousel / list / button / poll / view-once / biz quality_control
paths intact. SIG-A1 only changes the field number that was already
wrong; SOCK-01 hardens an existing contract without expanding it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 10, 2026 21:27
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a42327c-1863-40ee-9cc6-994e71b48fb1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/audit-batch-3-p1

Warning

Billing warning: we have not been able to collect payment for this subscription for more than 72 hours. Please update the payment method or pay any pending invoices in Billing to avoid service interruption.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown

Thanks for opening this pull request and contributing to the project!

The next step is for the maintainers to review your changes. If everything looks good, it will be approved and merged into the main branch.

In the meantime, anyone in the community is encouraged to test this pull request and provide feedback.

✅ How to confirm it works

If you’ve tested this PR, please comment below with:

Tested and working ✅

This helps us speed up the review and merge process.

📦 To test this PR locally:

# NPM
npm install @whiskeysockets/baileys@rsalcara/InfiniteAPI#fix/audit-batch-3-p1

# Yarn (v2+)
yarn add @whiskeysockets/baileys@rsalcara/InfiniteAPI#fix/audit-batch-3-p1

# PNPM
pnpm add @whiskeysockets/baileys@rsalcara/InfiniteAPI#fix/audit-batch-3-p1

If you encounter any issues or have feedback, feel free to comment as well.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 3 files

Re-trigger cubic

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Addresses three P1 findings from the post-merge audit by restoring Signal identity-key extraction, making socket query() timeouts observable as typed errors, and demoting two non-actionable placeholder tests.

Changes:

  • Fix Signal PreKeySignalMessage parsing by extracting identityKey from protobuf field 3 (not 5), restoring identity-change detection paths.
  • Tighten query() timeout behavior by throwing a typed Boom (instead of returning undefined) when waitForMessage times out.
  • Convert two it.failing tests to it.skip with an explanatory note since they don’t exercise the production locking path.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/Socket/socket.ts Convert query() timeout sentinel (undefined) into a typed Boom timeout error.
src/Signal/libsignal.ts Correct identity-key extraction to read protobuf field 3 and update the spec docblock.
src/tests/Utils/signal.delete-migrate-race.test.ts Demote two placeholder H4 tests from it.failing to it.skip with rationale.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/Signal/libsignal.ts
@rsalcara rsalcara merged commit 7a13259 into develop Jun 10, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants