Migrate to whatsapp-rust 0.5 stack#15
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
3c15391 to
7ab1701
Compare
…ates cargo update + dependency configuration changes required for the whatsapp-rust 0.5 series: - getrandom 0.3 → 0.4 (transitively required by rand 0.10) - rand 0.9 → 0.10, with explicit "sys_rng" feature so make_rng falls back to SysRng on wasm32 (no thread_rng under wasm) - wacore-binary: opt-out of "default-features" to drop the "simd" feature it now defaults to (we keep our own simd128 target-feature flag and don't need its portable_simd-based scalar fallback paths) - hashify 0.2.9 with "force-32bit" feature, declared as a direct dep so Cargo unifies features across the tree. Without this, hashify's PHF proc-macro picks the 64-bit "large" lookup module based on the host's pointer width, which generates code that does `key_hash as usize` (64-bit on host but 32-bit on wasm32) — silently truncating the hash and breaking lookups for some keys (e.g. tokens "receipt", "from", "id" all returned None on wasm32 while "message" worked). Forcing 32-bit mode keeps all u32 arithmetic and produces consistent lookups. Drop redundant `#[wasm_bindgen(skip)]` attributes on private cache fields of `InternalBinaryNode`. The macro already skips private fields, and wasm-bindgen 0.2.121's accounting for the `skip` attribute on already-skipped fields trips an `unused_variables` lint via the generated `let skip: ();` marker — clippy promotes that to an error under `-D warnings`.
7ab1701 to
ef7efaf
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/key_helper.rs (1)
70-75: 💤 Low value
rng()called twice ingenerate_signed_pre_key— consider reusing one instance.Each
rng()call invokesrand::make_rng::<StdRng>(), which seeds a fresh ChaCha12 instance fromSysRng(agetrandomsyscall /window.crypto.getRandomValuescall on WASM). Two separate OS-entropy seedings in one function body is unnecessary.♻️ Suggested refactor
- let pre_key_pair = CoreKeyPair::generate(&mut rng()); + let mut r = rng(); + let pre_key_pair = CoreKeyPair::generate(&mut r); let pre_key_public_bytes = pre_key_pair.public_key.serialize(); let signature = identity_private_key - .calculate_signature(&pre_key_public_bytes, &mut rng()) + .calculate_signature(&pre_key_public_bytes, &mut r) .map_err(map_err)?;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/key_helper.rs` around lines 70 - 75, The function generate_signed_pre_key currently calls rng() twice (once for CoreKeyPair::generate and once for identity_private_key.calculate_signature), causing two separate OS entropy seedings; change it to create a single RNG instance (e.g., let mut rng = rng()) at the start of the function and pass that same mutable rng reference to CoreKeyPair::generate and to identity_private_key.calculate_signature so pre_key_pair, pre_key_public_bytes, and signature all use the same RNG instance.benches/signal.ts (1)
240-252: ⚖️ Poor tradeoffDecrypt benchmark also includes encrypt cost in the measured region.
Each iteration of these "Decrypt WhisperMessage" benches performs a full
encrypt → decryptWhisperMessagecycle inside the timed body, so the reported numbers reflect encrypt+decrypt rather than decrypt alone. The relative comparison between Rust WASM andlibsignal-nodeis still fair (both pay the same overhead), but the bench label and any standalone reading of "decrypt cost" will be misleading.If you want a closer-to-pure decrypt measurement, pre-build a queue of ciphertexts in setup (sized to the iteration count) and only
decryptWhisperMessageinside the timed body — keeping in mind that re-decrypting the same ciphertext won't work because the ratchet steps forward.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benches/signal.ts` around lines 240 - 252, The benchmark titled "Decrypt WhisperMessage" measures both encryption and decryption because each iteration calls decWasm.alice.encrypt / decLib.alice.encrypt inside the timed bench before calling decWasm.bob.decryptWhisperMessage / decLib.bob.decryptWhisperMessage; change the setup so ciphertexts are pre-generated outside the measured section (e.g., in a before/prepare step or by building a queue of messages using decWasm.alice.encrypt and decLib.alice.encrypt sized to the bench iterations) and have the bench body only call decWasm.bob.decryptWhisperMessage or decLib.bob.decryptWhisperMessage (popping the next ciphertext from the queue), taking care not to reuse the same ciphertext for multiple decrypts because the ratchet advances.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@benches/signal.ts`:
- Around line 240-252: The benchmark titled "Decrypt WhisperMessage" measures
both encryption and decryption because each iteration calls
decWasm.alice.encrypt / decLib.alice.encrypt inside the timed bench before
calling decWasm.bob.decryptWhisperMessage / decLib.bob.decryptWhisperMessage;
change the setup so ciphertexts are pre-generated outside the measured section
(e.g., in a before/prepare step or by building a queue of messages using
decWasm.alice.encrypt and decLib.alice.encrypt sized to the bench iterations)
and have the bench body only call decWasm.bob.decryptWhisperMessage or
decLib.bob.decryptWhisperMessage (popping the next ciphertext from the queue),
taking care not to reuse the same ciphertext for multiple decrypts because the
ratchet advances.
In `@src/key_helper.rs`:
- Around line 70-75: The function generate_signed_pre_key currently calls rng()
twice (once for CoreKeyPair::generate and once for
identity_private_key.calculate_signature), causing two separate OS entropy
seedings; change it to create a single RNG instance (e.g., let mut rng = rng())
at the start of the function and pass that same mutable rng reference to
CoreKeyPair::generate and to identity_private_key.calculate_signature so
pre_key_pair, pre_key_public_bytes, and signature all use the same RNG instance.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cf65248a-f89c-494a-950b-76ba73dbb311
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (10)
Cargo.tomlbenches/signal.tssrc/binary.rssrc/curve.rssrc/group_cipher.rssrc/key_helper.rssrc/noise_session.rssrc/session_builder.rssrc/session_cipher.rssrc/storage_adapter.rs
In rand 0.10: - OsRng was removed from rand::rngs (now in getrandom as SysRng) - TryRngCore trait was renamed to TryRng (in rand_core) - The .unwrap_err() shorthand on OsRng is gone Replace `OsRng.unwrap_err()` with `rand::make_rng::<StdRng>()`, which is the idiom whatsapp-rust 0.5 itself uses. make_rng seeds a StdRng (ChaCha-based CryptoRng) from SysRng each call. Slightly more overhead than direct OsRng but functionally equivalent and consistent with upstream's signal session code. Update key_helper, curve, group_cipher, session_builder, and session_cipher to the new idiom.
… redesign wacore-libsignal 0.5 reshaped the session storage traits: - SessionStore gained a non-destructive has_session(&self) probe. Implemented by delegating to load_session — adequate for our cache, which uses Rc<RefCell> interior mutability so load doesn't mutate observable state. - store_session and store_sender_key now take `record: T` by value (was `&T`). Owned record means callers can't accidentally retain a reference after the store returns. We move the record straight into the cache (saves one CoreSessionRecord/CoreSenderKeyRecord clone per store — non-trivial since these carry chain keys + message keys). - load_sender_key now takes `&self` (was `&mut self`). Our cache lookup already used interior mutability, so the receiver change is a no-op.
wacore-binary 0.5 reshaped the borrowed node types for performance:
- Cow<'a, str> for tag/string fields → NodeStr<'a> { Borrowed(&str),
Owned(CompactString) }. CompactString avoids heap allocation for
strings up to 12 bytes (on 32-bit targets), which covers most
WhatsApp protocol tokens and short JIDs.
- Vec<(Cow<str>, ValueRef)> for attrs → AttrsRef<'a> { Empty,
Slice(Box<[...]>) }. Boxed slice for the populated case; Empty
variant skips the allocation entirely.
- NodeContentRef::Nodes(Box<Vec<NodeRef>>) → Box<[NodeRef]>. Boxed
slice instead of boxed Vec.
- ValueRef::as_str() now returns Cow<'_, str> unconditionally instead
of Option<&str> — the old None branch (for Jid encoded as bytes) is
gone since Jid carries its own decoded form.
Update js_to_node_ref to construct the new types and convert_attrs to
iterate AttrsRef directly.
- wacore_binary::consts::NOISE_START_PATTERN → NOISE_PATTERN_XX. The constant was renamed when 0.5 split out separate XX, IK, and XXfallback handshake patterns. - NoiseCipher::decrypt_with_counter (allocating, returns Vec) → decrypt_in_place_with_counter (in-place, takes &mut buffer). Wrap the input slice in a fresh Vec for the in-place buffer; the allocation count is unchanged but the AES-GCM path is in-place, matching the encrypt side's existing pattern.
Three small wins on the wasm→JS conversion path: - byte_array.copy_to(&mut vec![0; len]) → byte_array.to_vec() in js_to_node_ref. Skips the zero-init memset before overwriting the buffer. - Uint8Array::new_with_length(n) + result.copy_from(&bytes) → Uint8Array::from(slice) in encode_node, the Bytes content getter, and the bytes_to_uint8array helper used by SessionCipher's encrypt and decrypt return paths. Same FFI semantics, cleaner API.
The original bench shared one Alice/Bob pair across Encrypt, Decrypt,
and Round-trip benches. This was always borderline-correct, but became
broken once encrypt got fast enough for mitata to enable batch mode:
- mitata's min_cpu_time = 642ms / encrypt cost
- Pre-migration: ~169µs/iter → ~3800 iterations during the Encrypt bench
- Post-migration: ~17µs/iter (batch-amortised) → ~38000 iterations
When the Encrypt bench advances Alice's chain solo past 25_000 messages
without Bob seeing them, the next Decrypt bench (Alice encrypts → Bob
decrypts) has Bob trying to skip > MAX_FORWARD_JUMPS (25_000) messages
forward in the chain, which libsignal correctly rejects as
InvalidMessage.
Fix: each `summary()` block builds a fresh (Alice, Bob) pair via
make{Wasm,Lib}Pair() helpers. The Encrypt bench can advance its own
Alice's chain freely; the Decrypt and Round-trip benches operate on
independent pairs that start fresh.
Verified: 5/5 runs of `node --expose-gc benches/signal.ts` pass with
consistent timings (~17µs encrypt, 30-250µs decrypt, ~800µs round-trip
on the Rust WASM side).
Reproducer for the underlying limit:
for (let i = 0; i < 25500; i++) await alice.encrypt(typical);
const enc = await alice.encrypt(typical);
await bob.decryptWhisperMessage(enc.body); // throws InvalidMessage
The threshold is exactly libsignal's MAX_FORWARD_JUMPS = 25_000.
ef7efaf to
68453f6
Compare
Summary
Bumps the
wacore-*/waprotodeps to the 0.5 series and adapts the bridge to the resulting API changes.Notable changes
hashifycross-compile fix: forcesforce-32bitso the PHF token lookup works onwasm32(otherwise some tokens silently returnNonedue to au64 as usizetruncation in the generated code).SessionStoreredesign: newhas_sessionmethod,recordtaken by value (saves one clone per store).wacore-binarytypes:Cow<str>→NodeStr,Vec<(K,V)>→AttrsRef,Box<Vec<NodeRef>>→Box<[NodeRef]>.summary()inbenches/signal.tsnow gets its own session pair. The shared pair was advancing pastMAX_FORWARD_JUMPS = 25_000once encrypt got fast enough for mitata's batch mode.Test plan
bun test— 163 pass / 0 failbun run build— cleannode --expose-gc benches/signal.ts— 5/5 runs consistentSummary by CodeRabbit
Refactor
Chores