test: deflake TraxAuditWriterTests by polling sink state #43
Merged
Replaces fixed Task.Delay(300/500/800) waits across four tests with a WaitUntilAsync helper that polls the sink's batches or attempts counter until the expected condition holds. The 10-second timeout is a safety ceiling, not a sleep: the tests finish as soon as the flush loop or retry path lands the expected effect.

Affects:
- Drains_BatchFull_FlushesImmediately
- Drains_PartialBatch_FlushesOnInterval
- SinkThrowsBeyondMaxRetries_DropsBatch
- SinkThrows_Retries_ThenSucceeds

The intentional Task.Delays in Stop_DuringRetryBackoff_PropagatesCancellation and Drains_QuietChannel_Stops_WithoutFlushing are left untouched; those verify behavior during a known-duration sleep, not after one.
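The PR text does not show the helper itself, so the following is only a minimal sketch of what a polling wait of this kind might look like. The class name PollingAssert, the pollInterval parameter, and the 10 ms default interval are assumptions made for illustration; only the WaitUntilAsync name and the 10-second ceiling come from the description above.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class PollingAssert
{
    // Hypothetical signature: the real helper in the PR may differ.
    public static async Task WaitUntilAsync(
        Func<bool> condition,
        TimeSpan? timeout = null,
        TimeSpan? pollInterval = null)
    {
        // 10 seconds is a safety ceiling, not a sleep: the loop exits
        // as soon as the condition becomes true.
        var limit = timeout ?? TimeSpan.FromSeconds(10);
        var interval = pollInterval ?? TimeSpan.FromMilliseconds(10);
        var stopwatch = Stopwatch.StartNew();

        while (!condition())
        {
            if (stopwatch.Elapsed > limit)
            {
                throw new TimeoutException(
                    $"Condition was not met within {limit.TotalSeconds:F1}s.");
            }

            await Task.Delay(interval);
        }
    }
}
```

A test would then replace a fixed delay with something like `await PollingAssert.WaitUntilAsync(() => sink.Batches.Count == 1);`, where `sink.Batches` stands in for whatever counter the test sink actually exposes. Throwing TimeoutException on expiry keeps a genuinely stuck flush loop from hanging the test run.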
The helper called WebSocket.ReceiveAsync once and parsed whatever bytes came back. That fails non-deterministically when:
- A text message arrives in multiple frames (EndOfMessage=false on the first read), so the partial buffer fails JSON parsing.
- The host returns a zero-length frame before the actual payload, so the empty buffer triggers "input does not contain any JSON tokens".
- The host closes the socket while the test is waiting, in which case the test surfaced a confusing JsonReaderException instead of a useful diagnostic.

Replaces both ReceiveAsync helpers (in SubscriptionE2ETests and SubscriptionPrincipalPropagationE2ETests) and the local ReceiveNextAsync loop with a version that:
- Accumulates fragmented frames into a MemoryStream until EndOfMessage
- Throws a clear InvalidOperationException with CloseStatus details on unexpected close frames
- Skips zero-length frames during accumulation rather than failing on empty input

Bumps the receive timeout in SubscriptionE2ETests from 5 to 10 seconds to match the sibling file and provide a single uniform CI ceiling.

Surfaced by a CI run of the audit-writer deflake PR where RepeatedEvents_SamePrincipalEveryTime hit the empty-buffer path.
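The three behaviors listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the class and method names (WebSocketTestHelpers, ReceiveTextMessageAsync) and the 4096-byte buffer are hypothetical, and the sketch assumes that a completed message with no bytes should be skipped rather than returned.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class WebSocketTestHelpers
{
    // Hypothetical helper name; reads one complete, non-empty text message.
    public static async Task<string> ReceiveTextMessageAsync(
        WebSocket socket,
        CancellationToken cancellationToken)
    {
        var buffer = new byte[4096];
        using var message = new MemoryStream();

        while (true)
        {
            WebSocketReceiveResult result = await socket.ReceiveAsync(
                new ArraySegment<byte>(buffer), cancellationToken);

            // Fail with a useful diagnostic instead of a JSON parse error.
            if (result.MessageType == WebSocketMessageType.Close)
            {
                throw new InvalidOperationException(
                    "Socket closed while waiting for a message: " +
                    $"{result.CloseStatus} ({result.CloseStatusDescription}).");
            }

            // Writing zero bytes is a no-op, so empty frames add nothing.
            message.Write(buffer, 0, result.Count);

            // Return only once a full message with content has arrived;
            // an empty completed message is skipped and the loop continues.
            if (result.EndOfMessage && message.Length > 0)
            {
                return Encoding.UTF8.GetString(message.ToArray());
            }
        }
    }
}
```

The 10-second receive timeout could be supplied by the caller, for example with `new CancellationTokenSource(TimeSpan.FromSeconds(10)).Token`, so the helper itself stays free of timing constants.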
Same single-shot ReceiveAsync pattern as the previous commit. The flake probability is lower because the test catches WebSocketException, but a JsonReaderException on a partial buffer escapes the catch and fails the rejection check for the wrong reason. Mirrors the accumulate-until-EndOfMessage handling.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
This PR is included in version 1.25.0.
Summary
Replaces fixed Task.Delay(300/500/800) waits across four tests with a WaitUntilAsync helper that polls the sink's Batches or Attempts counter until the expected condition holds. Each test finishes as soon as the flush loop or retry path lands the expected effect.

Affected tests:
- Drains_BatchFull_FlushesImmediately
- Drains_PartialBatch_FlushesOnInterval
- SinkThrowsBeyondMaxRetries_DropsBatch
- SinkThrows_Retries_ThenSucceeds

The 10-second timeout is a safety ceiling, not a sleep. The intentional Task.Delays in Stop_DuringRetryBackoff_PropagatesCancellation and Drains_QuietChannel_Stops_WithoutFlushing are left untouched: those verify behavior during a known-duration sleep, not after one.

Local run for these 6 tests drops from ~2.4s to ~540ms.
Test plan
- dotnet build: zero warnings
- dotnet csharpier check .: clean
- dotnet test --filter TraxAuditWriterTests: 6/6 passing in 540ms