Skip to content

fix(webhooks): keep retries scheduled past the publish boundary#285

Merged
adekbadek merged 3 commits into
mainfrom
fix/webhooks-max-retries-flaky
Jun 11, 2026
Merged

fix(webhooks): keep retries scheduled past the publish boundary#285
adekbadek merged 3 commits into
mainfrom
fix/webhooks-max-retries-flaky

Conversation

@adekbadek

Copy link
Copy Markdown
Member

All Submissions:

Changes proposed in this Pull Request:

Fixes the intermittently-failing Newspack_Test_Webhooks::test_request_max_retries, which was red-lighting unrelated PRs on unlucky CI timing.

The failure was a real timing bug, not a test problem. The webhook retry mechanism parks a failed request as a future WordPress post and re-processes it when WordPress republishes it. WordPress core (wp_insert_post(), wp-includes/post.php) flips a future post straight to publish when its date is less than MINUTE_IN_SECONDS (60s) away. schedule_request() scheduled the shortest backoff at exactly strtotime( '+1 minutes' ) = +60s, sitting right on that edge. When a wall-clock second ticked between computing that timestamp and WordPress' own gmdate() check, the gap became 59s and the post published immediately, firing an extra transition_post_statusprocess_request → recorded error. The retry counter (count( $errors )) then hit MAX_RETRIES one iteration early, so the request was trashed before its last attempt.

Empirically: replaying the old schedule math flipped to publish ~0.2% of the time; with the fix, 0/3000.

Changes:

  • schedule_request() clamps the delay to a 1-minute minimum and adds a named SCHEDULE_HEADROOM_SECONDS (5s) buffer, so a scheduled retry reliably clears the publish boundary regardless of sub-second timing.
  • Derives both post_date and post_date_gmt from a single gmdate() (WordPress forces the PHP timezone to UTC), dropping a no-op strtotime() round-trip and its phpcs:ignore.
  • Adds test_scheduled_retry_clears_publish_boundary — a deterministic regression test asserting the scheduled offset clears MINUTE_IN_SECONDS.

Behaviour change is publisher-invisible: retries land ~5s later, never sooner.

Closes # .

How to test the changes in this Pull Request:

  1. From plugins/newspack-plugin, run the new regression test: n test-php --filter test_scheduled_retry_clears_publish_boundary (green; it fails on main with Failed asserting that 60 is greater than 60).
  2. Loop the previously-flaky test to confirm stability: for i in $(seq 1 30); do n test-php --filter test_request_max_retries || break; done (all green).
  3. Run the full webhooks class: n test-php --filter Newspack_Test_Webhooks (26 tests, 56 assertions).

Other information:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes, as applicable?
  • Have you successfully run tests with your changes locally?

WordPress publishes a 'future' post immediately when its date is less
than MINUTE_IN_SECONDS away, and the shortest webhook retry backoff was
scheduled at exactly +60s. A wall-clock second elapsing before WordPress'
own check could shave the gap to 59s, publishing the retry early, firing
an extra processing pass, and burning one of the 15 attempts. This made
Newspack_Test_Webhooks::test_request_max_retries flaky (trashing the
request one iteration early). Clamp the delay to a 1-minute minimum and
add a few seconds of headroom so a scheduled retry reliably clears the
boundary, and add a deterministic regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 11, 2026 10:58

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a timing edge case in the webhook retry scheduler where a retry scheduled at exactly +60s could be immediately published by WordPress core (when it becomes < MINUTE_IN_SECONDS away), intermittently consuming an extra retry and causing test_request_max_retries to fail.

Changes:

  • Add a small scheduling headroom (SCHEDULE_HEADROOM_SECONDS) and compute scheduled timestamps via time() + delay + headroom to reliably keep retries in future.
  • Simplify post_date/post_date_gmt calculation by deriving both from a single gmdate() value.
  • Add a deterministic regression test asserting scheduled retries clear the MINUTE_IN_SECONDS publish boundary.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
plugins/newspack-plugin/includes/data-events/class-webhooks.php Adds headroom and clamps minimum delay to prevent “future → publish” boundary flips when scheduling retries.
plugins/newspack-plugin/tests/unit-tests/webhooks.php Adds a regression test covering the publish-boundary timing issue.

Comment thread plugins/newspack-plugin/tests/unit-tests/webhooks.php Outdated
Comment thread plugins/newspack-plugin/includes/data-events/class-webhooks.php
adekbadek and others added 2 commits June 11, 2026 13:36
Wrap the new regression test in try/finally so the action-scheduler
filter is removed even if an assertion fails, and assert a request
exists before indexing it. Document that schedule_request() clamps the
delay to a 1-minute minimum.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@adekbadek adekbadek marked this pull request as ready for review June 11, 2026 11:54
@adekbadek adekbadek requested a review from a team as a code owner June 11, 2026 11:54
@adekbadek adekbadek merged commit 91643e3 into main Jun 11, 2026
7 checks passed
@adekbadek adekbadek deleted the fix/webhooks-max-retries-flaky branch June 11, 2026 11:55
@github-actions

Copy link
Copy Markdown

Hey @adekbadek, good job getting this PR merged! 🎉

Now, the needs-changelog label has been added to it.

Please check if this PR needs to be included in the "Upcoming Changes" and "Release Notes" doc. If it doesn't, simply remove the label.

If it does, please add an entry to our shared document, with screenshots and testing instructions if applicable, then remove the label.

Thank you! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants