Worker: ensure workers are released after an error on run:complete #685

Merged
merged 6 commits into main from worker-issues
May 10, 2024

Conversation

josephjclark
Collaborator

Short Description

If the run:complete event happens to throw an error, the worker will not close out a Run properly, i.e. it will not free up capacity and claim another run.

This is causing the worker to "jam" silently.
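
To illustrate the failure mode and the general shape of a fix, here is a minimal TypeScript sketch. It is not the worker's actual code: `WorkerSketch`, `SendEvent`, `claim`, `complete` and `freeSlots` are hypothetical names, and the sketch assumes capacity is tracked as a set of active run ids. The point is that the slot is released in a `finally` block, so a throwing or timed-out run:complete cannot leave it held.

```typescript
// Hypothetical sketch only: none of these names come from the worker codebase.
type SendEvent = (event: string, payload: { runId: string }) => Promise<void>;

class WorkerSketch {
  private capacity: number;
  private send: SendEvent;
  private active = new Set<string>();
  // Collected log lines, so failures are visible rather than silent.
  errors: string[] = [];

  constructor(capacity: number, send: SendEvent) {
    this.capacity = capacity;
    this.send = send;
  }

  // Number of runs this worker could still claim.
  get freeSlots(): number {
    return this.capacity - this.active.size;
  }

  // Take a slot for a run; returns false when the worker is at capacity.
  claim(runId: string): boolean {
    if (this.freeSlots <= 0) return false;
    this.active.add(runId);
    return true;
  }

  // Report completion, then release the slot whether or not the event succeeded.
  async complete(runId: string): Promise<void> {
    try {
      await this.send("run:complete", { runId });
    } catch (err) {
      // Record the failure instead of letting it escape and skip the cleanup.
      this.errors.push(`run:complete failed for ${runId}: ${String(err)}`);
    } finally {
      // Always runs, so an error above cannot jam the worker.
      this.active.delete(runId);
    }
  }
}
```

Without the `finally`, an error thrown by `send` would skip the `delete` call and the worker would stay at capacity indefinitely, which matches the silent jam described above.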

I do not know WHY run:complete is timing out on us right now (see these logs), and we'll have to work that out. But at least the worker shouldn't die silently any more.

I've also added a little more logging around this. Most of the time it's noise, but it may help us diagnose cases where the worker isn't releasing threads from the pool.

Related issue

None raised; see Slack.

josephjclark changed the title from "W" to "Worker: ensure workers are released after an error on run:complete" May 10, 2024
@josephjclark
Collaborator Author

@taylordowns2000 This is ready to go and the image is built. Can we make sure we run the end-to-end tests on it before going live?

josephjclark merged commit 75f9087 into main May 10, 2024
6 checks passed
josephjclark deleted the worker-issues branch May 10, 2024 12:01
Projects
Status: Done