Task to finish build due inactivity has an edge case #4386
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there a way to replicate this locally?
@stsewd yes,
and you will hit the edge case.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still valid, but very low priority
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still valid bot.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still valid.
This function should check whether we have a Celery task queued for the build, and mark the build as finished only if we do not. Otherwise, we may mark as finished builds that are being retried because of concurrency. Also, by using the queue we will be 100% accurate in our decision, and it does not need to be based on "time since the build was triggered" or similar.
Yesterday, while deploying the community site, we stopped the Celery queues for a while and many tasks accumulated in them. I triggered a couple of builds and no worker executed them within 1080 seconds (900s + 20%). Because of this, these builds were picked up by the `finish_inactive_builds` periodic task: https://github.com/rtfd/readthedocs.org/blob/1f8351b593394efc9f20533776f4e48a018bbb56/readthedocs/projects/tasks.py#L1195-L1206
This task marked them as FAILED and added the error message to them. After a while I saw my build like this:
Then, when the workers recovered and started executing pending builds from the queue, these builds were executed, and the same build id now shows a completely different message: COMPLETED.
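To make the timing concrete, here is a minimal sketch of a purely time-based inactivity check using the numbers from this report (900s + 20% = 1080s). The names `BUILD_TIME_LIMIT`, `GRACE_PERCENT`, `inactivity_cutoff` and `is_considered_inactive` are hypothetical and not taken from the linked code; the point is only that such a check cannot tell a dead build from one still waiting in the queue.

```python
from datetime import datetime, timedelta

BUILD_TIME_LIMIT = 900  # seconds, figure quoted in this report
GRACE_PERCENT = 20      # the "+ 20%" quoted in this report


def inactivity_cutoff(now, time_limit=BUILD_TIME_LIMIT):
    """Return the timestamp before which an unfinished build counts as inactive."""
    # Integer arithmetic: 900 * 120 // 100 == 1080 seconds.
    allowed = time_limit * (100 + GRACE_PERCENT) // 100
    return now - timedelta(seconds=allowed)


def is_considered_inactive(triggered_at, now):
    """Time-only decision: a build still sitting in a stalled queue looks
    exactly like a build whose worker died."""
    return triggered_at < inactivity_cutoff(now)
```

A build triggered 1081 seconds before the check runs is flagged as inactive even if its task is still queued, which matches the behaviour described above.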
One possible way to solve this is to save the `task_id` in the `Build` object, and before marking the build as FAILED we could check through Celery whether that `task_id` is still alive or pending for a worker to take it. As far as I know, at this point we have no way to link a Build object with a Celery task.
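A rough sketch of that idea follows, under several assumptions. The `Build` class is a stand-in dataclass, not the real model; `task_id` is the proposed new field; `finish_if_inactive` and `get_task_state` are hypothetical names. With Celery, the state lookup could be something like `lambda task_id: AsyncResult(task_id).state`. One caveat: Celery also reports `PENDING` for task ids it knows nothing about, so this check errs on the side of leaving a build alone.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Celery state names that mean "not finished yet".
# Note: PENDING is also what Celery returns for unknown task ids.
WAITING_OR_RUNNING = {"PENDING", "RECEIVED", "STARTED", "RETRY"}


@dataclass
class Build:
    """Hypothetical stand-in for the Build model."""
    id: int
    state: str = "triggered"
    success: bool = True
    error: str = ""
    task_id: Optional[str] = None  # proposed field linking to the Celery task


def finish_if_inactive(build: Build, get_task_state: Callable[[str], str]) -> bool:
    """Mark the build as failed unless its Celery task is still queued or running.

    Returns True if the build was marked as failed.
    """
    if build.task_id is not None:
        # e.g. get_task_state = lambda task_id: AsyncResult(task_id).state
        if get_task_state(build.task_id) in WAITING_OR_RUNNING:
            return False  # a worker may still pick it up; leave it alone
    # No linked task, or the task already ended: fall back to the current behaviour.
    build.state = "finished"
    build.success = False
    build.error = "This build was terminated due to inactivity."
    return True
```

Passing the state lookup in as a callable keeps the decision logic testable without a running broker; whether the result backend keeps enough state for this to be reliable is a separate question.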