Build: improve concurrency limit retry message #9011
Comments
Are the logs indicating we are retrying builds over and over until failing the build? That is, are these legitimate concurrency retries? Some of the reports we've collected on this bug make it seem like the error is in our retry/concurrency logic, and the issue doesn't seem related to how we are communicating build retries. For example, this build and the prior builds for this project all seem suspect: https://readthedocs.org/projects/overte-docs/builds/16548600/ The last passing build took 94 seconds and was several days before the first failed concurrency-limited builds. This project does have translations that could cause concurrency limit conflicts, but builds never being retried only seems to have been an issue since the last release.
I'm not talking about a bug on concurrency limits here, just making the message shown to the user better.
I understand, but I think focusing on the error message is focusing on a symptom of the error instead of the error itself. Users are more confused about the builds never retrying than they are about how many times the build has been retried. In the case above, the error message wouldn't help the user; the builds still ultimately never retry. If we address the underlying issue of builds not retrying, the user likely won't see the count on the error message anyway. I've asked users about retries and they don't notice the retry mechanism, they just wait for the build to finish.
Yeah, I know. I'm not focusing on the error message instead of the error. They are just two different problems. One is the retry not working, and a different issue (labeled as "Improvement") is the error message. It could happen that the user stays at the concurrency limit during the whole 25 * 5m, so that particular build won't retry again, and we end up communicating a wrong message. That case is what this issue is about.
Just noting from #9148 that the actual exception would be |
When a build is concurrency limited, we show the maximum number of concurrent builds. Besides that limit, it would be good to show the number of times the build has retried already and the number of remaining retries.
After that, if the build reaches the maximum number of retries, we should update the message accordingly to communicate that the build has reached this limit and won't be retried again.
We need a message like that. Otherwise, users believe the build is going to be retried again when it's not gonna be retried because the retry limit was hit (note that having a high number of retries could interfere with #8269 and it could be killing the builds).
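A minimal sketch of what such a message could look like. The function name, parameters, and wording here are hypothetical and are not taken from the Read the Docs codebase. The numbers in the usage note (25 retries, 5 minutes apart) come from the "25 * 5m" mentioned in the comments, and the concurrency limit of 2 is made up for the example.

```python
def concurrency_limit_message(
    limit: int,
    retries_done: int,
    max_retries: int,
    retry_delay_minutes: int = 5,
) -> str:
    """Build a user-facing message for a concurrency-limited build.

    Hypothetical helper: all names and wording are illustrative only.
    """
    # Never report a negative number of remaining retries.
    remaining = max(max_retries - retries_done, 0)
    prefix = f"Concurrency limit reached ({limit} concurrent builds). "

    if remaining == 0:
        # The retry budget is spent: say clearly that nothing else will happen.
        return (
            prefix
            + f"This build was retried {retries_done} times "
            + "and will not be retried again."
        )

    # Retries are still available: show progress and what to expect next.
    return (
        prefix
        + f"Retry {retries_done} of {max_retries}; "
        + f"{remaining} retries remaining, "
        + f"next attempt in about {retry_delay_minutes} minutes."
    )
```

For example, `concurrency_limit_message(2, 3, 25)` reports retry 3 of 25 with 22 retries remaining, while `concurrency_limit_message(2, 25, 25)` switches to the "will not be retried again" wording.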
Places in the code:
readthedocs.org/readthedocs/doc_builder/exceptions.py
Lines 52 to 54 in 60be1bd
readthedocs.org/readthedocs/projects/tasks/builds.py
Line 226 in 60be1bd
Note there is a MaxRetriesExceededError exception that we can handle to update the build's error message when we reach the limit: https://docs.celeryproject.org/en/stable/userguide/tasks.html#retrying
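The handling described above could be sketched roughly as follows. To keep the example runnable without Celery, it uses a local stand-in exception and a stand-in `schedule_retry` function. In a real task the exception would be `celery.exceptions.MaxRetriesExceededError`, raised by `self.retry()` once the retry budget is exhausted (as far as the Celery docs describe, only when no `exc` argument is passed). The `build` dictionary, its keys, and the message wording are assumptions for illustration, not the project's actual data model.

```python
class MaxRetriesExceededError(Exception):
    """Stand-in for celery.exceptions.MaxRetriesExceededError (assumption)."""


def schedule_retry(retries_done: int, max_retries: int) -> int:
    """Stand-in for Task.retry(): refuse once the retry budget is spent.

    Returns the new retry count when another retry can be scheduled.
    """
    if retries_done >= max_retries:
        raise MaxRetriesExceededError()
    return retries_done + 1


def handle_concurrency_limit(build: dict, max_retries: int = 25) -> dict:
    """Update a (hypothetical) build record when it is concurrency limited."""
    try:
        build["retries"] = schedule_retry(build["retries"], max_retries)
        build["error"] = (
            f"Concurrency limit reached. "
            f"Retry {build['retries']} of {max_retries} scheduled."
        )
    except MaxRetriesExceededError:
        # Retry limit hit: tell the user this build will not run again.
        build["state"] = "failed"
        build["error"] = (
            f"Concurrency limit reached and the maximum of {max_retries} "
            f"retries was exceeded. This build will not be retried again."
        )
    return build
```

The point of the sketch is only the placement of the `except` branch: the final, "no more retries" message is written where the retry call fails, so the user never sees a message promising a retry that will not happen.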