Build of git tag failed due to "Concurrency limit reached" but didn't retry #8989

Closed
connorimes opened this issue Mar 2, 2022 · 9 comments
Labels
Needed: replication Bug replication is required

Comments

@connorimes

Details

Expected Result

The build succeeds, or retries the build after a short period, then deploys the docs for the software release (in this case v0.1.0).

Actual Result

The page says

Error

Concurrency limit reached (2), retrying in 5 minutes.

but the build doesn't appear to retry and the tagged release docs are not deployed. It's been almost 20 hours since the build failed.

The problem appears similar to #8772.

humitos added the "Needed: replication" label on Mar 2, 2022
@humitos
Member

humitos commented Mar 2, 2022

I'm not sure what happened here, but it is definitely something weird. Checking the logs, I don't see any error, but I can see that this particular build was not executed properly for some reason.

We haven't noticed a problem with the concurrency limit lately, so I'm not sure what could have happened.

Please trigger a new build for that particular version and let us know if the problem persists.

humitos added the "Needed: more information" label on Mar 2, 2022
@connorimes
Author

Thanks for the quick response. How do I trigger it to try again? I don't see a button anywhere in the RTD web interface to do so.

no-response bot removed the "Needed: more information" label on Mar 2, 2022
@humitos
Member

humitos commented Mar 2, 2022

Any maintainer of the project can trigger another build for a particular version from https://readthedocs.org/projects/energymon-py/builds/ . If you don't see the button there, maybe you are not a maintainer?

@connorimes
Author

Oh, I see it now - I was a little confused by the UI. I didn't realize that "Build Version:" was a button and thought the dropdown list next to it was just going to filter the builds listed below. I expected a rebuild button to be next to each build or within their build status pages.

My recollection is that leading up to the original problem, two builds (a stable and a v0.1.0 build) sat in a Triggered state for >=20 min (I was AFK for a while) before the stable one ran successfully and the v0.1.0 build entered the Failed state. They may have initially been queued behind other latest builds.

I triggered a new build after your previous message, which has now been in the Triggered state for >30 min, and I now see that another build was triggered about 23 min later (~12:26 AM EST). However, I only manually triggered one build, so I'm not sure where the second one came from. The manually triggered build definitely wasn't queued behind any other builds.

@connorimes
Author

A third build was triggered, and I'm thinking these extra builds were initiated by my refreshing the web page after initiating the first manual build. There doesn't appear to be any harm beyond wasting resources.

All three builds have now passed and the v0.1.0 docs have deployed. Unless you want to debug the problem further, I think you can close this issue. Thanks again for the help.

@humitos
Member

humitos commented Mar 3, 2022

I'm sorry for all the confusion here. Several different things happened at once, and this generated a lot of confusion for all of us 😓

First of all, we had a problem yesterday where our AWS instances weren't scaling automatically based on build demand because of a bug that is not yet solved. Because of that, a lot of builds were queued and they took a long time to be picked up by the workers.

In the meantime, while I was testing that problem, I manually triggered some builds for your project (that's why you saw more builds than the ones you triggered by yourself).

Finally, everything should be working fine now and you should not experience this issue anymore. Please, let us know if any of these issues continue happening.

@humitos
Member

humitos commented Mar 3, 2022

I think that the original build linked in the description (https://readthedocs.org/projects/energymon-py/builds/16234381/) exceeded the maximum number of retries (25), and that's why it was "never" retried and why the message changed.
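To make the numbers concrete, here is a rough sketch of the retry budget described above. It is not Read the Docs' actual code: the function and class names are hypothetical, and it assumes retries are spaced exactly 5 minutes apart (taken from the error message) with a cap of 25 (taken from this comment).

```python
from dataclasses import dataclass

MAX_RETRIES = 25          # cap mentioned in the comment above
RETRY_DELAY_MINUTES = 5   # assumed from "retrying in 5 minutes"


@dataclass
class RetryOutcome:
    started: bool         # True if a slot became free before retries ran out
    retries_used: int
    minutes_waited: int


def simulate_build(minutes_until_slot_free: int,
                   max_retries: int = MAX_RETRIES,
                   delay: int = RETRY_DELAY_MINUTES) -> RetryOutcome:
    """Hypothetical model: if no build slot is free, wait `delay` minutes
    and check again, giving up once `max_retries` retries are spent."""
    waited = 0
    retries = 0
    while waited < minutes_until_slot_free:
        if retries >= max_retries:
            # Retry budget exhausted; the build is never picked up.
            return RetryOutcome(False, retries, waited)
        retries += 1
        waited += delay
    return RetryOutcome(True, retries, waited)


# A 20-minute queue is well within the budget; a stall of several
# hours is not.
short_wait = simulate_build(20)
long_wait = simulate_build(4 * 60)
```

Under these assumptions the retry window is 25 × 5 = 125 minutes, so a build whose slot stays unavailable for longer than that would exhaust its retries and not run.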

@connorimes
Author

Thanks for the detailed update, and more generally for maintaining RTD.

There were 7 commits pushed to the master branch that day (4 of which were part of a merge from a fork, so shouldn't have been built separately), so I think there should've been 4 latest builds triggered in a fairly short period of time. Then the tag was pushed, which I believe would've triggered the stable and v0.1.0 builds. Each build takes <1 min to complete. 25 retries (at 5 minute intervals?) seems like more than enough, so I suppose the question is: why wasn't it? The problem is resolved on my end, so it's just a matter of whether your team thinks it's worth investigating further.

Cheers.

@humitos
Member

humitos commented Mar 7, 2022

I'm going to close this issue because it is hard to make assumptions about what happened, given that we had a separate problem in our AWS infrastructure at the same time. I don't expect this issue to happen under normal circumstances. However, please let us know or reopen it if you experience it again.

humitos closed this as completed on Mar 7, 2022
2 participants