Beta dashboard: only the first page of the projects list is accessible #11061
Comments
Hrm, interesting. I wasn't able to reproduce this issue while logged in with my user. I thought it could be because you have a ton of projects, but I found that we have a similar number, so I'm discarding that. I know we were facing some performance issues on this particular dashboard page (readthedocs/ext-theme#128), but I'm not sure why it would affect your user but not mine. cc @agjohnson any clue?
I have actually noticed something similar with a couple of projects and the build list (loading the build list for one project gave a 5xx, while another project with more builds did not). I think there is something in both cases that seems to be causing a timeout on the application side. We should probably look into exceptions and logging.
This is the Sentry issue for the 500 when accessing a user profile: https://read-the-docs.sentry.io/issues/4413013558/?project=148442&query=is%3Aunresolved&referrer=issue-stream&statsPeriod=14d&stream_index=2
@stsewd FYI Sentry has a feature for sharing tracebacks. It gives you a link that lets others see a reduced portion of the page, without any sensitive data, which is useful for sharing publicly.
This seems to be a different error, since it's a 500 and they are getting a 502. This 500 error seems to be related to readthedocs.org/readthedocs/builds/models.py, lines 725 to 731 in 9f3ea4c.
In the templates, we are making the assumption that all the builds are attached to a version, when it's not true due to the
Hrm yeah, that would explain the error I hit. Maybe I'm misremembering and it was a 500 error, not a timeout. That may be a little tricky to solve, as the build list UI relies more on the relationships between build and version (in filtering, listing by version, etc.). I'll have to think more on a fix there. But yeah, that does seem like a different error than the timeout here. I'm guessing New Relic might have some information on the long transaction.
We set it to null to keep the history of builds, yes. We populate some fields from the version into the build for any other operation we may need: readthedocs.org/readthedocs/builds/models.py, lines 942 to 945 in dcd4655.
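To illustrate the behavior described above, here is a minimal sketch in plain Python (not the actual Django models). It assumes that, before the version link is cleared, a few display fields are copied from the version onto the build. Only `build.version` and `build.version_name` come from this thread; `Version.verbose_name`, `Version.slug`, `Build.version_slug`, and the `detach_version` helper are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    # Hypothetical field names, for illustration only.
    verbose_name: str
    slug: str


@dataclass
class Build:
    version: Optional[Version]
    # Denormalized copies that survive after the version is removed.
    version_name: str = ""
    version_slug: str = ""


def detach_version(build: Build) -> None:
    """Copy display fields from the version onto the build, then clear the link.

    Hypothetical helper: the build stays in the history with
    ``build.version = None`` but keeps the name it was built for.
    """
    if build.version is not None:
        build.version_name = build.version.verbose_name
        build.version_slug = build.version.slug
    build.version = None
```

Any code or template that still reads `build.version.<attribute>` after this point is dereferencing `None`, which is the assumption discussed in this thread.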
I was having trouble tracking down any worthwhile information in New Relic. I think I see some of the transactions, and I see the build list SELECT taking a few seconds, but that's about it. So I configured a build with a null version like @stsewd described and was able to reproduce both a 500 on the user profile and what would probably be a timeout on the dashboard project listing (response time was more than 10s). I did not see a long database select time here though. While profiling this locally, I only noticed the slow response time that I've noted with some template errors. I'm not convinced that this would affect production, but I also can't say for sure that it doesn't.
When versions are removed, the builds remain with a `build.version = None`, along with some additional attributes on the Build instance (for example `build.version_name`). This bug affected any display that listed projects/versions that listed the most recent build. Fixes readthedocs/readthedocs.org#11061
So, I believe I fixed the underlying template bug triggering this behavior in: But I think the server response might instead be this: I say this because I can't see a reason why the PR above should fix a 502 response. I also noticed the same problem with inflated SQL queries: db time is ~1000ms without the PR above and 43ms with the fix. Also, the offending template code was also in an If the PR above fixes the 502 response, the template bug I noticed while working on templates might very well exist in production as well.
* Use build.version_name first, in case build.version is None

  When versions are removed, the builds remain with a `build.version = None`, along with some additional attributes on the Build instance (for example `build.version_name`). This bug affected any display that listed projects/versions that listed the most recent build. Fixes readthedocs/readthedocs.org#11061

* Use get_version_name method instead of template logic
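The fallback described in these commit messages could look roughly like the sketch below. This is plain Python rather than the project's Django model, and it is not the code from the PR: only `build.version`, `build.version_name`, and the method name `get_version_name` come from this thread. The `Version.verbose_name` field and the `"unknown"` default are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    verbose_name: str  # hypothetical field name


@dataclass
class Build:
    version: Optional[Version] = None
    version_name: Optional[str] = None

    def get_version_name(self) -> str:
        """Return a display name without assuming the version still exists."""
        # Prefer the name stored on the build itself, since the related
        # version may have been removed (build.version is None).
        if self.version_name:
            return self.version_name
        if self.version is not None:
            return self.version.verbose_name
        # Assumed placeholder for builds with no version information at all.
        return "unknown"
```

Calling one method from the templates, instead of repeating the `None` check in each template, keeps every listing that shows the most recent build consistent.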
@agjohnson has it been deployed? I just checked again and I'm seeing exactly the same bug with HTTP 502. Nothing changed for me.
@webknjaz It was not yet released, but I just pushed out a somewhat hacky release. The full release will go out on Tues. I can load your profile page now. Are you able to load your dashboard page 2?
Yes, I can see both profile and dashboard. The pagination also works on both. Thanks!
Great, I'm glad that fixed the issue! I think that confirms that while the actual bug here was a minor template error, the underlying behavior you noticed with a timeout is actually a bug with the application exception handling or logging.
Details
Expected Result
I want to see the second page of projects on the beta host.
Actual Result
https://beta.readthedocs.org/dashboard/ works, but clicking on the `[2]` at the bottom, which attempts to open https://beta.readthedocs.org/dashboard/?page=2, gets stuck for half a minute and then shows Cloudflare's `502: Bad gateway` web page. This happens consistently, so I suppose there's an unhandled exception or a timeout somewhere in the new beta website.