ImportedFile: use BigAutoField for primary key #9669

Merged
merged 4 commits into from
Jan 30, 2025


stsewd
Member

@stsewd stsewd commented Oct 18, 2022

We could disable search indexing while we run the migration, but I don't think that should be required: we have 11M records, and migrating the SphinxDomain model took 15 min when it had ~56M.

In [7]: ImportedFile.objects.count()
Out[7]: 11527437

So some 3 min of not being able to index new versions doesn't seem bad... There are two things that could happen:

  • The query times out and we don't index that version.
  • The query waits till the migration is done, nothing gets lost.

But if we disable search indexing we definitely won't index new versions.

We don't use those models outside search indexing, so doc serving and such shouldn't be affected.
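The ~3 min figure above can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it assumes migration time scales linearly with row count, which ignores differences in row width, indexes, and database load between the two tables. The constant and function names are made up for illustration.

```python
# Figures quoted in the description above.
SPHINXDOMAIN_ROWS = 56_000_000   # "~56M", approximate
SPHINXDOMAIN_MINUTES = 15        # observed time for the SphinxDomain migration
IMPORTEDFILE_ROWS = 11_527_437   # ImportedFile.objects.count()


def estimate_minutes(rows, ref_rows=SPHINXDOMAIN_ROWS, ref_minutes=SPHINXDOMAIN_MINUTES):
    """Scale the reference migration time linearly by row count (an assumption)."""
    return ref_minutes * rows / ref_rows


estimated = estimate_minutes(IMPORTEDFILE_ROWS)
print(f"Estimated ImportedFile migration time: {estimated:.1f} min")
```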

ref #9492


@stsewd stsewd requested a review from a team as a code owner October 18, 2022 15:08
@stsewd stsewd requested a review from humitos October 18, 2022 15:08
@humitos
Member

humitos commented Aug 22, 2023

@stsewd
Member Author

stsewd commented Aug 22, 2023

If everything works as expected, we can wait till we have a smaller table, yeah.

@humitos humitos added the Status: blocked Issue is blocked on another issue label Aug 23, 2023
@stsewd
Member Author

stsewd commented Sep 26, 2023

We have 12M records now :D, so we should probably wait until we have run a re-index, which will delete old files.

In [1]: ImportedFile.objects.count()
Out[1]: 12277094

@humitos
Member

humitos commented Jan 17, 2024

We have more files now 🙃

In [1]: ImportedFile.objects.count()
Out[1]: 13833598

What should we do here?

@stsewd
Member Author

stsewd commented Jan 17, 2024

> We have more files now 🙃
>
> In [1]: ImportedFile.objects.count()
> Out[1]: 13833598
>
> What should we do here?

We are still tracking index files for new projects, and it doesn't look like this number will decrease on its own (old projects triggering builds), so a manual re-index is needed.

@ericholscher
Member

This seems worth doing if we think we should do it :)

@stsewd
Member Author

stsewd commented Jan 28, 2025

We are at 36% capacity on IDs for ImportedFile. It should take some years to fill up, but better to deal with that now than later when things are on fire.
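To put the 36% figure in numbers, here is a rough sketch. It assumes the primary key was a Django `AutoField`, which is backed by a 32-bit signed integer, and that "36% capacity" refers to the highest ID issued relative to that maximum. The variable names are illustrative only.

```python
# Upper bounds of the two Django auto-incrementing primary key types.
AUTOFIELD_MAX = 2**31 - 1      # 2_147_483_647 (32-bit signed integer)
BIGAUTOFIELD_MAX = 2**63 - 1   # 9_223_372_036_854_775_807 (64-bit signed integer)

USED_FRACTION = 0.36           # "36% capacity", from the comment above

# Approximate number of IDs already consumed and still available.
ids_used = round(AUTOFIELD_MAX * USED_FRACTION)
ids_remaining = AUTOFIELD_MAX - ids_used

# The same number of consumed IDs as a share of the BigAutoField range.
big_used_fraction = ids_used / BIGAUTOFIELD_MAX

print(f"IDs used so far:      ~{ids_used:,}")
print(f"IDs left (AutoField): ~{ids_remaining:,}")
print(f"Share of BigAutoField range used: {big_used_fraction:.2e}")
```

Note that the number of IDs consumed is far larger than the ~13.8M rows counted earlier in the thread. That is consistent with rows being deleted and recreated over time: auto-incrementing IDs are not reused, so the sequence keeps growing even if the table does not.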

@stsewd stsewd merged commit f041c62 into main Jan 30, 2025
8 checks passed
@stsewd stsewd deleted the use-bigautofield-importedfile branch January 30, 2025 15:49