
migration ignores documents already in DB #445


Merged
merged 2 commits into from
May 8, 2025

Conversation

GracefulLemming
Contributor

Modifies fetch_sheet to skip documents that already exist in the database. To determine whether a document exists, the function uses the query document_id_from_name to compare the short name from a metadata sheet against the short names in the database. If there is a match, it skips the rest of the spreadsheet; otherwise, it adds the sheet as normal.

This is an imprecise solution meant to facilitate the creation of a new edited collection in the short term. One major limitation is that it cannot add new words or annotation layers to an existing document. In the future, we will likely want more granular comparisons so that new information in sheets can be merged with existing information in the database.
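The skip-if-exists behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: `FakeDatabase`, `insert_document`, and `migrate_sheet` are hypothetical names, and the in-memory dictionary only stands in for the real database. Only the name `document_id_from_name` comes from the description above, and its real signature may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FakeDatabase:
    """Hypothetical in-memory stand-in for the real database."""

    # Maps a document's short name to its id.
    documents: Dict[str, int] = field(default_factory=dict)
    next_id: int = 1

    def document_id_from_name(self, short_name: str) -> Optional[int]:
        # Plays the role of the query named in the PR description:
        # return the id of the document with this short name, if any.
        return self.documents.get(short_name)

    def insert_document(self, short_name: str) -> int:
        # Hypothetical helper: store a new document and return its id.
        doc_id = self.next_id
        self.documents[short_name] = doc_id
        self.next_id += 1
        return doc_id


def migrate_sheet(db: FakeDatabase, short_name: str) -> bool:
    """Return True if the sheet was migrated, False if it was skipped."""
    if db.document_id_from_name(short_name) is not None:
        # The short name already exists, so the whole sheet is skipped.
        # Nothing in the existing document is updated or merged.
        return False
    db.insert_document(short_name)
    return True


db = FakeDatabase()
first_run = migrate_sheet(db, "example-doc")   # new document, gets added
second_run = migrate_sheet(db, "example-doc")  # already present, skipped
```

Because the check is all-or-nothing on the short name, a second run never touches the existing document, which is exactly the limitation noted above: new words or annotation layers in the sheet are not picked up.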


netlify bot commented Feb 27, 2025

Deploy Preview for dailp canceled.

Name Link
🔨 Latest commit 23f92a0
🔍 Latest deploy log https://app.netlify.com/sites/dailp/deploys/67c0f2912a36550008cbb060

Collaborator

@nole2701 nole2701 left a comment


Looks good to me. I was wondering, though, why you need the 1 second wait on line 212. And when it comes to updating documents that exist but have changed, is it not viable to just overwrite them?

@GracefulLemming
Contributor Author

> Looks good to me. I was wondering, though, why you need the 1 second wait on line 212. And when it comes to updating documents that exist but have changed, is it not viable to just overwrite them?

@nole2701 I'm actually not sure why that wait is there, although I suspect it has to do with Google rate limiting. I will look into this more later!

As for updating existing documents, it is not viable to simply overwrite them entirely.
Consider this example AnnotatedForm in a document:

| Source | Romanized | Translation | Comment |
| --- | --- | --- | --- |
| ᎤᎨᏩᎴᏓᏃ | ugewaledano | personal name | name of Sequoyah's brother |

If we simply rewrote the document, the existing word would be deleted. Migration would then recreate the word from the spreadsheet:

| Source | Romanized | Translation | Comment |
| --- | --- | --- | --- |
| ᎤᎨᏩᎴᏓᏃ | ugewaledano | personal name | |

Notice that comments do not persist under this model. Neither would attached audio or any changes to syllabary, phonetics, etc. made in the writing environment.

One possible solution is to replace only the source, romanization, and translation fields from the spreadsheet when migration runs. However, this would still overwrite contributions made to those fields in the writing environment, which is less than ideal.
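To make the trade-off concrete, here is a small Python sketch of the two strategies discussed above. It is only an illustration under stated assumptions: the `Word` class and both functions are hypothetical and do not reflect the actual AnnotatedForm schema or migration code, and the edited translation used at the end is an invented example value.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Word:
    """Hypothetical, simplified stand-in for an AnnotatedForm."""

    source: str
    romanized: str
    translation: str
    # Added in the writing environment; not present in the spreadsheet.
    comment: Optional[str] = None


def full_overwrite(existing: Word, from_sheet: Word) -> Word:
    # Delete and recreate: everything not in the sheet is lost.
    return from_sheet


def replace_sheet_fields(existing: Word, from_sheet: Word) -> Word:
    # Replace only the fields the sheet provides; keep everything else.
    return replace(
        existing,
        source=from_sheet.source,
        romanized=from_sheet.romanized,
        translation=from_sheet.translation,
    )


existing = Word("ᎤᎨᏩᎴᏓᏃ", "ugewaledano", "personal name",
                comment="name of Sequoyah's brother")
from_sheet = Word("ᎤᎨᏩᎴᏓᏃ", "ugewaledano", "personal name")

overwritten = full_overwrite(existing, from_sheet)  # comment is gone
merged = replace_sheet_fields(existing, from_sheet)  # comment survives

# Hypothetical edit made in the writing environment to a sheet-backed field.
edited = replace(existing, translation="Ugewaledano (a personal name)")
remerged = replace_sheet_fields(edited, from_sheet)  # the edit is reverted
```

The field-level replacement keeps the comment, but any change made in the writing environment to source, romanization, or translation is still reverted to the spreadsheet value on the next migration run.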


netlify bot commented May 8, 2025

Deploy Preview for dailp canceled.

Name Link
🔨 Latest commit ee33677
🔍 Latest deploy log https://app.netlify.com/sites/dailp/deploys/681d11e6d8c3590008d096db

@GracefulLemming GracefulLemming merged commit d683430 into main May 8, 2025
5 checks passed
@GracefulLemming GracefulLemming deleted the migration-skips-existant-documents branch May 8, 2025 20:27