Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CZID-8857] Add ConsensusGenomes to seed script #122

Merged
merged 7 commits into from
Nov 17, 2023
Merged

[CZID-8857] Add ConsensusGenomes to seed script #122

merged 7 commits into from
Nov 17, 2023

Conversation

j-x-han
Copy link
Contributor

@j-x-han j-x-han commented Nov 16, 2023

Ticket: CZID-8857

  • Creates some CG runs in the seed script
  • Tweaked the codegen for types to use the proper field name when getting the relationship from the mapper
  • Added migration that makes all file columns optional
    • When the columns were required, I was getting sqlalchemy.exc.IntegrityError: (psycopg.errors.NotNullViolation) null value in column "file_id" of relation "reference_genome" violates not-null constraint
    • By the same logic as this commit, we probably want to allow files to be nullable (so we can create entity -> upload file -> link file to entity)
  • Updated FileFactory.update_file_ids so all file columns are populated

Example of querying CG data generated from seed.py:
Screenshot 2023-11-16 at 4 30 59 PM

@j-x-han j-x-han changed the title [WIP] [CZID-8857] Add ConsensusGenomes to seed script [CZID-8857] Add ConsensusGenomes to seed script Nov 17, 2023
@j-x-han j-x-han marked this pull request as ready for review November 17, 2023 00:54
@j-x-han j-x-han requested review from robertaboukhalil and jgadling and removed request for robertaboukhalil November 17, 2023 00:54
Comment on lines +71 to +85
# For each file, find the entity associated with it
# and update the file_id for that entity.
files = session.query(File).all()
for file in files:
if file.entity_id:
entity_field_name = file.entity_field_name
entity = session.query(Entity).filter(Entity.id == file.entity_id).first()
if entity:
entity_name = entity.type
session.execute(
sa.text(
f"""UPDATE {entity_name} SET {entity_field_name}_id = file.id
FROM file WHERE {entity_name}.entity_id = file.entity_id""",
)
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely open to suggestions on how to improve this 😓

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Copy link
Contributor

@robertaboukhalil robertaboukhalil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a question about plural_snake_name

Comment on lines +71 to +85
# For each file, find the entity associated with it
# and update the file_id for that entity.
files = session.query(File).all()
for file in files:
if file.entity_id:
entity_field_name = file.entity_field_name
entity = session.query(Entity).filter(Entity.id == file.entity_id).first()
if entity:
entity_name = entity.type
session.execute(
sa.text(
f"""UPDATE {entity_name} SET {entity_field_name}_id = file.id
FROM file WHERE {entity_name}.entity_id = file.entity_id""",
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@@ -80,11 +80,11 @@ async def load_{{ related_field.related_class.snake_name }}_rows(
dataloader = info.context["sqlalchemy_loader"]
mapper = inspect(db.{{ cls.name }})
{%- if related_field.multivalued %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change? Just curious how it still works for plural entities like this one 😅 🪄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just pulls the related_field's name as it's defined in the schema; so for the example you linked, the class is Sample and the related_field we're looking at is sequencing_reads!

I'm making the assumption here that since the field is multivalued, the name defined in the schema is plural.

@j-x-han j-x-han merged commit 7611d8b into main Nov 17, 2023
@j-x-han j-x-han deleted the j-x-han/seed branch November 17, 2023 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants