Add a concept_id column to drug and gene claims #573

Open
jsstevenson opened this issue Feb 26, 2025 · 1 comment
Assignees
Labels
backend Changes to the backend only

Comments

@jsstevenson
Contributor

Some ingested resources provide concept IDs instead of, or in addition to, drug and gene names. We handle this inconsistently, e.g. in whether the label or the concept ID ends up as an alias.

I think it'd be preferable to parse out the concept ID when it's available. This would help grouping (better to search by concept ID first). If a claim doesn't provide a label of its own, we could just copy the ID over to the label as well.
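A minimal sketch of that parsing rule, in Python for illustration only. The `Claim` class, the `build_claim` helper, and the field names are hypothetical and are not taken from the existing codebase; the only behavior assumed is what is described above (keep the concept ID in its own field, and copy it into the label when no name is given).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record with a dedicated concept ID field."""
    label: str
    concept_id: Optional[str]


def build_claim(name: Optional[str], concept_id: Optional[str]) -> Claim:
    """Build a claim, falling back to the concept ID as the label."""
    # Treat empty or whitespace-only values as missing.
    name = (name or "").strip() or None
    concept_id = (concept_id or "").strip() or None
    if name is None and concept_id is None:
        raise ValueError("a claim needs a name or a concept ID")
    # Copy the ID into the label only when the source gave no name.
    label = name if name is not None else concept_id
    return Claim(label=label, concept_id=concept_id)
```

With this shape, a source row that gives only an ID still produces a displayable label, and a row that gives both keeps them in separate fields rather than demoting one to an alias.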

This becomes relevant in cases like this: #567. Here, an NCI row simply provides the drug name "ADM" and NCIt concept code "C1326". We store the name as the name of the claim and the concept code as an xref. This becomes problematic during normalization because "ADM" is ambiguous (acellular dermal matrix vs adriamycin).
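To illustrate why searching by concept ID first would help here, the following Python sketch resolves a claim against two lookup tables. Everything in it is an assumption for illustration: the tables, the `resolve_group` helper, and the opaque group names (`group-1`, `group-2`) are hypothetical and do not describe the current grouper or say which concept "C1326" actually denotes.

```python
from typing import Optional

# Hypothetical lookup tables. Group names are opaque placeholders.
GROUPS_BY_CONCEPT_ID = {"C1326": "group-1"}
# "ADM" maps to two candidate groups, mirroring the ambiguity above.
GROUPS_BY_NAME = {"adm": ["group-1", "group-2"]}


def resolve_group(label: str, concept_id: Optional[str] = None) -> Optional[str]:
    """Return a group for the claim, or None if it cannot be resolved."""
    # Try the concept ID first, since it should be unambiguous.
    if concept_id is not None:
        group = GROUPS_BY_CONCEPT_ID.get(concept_id)
        if group is not None:
            return group
    # Fall back to the name, but only accept a unique match.
    candidates = GROUPS_BY_NAME.get(label.lower(), [])
    if len(candidates) == 1:
        return candidates[0]
    return None
```

In this sketch, `resolve_group("ADM", "C1326")` resolves through the concept ID, while `resolve_group("ADM")` alone returns `None` because the name matches more than one group.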

Other possibilities:

  • Always make the concept ID the label if available. This would obviate the need for a schema change, but I think it's less ideal if we want to make UI views for claims (which I think we should; this was a feature in the original UI, if I'm not mistaken).
@jsstevenson jsstevenson added the backend Changes to the backend only label Feb 26, 2025
@mcannon068nw
Contributor

As mentioned in our discussion, I think this is a good idea, but we should probably do a test run first to see how it impacts grouping. Also, we should consider a name other than concept_id for this column, to avoid confusion and differentiate it a little more. Maybe source_concept_id?
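One possible way to summarize such a test run, sketched in Python under stated assumptions: grouping is run once without and once with the new column, and each run yields a mapping from claim ID to group ID (or `None` when ungrouped). The function name and the mapping format are hypothetical.

```python
from typing import Dict, Optional


def summarize_grouping_changes(
    before: Dict[str, Optional[str]],
    after: Dict[str, Optional[str]],
) -> Dict[str, int]:
    """Count how claim-to-group assignments differ between two runs."""
    summary = {
        "unchanged": 0,
        "newly_grouped": 0,
        "newly_ungrouped": 0,
        "regrouped": 0,
    }
    # Consider every claim seen in either run.
    for claim_id in before.keys() | after.keys():
        old = before.get(claim_id)
        new = after.get(claim_id)
        if old == new:
            summary["unchanged"] += 1
        elif old is None:
            summary["newly_grouped"] += 1
        elif new is None:
            summary["newly_ungrouped"] += 1
        else:
            summary["regrouped"] += 1
    return summary
```

The `regrouped` and `newly_ungrouped` counts are the ones most worth reviewing by hand, since they are where the new column changes an existing assignment.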

Projects
None yet
Development

No branches or pull requests

2 participants