Issue with EDAM ontologies file extension #3511
Comments
It's worth noting that even though we removed many false positives (e.g., |
Changing retrieval of file extension from EDAM - Referred at: #3511
Thanks, merged upstream in EDAM. Please post more if needed. Just one warning: the EDAM.tsv is not (at all) up to date with the EDAM.owl, as we're currently lacking that part of the process. Please let us know if this is a showstopper for you, and we can consider prioritising it. Otherwise, please let us know if/when you need the http://edamontology.org/EDAM.owl updated with these file extensions (it is not continuously deployed after each merge to main).
Thanks @matuskalas! The tool used during the linting process adds EDAM links for the file formats stated in the module specifications (details: https://nf-co.re/blog/2025/modules-ontology). Contributors need to review the additions anyway, but since many of them might not be familiar with EDAM, any initial automation helps. So, nothing critical, but having a more up-to-date TSV would be beneficial and time-saving. Thanks for the prompt response, and let us know if there is anything we could help with as well.
Description of the bug
I am checking the loading process for the automatic assignment of EDAM ontologies here:
tools/nf_core/modules/modules_utils.py
Line 104 in 3ee0bea
Currently, the extension is retrieved and processed from the second column [1] of EDAM.tsv. I would rather use the field intended for that, the 15th [14]. With the present assignment, some of the resulting file extensions are not used in real life (e.g., uniprotkb). As a drawback, some extensions may appear several times, e.g., json. In these cases, I would favour assigning the most generic one, which I would hope is the first one created in the dictionary (so no overwriting would be allowed).
This will lead to fewer automatic assignments of file types, but more precise ones. Users should be advised to curate manually when a more specific EDAM mapping might exist for a given file extension such as JSON (e.g., http://edamontology.org/format_3970).
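To make the proposal concrete, here is a minimal sketch of the lookup I have in mind. It is not the current code in modules_utils.py: the function names build_extension_map and make_row are hypothetical, the "|" separator between extensions within the field is an assumption, and the sample rows are made up for illustration rather than taken from EDAM.tsv.

```python
import csv
import io

EXT_COLUMN = 14      # 15th column [14], assumed to hold the file extensions
EXT_SEPARATOR = "|"  # assumed separator between several extensions in one field


def build_extension_map(tsv_text):
    """Map each file extension to the first (IRI, label) pair that declares it."""
    mapping = {}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip rows that are too short to contain the extension column.
        if len(fields) <= EXT_COLUMN:
            continue
        iri, label = fields[0], fields[1]
        # Keep only "format" classes (this also skips the header row).
        if not iri.rsplit("/", 1)[-1].startswith("format"):
            continue
        for ext in fields[EXT_COLUMN].split(EXT_SEPARATOR):
            ext = ext.strip().lower()
            if ext:
                # setdefault keeps the first entry, so later rows never overwrite it.
                mapping.setdefault(ext, (iri, label))
    return mapping


def make_row(iri, label, extensions):
    """Build an illustrative 15-column TSV row (all other columns left empty)."""
    return "\t".join([iri, label] + [""] * (EXT_COLUMN - 2) + [extensions])


# Made-up sample rows; real IRIs and labels would come from EDAM.tsv.
sample_tsv = "\n".join(
    [
        make_row("http://example.org/format_generic_json", "generic JSON", "json"),
        make_row("http://example.org/format_specific_json", "specific JSON", "json|sjson"),
        make_row("http://example.org/data_something", "not a format", "txt"),
    ]
)

extension_map = build_extension_map(sample_tsv)
```

With these rows, json resolves to the first (generic) entry, sjson to the more specific one, and txt is left unassigned because its row is not a format class.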
Command used and terminal output
System information
No response