Implement index returns filenames instead of read_ids for format_chromap #67

Open
wants to merge 14 commits into
base: main

@detrout (Contributor) commented Mar 21, 2025

Adds tests for format_kallisto and format_chromap to make sure they return the expected output, and then changes format_chromap to return filenames instead of read_ids.

This might not be the most elegant solution, but it did work.

fixes #64

Ideally this should be merged after the unit test fix pull request, but it's a separate pull request because it's logically different.

mingjiecn and others added 14 commits November 1, 2024 15:42
…terlab#58)

* update seqspec check

* add spec parameter back to check function
* support gzipped yaml file for function load_spec

* fix bug in function run_check

* support gzipped yaml file for function load_spec
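
A minimal sketch of what supporting a gzipped yaml file could look like, assuming the choice is made on the `.gz` suffix. The helper name `open_spec_text` is hypothetical and is not the actual `load_spec` code; the returned text handle would then be handed to whatever YAML loader the project uses.

```python
import gzip
from pathlib import Path
from typing import IO, Union


def open_spec_text(path: Union[str, Path]) -> IO[str]:
    """Open a spec file as text, transparently handling a .gz suffix.

    Hypothetical helper: the real load_spec may detect compression differently.
    """
    path = Path(path)
    if path.suffix == ".gz":
        # "rt" makes gzip.open return a text-mode handle, like open() does.
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```

Checking the suffix keeps the sketch short; sniffing the gzip magic bytes would be an alternative if files are not reliably named.
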
We have some read_ids that are just accession and some that are
accession.fastq.gz and this causes confusion for generating the tool
command lines.

I thought I should save some example seqspecs and test that index
generates the correct commands with either type of read id.

Also to get this to work I decided to use importlib.resources which
required making my test/data directory look like a python module.
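
For reference, a rough sketch of reading a packaged test file with importlib.resources. The function name is hypothetical, and the package and file names are whatever the caller passes in; the only requirement assumed here is that the data directory is importable (for example, it has an `__init__.py`).

```python
from importlib.resources import files


def read_example_spec(package: str, name: str) -> str:
    """Return the text of a data file that lives inside an importable package.

    Hypothetical helper: `package` is the dotted name of the test data
    package and `name` is a file inside it.
    """
    # files() returns a Traversable rooted at the package directory.
    return files(package).joinpath(name).read_text(encoding="utf-8")
```

This needs Python 3.9 or newer for `importlib.resources.files`.
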
Loading a 0.2 yaml will work, but some of the attributes will be
missing, and this causes all sorts of problems for the seqspec display
code.

I can then use it to look up additional information such as the
filenames.

This makes it much easier to resolve
pachterlab#64
Also, as a convenience, I added a get_filenames method to the Read object.
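
Roughly, the convenience method could look like the sketch below. The `File` and `Read` classes here are simplified stand-ins written for illustration, not the real seqspec classes, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class File:
    # Simplified stand-in: only the attribute this sketch needs.
    filename: str


@dataclass
class Read:
    # Simplified stand-in for a read that owns a list of File objects.
    read_id: str
    files: List[File] = field(default_factory=list)

    def get_filenames(self) -> List[str]:
        """Return the filename of every File attached to this read."""
        return [f.filename for f in self.files]


# Made-up example values.
read = Read("read1", [File("read1.fastq.gz")])
print(read.get_filenames())  # prints ['read1.fastq.gz']
```
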

The original code operated on read_ids. With the commit that added
passing around the parsed seqspec object, I could then take the
read_ids and look up the fastq filenames.

It's untested, but in theory, if there are multiple File objects, you'll
get a comma-separated list of the filenames for the command line.
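
To illustrate the intended behaviour, here is a small sketch of turning read_ids into command-line arguments. It assumes a plain dict from read_id to filenames in place of the parsed seqspec object; the function name and the example filenames are hypothetical, and raising on an unknown read_id is a choice made for this sketch only.

```python
from typing import Dict, List


def filename_args(read_ids: List[str],
                  files_by_read_id: Dict[str, List[str]]) -> List[str]:
    """Return one argument per read_id, joining multiple files with commas."""
    args = []
    for read_id in read_ids:
        filenames = files_by_read_id.get(read_id)
        if not filenames:
            # Assumption: fail loudly rather than fall back to the read_id.
            raise KeyError(f"no files found for read_id {read_id!r}")
        args.append(",".join(filenames))
    return args


# Made-up example: R1 is split across two lanes, R2 is a single file.
files_by_read_id = {
    "R1": ["lane1_R1.fastq.gz", "lane2_R1.fastq.gz"],
    "R2": ["lane1_R2.fastq.gz"],
}
args = filename_args(["R1", "R2"], files_by_read_id)
print(args)  # prints ['lane1_R1.fastq.gz,lane2_R1.fastq.gz', 'lane1_R2.fastq.gz']
```
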
Development

Successfully merging this pull request may close these issues.

the generated chromap tool command line from seqspec index should probably use filenames instead of read ids
3 participants