Implement index returns filenames instead of read_ids for format_chromap #67

detrout · 2025-03-21T22:29:05Z

Adds tests for format_kallisto and format_chromap to make sure they return the expected output, then changes format_chromap to return the filenames instead of read_ids.

This might not be the most elegant solution, but it did work.

fixes #64

Ideally this should be merged after the unit test fix pull request, but it's a separate pull request because it's logically different.


Some of our read_ids are just an accession and some are accession.fastq.gz, which causes confusion when generating the tool command lines. I saved some example seqspecs so the tests can check that index generates the correct commands with either type of read id. To get this to work I used importlib.resources, which required making my test/data directory look like a Python module.
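A minimal sketch of the importlib.resources approach described above, not the code in this pull request. The helper name `read_test_spec` and the package and file names in the usage note are hypothetical; the only requirement is that the data directory contains an `__init__.py` so it can be imported as a package.

```python
from importlib import resources


def read_test_spec(package: str, name: str) -> str:
    """Return the text of a data file shipped inside an importable package.

    `package` is a dotted module path whose directory holds an __init__.py
    (hypothetical here); `name` is a file inside that directory.
    """
    return resources.files(package).joinpath(name).read_text(encoding="utf-8")
```

A test could then call something like `read_test_spec("tests.data", "example.yaml")` (both names are placeholders) without depending on the current working directory.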


The original code operated on read_ids. With the commit that passes the parsed seqspec object around, I could take the read_ids and look up the fastq filenames. As a convenience I also added a get_filenames method to the Read object. It's untested, but in theory, if there are multiple File objects, you'll get a comma-separated list of the filenames for the command line.
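The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this pull request: `SketchFile`, `SketchRead`, and `filenames_for` are hypothetical stand-ins, and the real seqspec objects carry more fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchFile:
    # Hypothetical stand-in for a file entry attached to a read.
    filename: str


@dataclass
class SketchRead:
    # Hypothetical stand-in for a read entry in a parsed spec.
    read_id: str
    files: List[SketchFile] = field(default_factory=list)

    def get_filenames(self) -> str:
        # Join every attached filename with commas for use on a command line.
        return ",".join(f.filename for f in self.files)


def filenames_for(reads: List[SketchRead], read_id: str) -> str:
    # Find the read with the given id and return its filenames.
    for read in reads:
        if read.read_id == read_id:
            return read.get_filenames()
    raise KeyError(read_id)
```

With this shape, a read_id that is a bare accession and one that already looks like accession.fastq.gz both resolve through the same lookup, since the command line is built from the attached filenames rather than from the id itself.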

mingjiecn and others added 14 commits November 1, 2024 15:42

update schema (pachterlab#52)

2e19173

update file_exsits function to check file url in igvf portal (pachterlab#53)

289e7e3

adding seqspec spec tokenization

2a5df33

allow https for remote onlist (pachterlab#54)

e3a6dea

Update seqspec check so we can run it directly in python script (pachterlab#58)

8e9554f

* update seqspec check
* add spec parameter back to check function

added python usage to docs

a280567

support gzipped yaml file for function load_spec (pachterlab#60)

c9520b4

* support gzipped yaml file for function load_spec
* fix bug in function run_check
* support gzipped yaml file for function load_spec

enabled skipping checks with seqspec check

1ea7239

updating seqspec-html to print read info

1e1abed

Return the generated string for testing

7a82f09

Update the test atac seqspecs to 0.3.0

171e16c

Loading a 0.2 yaml will work but some of the attributes will be missing, and this causes all sorts of problems for the seqspec display code

Pass the spec object to the index format functions

aee7082

I can then use it to look up additional information such as the filenames. This makes it much easier to resolve pachterlab#64
