My issue is very similar to #9 (comment)
Parsing BAM file: chr22_alignments.sorted.bam
Identified 182998 introns
Annotated introns file /ei/projects/8/8289c66d-2d56-4706-a307-5a9a3eb3747e/data/Annotations/gencode.v44.annotated_juncs.bed provided
Identified 402454 annotated introns
debug: Tree structure:
debug: |--- jad <= 71.50
debug: | |--- class: 0
debug: |--- jad > 71.50
debug: | |--- is_canonical_motif <= 0.50
debug: | | |--- class: 0
debug: | |--- is_canonical_motif > 0.50
debug: | | |--- class: 0
debug: Decision tree 1 confusion matrix:
debug: [[177013 0]
debug: [ 5985 0]]
Fetching junction sequences from /ei/projects/3/31655266-640a-41d2-8663-59bba38bc3c4/data/data/References/hg38_sequin.fa
Identified 132451 unique donors and 127498 unique acceptors
Scoring donor sequences with LR...
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
pgrep: /nbi/software/production/bin/core/../..//hpccore/5/x86_64/lib/liblzma.so.5: no version information available (required by /lib64/libsystemd.so.0)
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 436, in _process_worker
r = call_item()
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 288, in __call__
return self.fn(*self.args, **self.kwargs)
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 595, in __call__
return self.func(*args, **kwargs)
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/lib2pass/seqlr.py", line 39, in train_and_predict
lr.fit(X_train, y_train)
File "/ei/software/testing/python_miniconda/4.10.3_py3.9_sk/x86_64/envs/2passtools/lib/python3.6/site-packages/sklearn/linear_model/_logistic.py", line 1376, in fit
" class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0
"""
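Reading the debug output above: the confusion matrix for decision tree 1 has nothing in its second column. If the rows are true classes and the columns are predicted classes (the scikit-learn convention, assumed here rather than confirmed for lib2pass), the tree predicted class 0 for every intron. If those predictions are what gets passed to the LR step as y_train (again an assumption about the internals), that would match the "only one class: 0" ValueError. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
# Confusion matrix copied from the debug log above.
# Assumption: rows are true classes, columns are predicted classes
# (the scikit-learn convention); not confirmed for lib2pass.
confusion = [[177013, 0],
             [5985, 0]]


def summarize(cm):
    """Return (true class counts, predicted class counts) for a square matrix."""
    true_counts = [sum(row) for row in cm]
    predicted_counts = [sum(col) for col in zip(*cm)]
    return true_counts, predicted_counts


true_counts, predicted_counts = summarize(confusion)
total = sum(true_counts)
positive_fraction = true_counts[1] / total
# Classes that received at least one prediction from the tree.
predicted_classes = [c for c, n in enumerate(predicted_counts) if n > 0]

print("total introns:", total)
print("true class counts:", true_counts)
print("predicted class counts:", predicted_counts)
print("fraction labelled positive: %.4f" % positive_fraction)
print("classes actually predicted:", predicted_classes)
```

The total matches the 182998 introns identified from the BAM file, and only about 3% of them fall in class 1, so the training labels are heavily imbalanced.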
I followed the instructions and then ran 2passtools with DEBUG on.
paftools.js gff2bed -j gencode.v44.annotation.gtf > gencode.v44.annotated_juncs.bed
2passtools score -v DEBUG -f /ei/projects/3/31655266-640a-41d2-8663-59bba38bc3c4/data/data/References/hg38_sequin.fa -p 24 \
-a /ei/projects/8/8289c66d-2d56-4706-a307-5a9a3eb3747e/data/Annotations/gencode.v44.annotated_juncs.bed --classifier-type decision_tree \
-m "GTAG|GCAG|ATAG" -j 4 --keep-all-annot -o iPSC.merged.juncs.all.bed $subset_bam
head -n 5 /ei/projects/8/8289c66d-2d56-4706-a307-5a9a3eb3747e/data/Annotations/gencode.v44.annotated_juncs.bed
chr1 12227 12612 ENST00000456328.2|lncRNA|DDX11L2 1000 +
chr1 12721 13220 ENST00000456328.2|lncRNA|DDX11L2 1000 +
chr1 12057 12178 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
chr1 12227 12612 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
chr1 12697 12974 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
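Since the head output only shows chr1 entries while the BAM is restricted to chr22, one sanity check is to count annotated junctions per chromosome. This is a rough sketch, not part of 2passtools: the function name is hypothetical, and the sample lines are simply the five rows shown above.

```python
from collections import Counter


def junctions_per_chrom(lines):
    """Count BED records per chromosome (first whitespace-separated column)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[line.split()[0]] += 1
    return counts


# The five rows from the head output above, used only as sample input.
sample = """\
chr1 12227 12612 ENST00000456328.2|lncRNA|DDX11L2 1000 +
chr1 12721 13220 ENST00000456328.2|lncRNA|DDX11L2 1000 +
chr1 12057 12178 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
chr1 12227 12612 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
chr1 12697 12974 ENST00000450305.2|transcribed_unprocessed_pseudogene|DDX11L1 1000 +
"""

counts = junctions_per_chrom(sample.splitlines())
print(dict(counts))
print("chr22 junctions in sample:", counts.get("chr22", 0))
```

To run it on the full annotation, pass an open file handle instead of the sample, for example junctions_per_chrom(open(path)) with path pointing at the annotated_juncs.bed file.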
Could this have something to do with my canonical motifs? Also, my JAD is set to 4 (-j 4), but the tree structure reports jad <= 71.50. Is this expected?
Kind regards,
Sofia