Interpreting results

Bogdan Kirilenko edited this page Jan 12, 2024 · 20 revisions

Everything related to processing and interpreting TOGA results goes here. This page is a work in progress.

TOGA output naming convention

${transcript_ID_in_reference}.${chain_ID_used_for_annotation}

For example, if TOGA identifies an orthologous chain with ID 1 for a reference transcript called ENST00000123456, the annotated transcript in the query gets the ID ENST00000123456.1.
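The naming convention above can be reversed to recover the reference transcript and the chain from a query ID. The snippet below is a minimal sketch, not part of TOGA: the function name `split_toga_id` is hypothetical, and it assumes that chain IDs are plain integers. It splits on the last dot, because reference transcript IDs may themselves contain dots (for example, version suffixes).

```python
from typing import Tuple


def split_toga_id(query_id: str) -> Tuple[str, int]:
    """Split '<transcript_ID>.<chain_ID>' into its two parts.

    Hypothetical helper: assumes the chain ID is the integer after the LAST dot.
    """
    transcript_id, sep, chain_id = query_id.rpartition(".")
    if not sep or not transcript_id or not chain_id.isdigit():
        raise ValueError(f"not a <transcript>.<chain> ID: {query_id!r}")
    return transcript_id, int(chain_id)


print(split_toga_id("ENST00000123456.1"))  # ('ENST00000123456', 1)
```

Splitting on the first dot instead would cut a versioned ID such as ENST00000123456.3.27 in the wrong place, which is why `rpartition` is used here.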

Selected topics

Interpreting gene loss classification

relevant issues

transcript vs gene losses

Plotting inactivating mutations

Potentially worth a separate page. (https://github.com/hillerlab/TOGA/issues/45)

Interpreting rejection reasons

Sometimes TOGA cannot process a reference transcript, for a variety of reasons. For example, the reference transcript may have no alignment to the query genome, or it may be corrupted (e.g. the length of the reading frame is not a multiple of 3).

These cases should be documented in this section.
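As an illustration of the "reading frame is not 3N" case, a reference coding sequence can be checked before running TOGA. This is a minimal sketch under stated assumptions, not TOGA's own validation code: the function name `has_3n_frame` is hypothetical, and the input is assumed to be the concatenated coding sequence of one transcript.

```python
def has_3n_frame(cds: str) -> bool:
    """Return True if the CDS is non-empty and its length is a multiple of 3.

    Hypothetical pre-check; TOGA's actual rejection logic may differ.
    """
    return len(cds) > 0 and len(cds) % 3 == 0


print(has_3n_frame("ATGGCCTAA"))  # True: 9 nucleotides, 3 codons
print(has_3n_frame("ATGGCCTA"))   # False: 8 nucleotides
```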

related issues:

term definition in file gene_rejaction_reasons.tsv

too long query locus - a technical reason

Interpreting codon.fasta

relevant issues

Query annotation fasta

Everything related to the query annotation which is produced by TOGA.

relevant issue 1

Output genes and isoforms

relevant issue: (https://github.com/hillerlab/TOGA/issues/58)

Filtering bed output

See Michael's reply here

More detailed sections

Using TOGA results as input for other tools

Evidence modeller

  • to be filled

Output fasta for proteomics

Use this issue to fill the section: (https://github.com/hillerlab/TOGA/issues/85)