Feature Aggregate Depth Utility
Generate counts of reads that map to non-overlapping portions of genes
- Shaun Adkins ([email protected])
- Matthew Chung ([email protected])
Requirements:
- Julia - v0.7 or later (v1.0 or later preferred)
- ArgParse
- BioAlignments - v1.0.0
- GenomicFeatures - v1.0.0

Inputs:
- A single BAM file of reads (sorted by position)
- A GFF3-formatted annotation file
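As a quick illustration of how these inputs are consumed, here is a minimal sketch using the dependencies listed above (the file paths are placeholders and the helper functions are ours, not FADU's actual code):

```julia
using BioAlignments, GenomicFeatures

# Count alignment records in the position-sorted BAM file.
function count_alignments(bam_path)
    reader = open(BAM.Reader, bam_path)
    n = 0
    for record in reader
        n += 1
    end
    close(reader)
    return n
end

# Print the coordinates of each feature of the default --feature_type ("gene").
function list_genes(gff3_path)
    for feature in GFF3.Reader(open(gff3_path))
        GFF3.featuretype(feature) == "gene" || continue
        println(GFF3.seqid(feature), ':', GFF3.seqstart(feature), '-', GFF3.seqend(feature))
    end
end

println("alignments: ", count_alignments("sample123.bam"))
list_genes("annotation.gff3")
```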
The output is a tab-delimited file whose name is the BAM input file's basename, ending in '.counts.txt'. For example, if the BAM file is named 'sample123.bam', the output file is named 'sample123.counts.txt'.
The output file has the following fields:
- Name of feature
- Number of bases that do not overlap with other features (uniq_len)
- Number of BAM records that aligned to this feature's non-overlapping bases
  - Aligned reads not part of a fragment get 0.5 count, and fragments get 1 count
- Fractionalized feature counts (illustrated in the sketch after this list)
  - For each aligned fragment, the depth contribution is (length of the overlap between the fragment's coordinates and the feature's non-overlapping coordinates) / (length of the fragment)
  - Aligned reads that are not part of a fragment have their depth contribution cut in half
- TPM count in scientific notation
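To make the depth and TPM arithmetic concrete, here is a minimal sketch with hard-coded toy values (this is not FADU's actual code, and it assumes the standard TPM formula with uniq_len as the feature length):

```julia
# One fragment of length 200 bp overlaps 150 of a feature's
# non-overlapping bases, so its depth contribution is 150 / 200.
fragment_depth = 150 / 200              # 0.75

# A lone read with the same overlap contributes half as much.
single_read_depth = (150 / 200) / 2     # 0.375

feature_count = fragment_depth + single_read_depth   # fractionalized count

# TPM across a toy two-feature dataset: rate = count / uniq_len,
# scaled so that all TPM values sum to one million.
uniq_len = [900, 1500]
counts   = [feature_count, 40.0]
rates    = counts ./ uniq_len
tpm      = 1e6 .* rates ./ sum(rates)
```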
usage: fadu.jl -b /path/to/file.bam -g /path/to/annotation.gff3
-o /path/to/output/dir/ [-s STRANDED] [-f FEATURE_TYPE]
[-a ATTRIBUTE_TYPE] [-p] [-m MAX_FRAGMENT_SIZE]
[-q MIN_MAPPING_QUALITY] [-M] [-C CHUNK_SIZE]
[--version] [-h]
Generate counts of reads that map to non-overlapping portions of genes
optional arguments:
-b, --bam_file /path/to/file.bam
Path to BAM file (SAM is not supported).
-g, --gff3_file /path/to/annotation.gff3
Path to GFF3-formatted annotation file (GTF is
not supported).
-o, --output_dir /path/to/output/dir/
Directory to write the output.
-s, --stranded STRANDED
Indicate if BAM reads are from a
strand-specific assay. Choose between 'yes',
'no', or 'reverse'. (default: "no")
-f, --feature_type FEATURE_TYPE
Which GFF3/GTF feature type (column 3) to
obtain readcount statistics for.
Case-sensitive. (default: "gene")
-a, --attribute_type ATTRIBUTE_TYPE
Which GFF3/GTF attribute type (column 9) to
obtain readcount statistics for.
Case-sensitive. (default: "ID")
-p, --keep_only_proper_pairs
If enabled, keep only properly paired reads
when performing calculations.
-m, --max_fragment_size MAX_FRAGMENT_SIZE
If the fragment size of properly-paired reads
exceeds this value, process pair as single
reads instead of as a fragment. Setting this
value to 0 will make every fragment pair be
processed as two individual reads. Maximum
value is 65535 (to allow for use of UInt16
type, and fragments typically are not that
large). If --keep_only_proper_pairs is
enabled, then any fragment exceeding this
value will be discarded. (type: Int64,
default: 1000)
-q, --min_mapping_quality MIN_MAPPING_QUALITY
Set the minimum mapping quality score for a
read to be considered. Max value accepted is
255. (type: Int64, default: 10)
-M, --remove_multimapped
If enabled, remove any reads or fragments that
are mapped to multiple regions of the genome,
indicated by the 'NH' attribute being greater
than 1 (see the sketch after this option list).
-C, --chunk_size CHUNK_SIZE
Number of validated reads to store into memory
before processing overlaps with features.
(type: Int64, default: 10000000)
--version show version information and exit
-h, --help show this help message and exit
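The 'NH' test performed by --remove_multimapped can be sketched as follows (a minimal illustration; the helper name is ours, and it assumes BioAlignments' BAM record aux-tag accessors):

```julia
using BioAlignments

# A read or fragment counts as multimapped when the optional 'NH' tag
# (number of reported alignments) is present and greater than 1.
function is_multimapped(record::BAM.Record)
    return haskey(record, "NH") && record["NH"] > 1
end
```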
When processing the input BAM file, any reads that are not primary mappings are discarded, and reads that are not properly paired are also discarded when --keep_only_proper_pairs is enabled. Both properties are determined from the SAM-format bit flag.
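A minimal sketch of that bit-flag test, using BioAlignments' SAM flag constants (the helper name and keyword argument are illustrative, not FADU's actual code):

```julia
using BioAlignments

# Keep only primary mappings; optionally require proper pairing as well.
function passes_flag_checks(record::BAM.Record; only_proper_pairs::Bool=false)
    flag = BAM.flag(record)
    flag & SAM.FLAG_SECONDARY != 0 && return false        # not a primary mapping
    if only_proper_pairs
        flag & SAM.FLAG_PROPER_PAIR == 0 && return false  # not properly paired
    end
    return true
end
```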
FADU currently does not support eukaryotic samples, or at least alignments produced by spliced aligners such as TopHat or HISAT2.
A safe way to pass values to the options is to enclose string arguments (such as file paths or the strandedness setting) in quotation marks, which ensures Julia's argument parser reads each parameter as an option value rather than as a command.
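For example (the paths are placeholders):

fadu.jl -b "/path/to/file.bam" -g "/path/to/annotation.gff3" -o "/path/to/output/dir/" -s "reverse"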
Currently, only fragment depth is supported. Calculations using read depth may be implemented in the future.
If you have a question or suggestion about FADU, feel free to e-mail either of the authors listed above, or create an issue here.