BEDTools
BEDTools is a suite of open-source utilities for analyzing genomic sequence and coverage from various file types, including BED, SAM and BAM files. Although AltAnalyze can now directly process BAM files to produce BED files, the use of BEDTools may be more efficient when processing dozens to hundreds of BAM files. BEDTools can be easily compiled on Unix, Linux and Mac OS X operating systems.
See the BEDTools documentation for more information.
After installing BEDTools, AltAnalyze users will need to call the utility bamToBed (recognized on Unix systems once the BEDTools location has been added to the PATH in the local or global .bashrc file). The file accepted_hits.bam is produced by each TopHat run, in the same output directory as the junction BED file.
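Before running the commands on this page, it can help to confirm that the shell actually finds both utilities. The following is a minimal sketch; the function name check_bedtools and the install path shown in the comment are hypothetical and should be adapted to your own setup.

```shell
# Assumption: BEDTools was compiled in $HOME/BEDTools (hypothetical path).
# A line such as the following in ~/.bashrc makes its binaries visible:
#   export PATH="$HOME/BEDTools/bin:$PATH"

# Report whether the two utilities used on this page can be found.
# Returns 0 when both are available, 1 otherwise.
check_bedtools() {
    local tool missing=0
    for tool in bamToBed coverageBed; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_bedtools in a new terminal lists any utility that is not on the PATH and returns a non-zero status in that case.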
In the examples below, the exon BED file passed to coverageBed with -b (Hs_cancer_exons.bed in these commands; the name will reflect your own dataset, e.g., "hESC_differentiation_exons.bed") is produced by AltAnalyze prior to running BEDTools (see instructions here). It contains all known mRNA exon region coordinates from Ensembl/UCSC and all novel exon coordinates indicated by the TopHat junction BED results. These methods should work equivalently for BAM files not produced by TopHat; however, additional sorting of the BAM file may be required (e.g., with SAMtools).
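If a BAM file from another aligner does need sorting, one possible approach is sketched here. It assumes samtools 1.3 or newer (where "sort -o OUT IN" is accepted) and a coordinate sort, which is the samtools default; whether your aligner's output needs this step at all is not stated on this page. The function name sort_bam and the ".sorted.bam" suffix are illustrative choices.

```shell
# Sort a BAM file by coordinate (the samtools default) before conversion.
# Assumes samtools 1.3 or newer; older releases use a different syntax.
# Prints the name of the sorted file on success.
sort_bam() {
    local in_bam="$1"
    local out_bam="${in_bam%.bam}.sorted.bam"
    samtools sort -o "$out_bam" "$in_bam" && echo "$out_bam"
}
```

The sorted file can then be given to bamToBed in place of accepted_hits.bam.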
Build Exon BED file from BAM
For a Single BAM File
bamToBed -i accepted_hits.bam -split | coverageBed -a stdin \
  -b /home/user/BAMtoBED/Hs_cancer_exons.bed \
  > /home/user/RNASeqStudy/Sample1/day0_s1__exons.bed
For Many BAM Files (one per folder)
for f in */accepted_hits.bam; do
  # Name each output after the folder that holds the BAM file
  parentdir=$(dirname "$f")
  parentdirname=$(basename "$parentdir")
  bamToBed -i "$f" -split | coverageBed -a stdin \
    -b Hs_cancer_exons.bed > "${parentdirname}__exon.bed"
done
This loops through every folder in the current directory, finds its accepted_hits.bam file, and names the resulting exon BED file "folder_name"__exon.bed. A single command therefore produces all of the exon BED files.
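Since the loop writes one file per folder, it may be worth previewing the output names before starting a long run. The sketch below applies the same dirname/basename naming rule without calling BEDTools; the function names exon_bed_name and preview_exon_beds are hypothetical helpers, not part of AltAnalyze or BEDTools.

```shell
# Derive the output name used by the loop: "<folder_name>__exon.bed".
exon_bed_name() {
    local parentdir
    parentdir=$(dirname "$1")
    echo "$(basename "$parentdir")__exon.bed"
}

# Print each BAM file found one level down and the exon BED file
# that would be written for it. No BEDTools command is run.
preview_exon_beds() {
    local f
    for f in */accepted_hits.bam; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        echo "$f -> $(exon_bed_name "$f")"
    done
}
```

Running preview_exon_beds from the directory that contains the sample folders prints one line per BAM file, which makes duplicate or unexpected folder names easy to spot.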