|
1 |
| -Version 2.18.2 (16-Dec-2013) |
| 1 | +Version 2.19.1 (6-Mar-2014) |
| 2 | +Fixed a bug in intersect that caused BAM footers to be erroneously written when -b is BAM. |
| 3 | + |
| 4 | +Speedup for the map tool. |
| 5 | +http://bedtools.readthedocs.org/en/latest/_images/map-speed-comparo.png |
| 6 | + |
| 7 | +Map tool now allows multiple columns and operations in a single run. |
| 8 | +http://bedtools.readthedocs.org/en/latest/content/tools/map.html#multiple-operations-and-columns-at-the-same-time |
| 9 | + |
| 10 | + |
| 11 | +Version 2.19.0 (8-Feb-2014) |
| 12 | +Bug Fixes |
| 13 | +========= |
| 14 | + |
| 15 | +1. Fixed a long-standing bug in which the number of base pairs of overlap was incorrectly calculated when using the -wo option with the -split option. Thanks to many for reporting this. |
| 16 | + |
| 17 | +2. Fixed a bug in which certain flavors of unmapped BAM alignments were incorrectly rejected in the latest 2.18.* series. Thanks very much to |
| 18 | +Gabriel Pratt. |
| 19 | + |
| 20 | + |
| 21 | +Enhancements |
| 22 | +============ |
| 23 | + |
| 24 | +1. Substantially reduced memory usage, especially when dealing with unsorted data. Memory usage ballooned in the 2.18.* series owing to the default buffer sizes we were using in a custom string class. We have adjusted this, and memory usage has returned to 2.17.* levels while maintaining the speed increases. Thanks so much to Ian Sudberry for rightfully complaining about this! |
| 25 | + |
| 26 | + |
| 27 | +New features |
| 28 | +============ |
| 29 | + |
| 30 | +1. The latest version of the "map" function is ~3X faster than the one available in versions 2.17 and 2.18. |
| 31 | + |
| 32 | +# bedtools 2.17 |
| 33 | +$ time bedtools map \ |
| 34 | + -a hg19.gerp.elements.bed.gz \ |
| 35 | + -b hg19.rmsk.bed.gz \ |
| 36 | + -c 4 \ |
| 37 | + -o collapse > /dev/null |
| 38 | +real 0m15.865s |
| 39 | +user 0m15.815s |
| 40 | +sys 0m0.040s |
| 41 | + |
| 42 | + |
| 43 | +# bedtools 2.19 |
| 44 | +$ time bedtools map \ |
| 45 | + -a hg19.gerp.elements.bed.gz \ |
| 46 | + -b hg19.rmsk.bed.gz \ |
| 47 | + -c 4 \ |
| 48 | + -o collapse > /dev/null |
| 49 | +real 0m5.367s |
| 50 | +user 0m5.314s |
| 51 | +sys 0m0.050s |
| 52 | + |
| 53 | +2. The map function now supports the "-split" option, as well as "absmin" and "absmax" operations. |
| 54 | + |
| 55 | +3. In addition, it supports multiple chromosome sorting criteria via a genome file that defines the expected chromosome order. Here is an example of how to run map with datasets whose chromosomes are sorted in "version" order, as opposed to the lexicographical chrom order that is the norm. |
| 56 | + |
| 57 | +# version sort the BED files (e.g. chr1, chr2, etc., not chr1, chr10, chr11, etc.) |
| 58 | +$ zcat hg19.gerp.elements.bed.gz | sort -k1,1V -k2,2n > hg19.gerp.versionsorted.bed |
| 59 | +$ zcat hg19.rmsk.bed.gz | sort -k1,1V -k2,2n > hg19.rmsk.versionsorted.bed |
| 60 | + |
| 61 | +# make a toy genome file |
| 62 | +$ cut -f 1 hg19.rmsk.versionsorted.bed | uniq | awk '{print $1"\t"1}' > hg19.versionsorted.genome |
| 63 | + |
| 64 | +$ head hg19.versionsorted.genome |
| 65 | +chr1 1 |
| 66 | +chr1_gl000191_random 1 |
| 67 | +chr1_gl000192_random 1 |
| 68 | +chr2 1 |
| 69 | +chr3 1 |
| 70 | +chr4 1 |
| 71 | +chr4_ctg9_hap1 1 |
| 72 | +chr4_gl000193_random 1 |
| 73 | +chr4_gl000194_random 1 |
| 74 | +chr5 1 |
| 75 | + |
| 76 | +# tell map to expect a different chrom order. |
| 77 | +$ bedtools map \ |
| 78 | + -a hg19.gerp.versionsorted.bed \ |
| 79 | + -b hg19.rmsk.versionsorted.bed \ |
| 80 | + -c 4 \ |
| 81 | + -o collapse \ |
| 82 | + -g hg19.versionsorted.genome |
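
To make the difference between the two chromosome orders concrete, here is a minimal sketch that needs only a shell and GNU sort (the -V flag is assumed to be available, as in the example above). The three chromosome names are toy input chosen for illustration, not taken from the hg19 files.

```shell
# Toy chromosome names (hypothetical input for illustration only).
chroms=$(printf 'chr10\nchr2\nchr1\n')

# Lexicographic order compares character by character,
# so chr10 sorts before chr2.
lexicographic=$(printf '%s\n' "$chroms" | LC_ALL=C sort -k1,1)

# Version order (GNU sort -V) compares embedded numbers numerically,
# so chr2 sorts before chr10.
version=$(printf '%s\n' "$chroms" | LC_ALL=C sort -k1,1V)

printf 'lexicographic: %s\n' "$(echo $lexicographic)"
printf 'version:       %s\n' "$(echo $version)"
```

Files sorted the second way are the ones that need the -g genome file so that map knows which chromosome order to expect.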
| 83 | + |
| 84 | + |
| 85 | +Version 2.18.2 (8-Jan-2014) |
2 | 86 |
|
3 | 87 | bedtools. The changes to bedtools reflect fixes to compilation errors, performance enhancements for smaller files, and a bug fix for BAM files that lack a formal header. Our current focus for the 2.19.* release is on addressing some standing bugs/enhancements and on updating some of the other more widely used tools (e.g., coverage, map, and subtract) to use the new API. We will also continue to look into ways to improve performance while hopefully reducing memory usage for algorithms that work with unsorted data (thanks to Ian Sudberry for the ping!).
|
4 | 88 |
|
|