BBMap & BBTools

The OFFICIAL Suite of Bioinformatics Tools

Fast, multithreaded tools for DNA and RNA sequence analysis

Created and maintained by Brian Bushnell

Official Repository: This is the authoritative source for BBTools.

Core Tools

BBMap

Fast short read aligner optimized for variant calling from Illumina data. Handles reads up to 6kbp.

bbmap.sh ref=genome.fa in=reads.fq out=mapped.sam

BBDuk

Adapter trimming, quality filtering, and contaminant removal. The Swiss Army knife of read processing.

bbduk.sh in=reads.fq out=clean.fq ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
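
The `tpe` and `tbo` flags in the command above act on read pairs, so a paired-end variant may be useful. The following is a minimal sketch rather than an official recipe: the `run` and `trim_adapters` helpers, the `DRY_RUN` switch, and the output file names are illustrative assumptions, and only the `bbduk.sh` parameters come from BBTools. With `DRY_RUN=1` (the default here) the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: paired-end adapter trimming with BBDuk.
# Helper names and file names are hypothetical placeholders.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Trim adapters from the right end of both reads in a pair.
# $1 = read 1 file, $2 = read 2 file, $3 = output prefix
trim_adapters() {
  local r1="$1" r2="$2" prefix="$3"
  run bbduk.sh in1="$r1" in2="$r2" \
    out1="${prefix}_R1.fq.gz" out2="${prefix}_R2.fq.gz" \
    ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
}

trim_adapters r1.fq r2.fq clean
```

Setting `DRY_RUN=0` runs the real command, which requires `bbduk.sh` on the `PATH` and an `adapters.fa` file in the working directory.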

BBMerge

Paired read merging using neural networks for very low false positive rates.

bbmerge.sh in1=r1.fq in2=r2.fq out=merged.fq outu=unmerged.fq

Tadpole

Fast error correction and assembly for small genomes, amplicons, and metagenomes.

tadpole.sh in=reads.fq out=corrected.fq mode=correct

Clumpify

Reorders reads so that compressed files are about 30% smaller on average. Also finds and removes duplicates without alignment.

clumpify.sh in=reads.fq out=clumped.fq.gz
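
The command above only reorders reads. Duplicate removal is switched on with Clumpify's `dedupe` flag, as in the sketch below. The `run` and `clumpify_reads` helpers, the `DRY_RUN` switch, and the file names are illustrative assumptions, not part of BBTools. With `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: Clumpify in reorder-only mode and in duplicate-removal mode.
# Helper names and file names are hypothetical placeholders.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = input reads, $2 = output file, $3 = "reorder" (default) or "dedupe"
clumpify_reads() {
  local input="$1" output="$2" mode="${3:-reorder}"
  if [ "$mode" = "dedupe" ]; then
    run clumpify.sh in="$input" out="$output" dedupe
  else
    run clumpify.sh in="$input" out="$output"
  fi
}

clumpify_reads reads.fq clumped.fq.gz
clumpify_reads reads.fq deduped.fq.gz dedupe
```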

BBNorm

Kmer-based normalization to reduce coverage variation for better assemblies.

bbnorm.sh in=reads.fq out=normalized.fq target=100

BBCMS

Error correction and low-depth read removal for metagenomes. Kmer counts are stored in a count-min sketch, so memory use does not grow with dataset size.

bbcms.sh in=reads.fq out=corrected.fq

SendSketch/BBSketch

Rapid taxonomic identification by comparing MinHash sketches to remote databases.

sendsketch.sh in=unknown.fa

Seal

High-speed alignment-free sequence quantification for RNA-seq using kmer matching.

seal.sh in=reads.fq ref=transcripts.fa rpkm=rpkm.txt

QuickBin

Bins metagenomic contigs using coverage and kmer frequencies.

quickbin.sh in=contigs.fa out=bins *.sam

QuickClade

Fast taxonomic assignment using kmer frequency spectra, optimized for metagenomic bins.

quickclade.sh bins out=taxonomy.tsv

NovaDemux

Statistical demultiplexing with error correction for optimal yield and minimal crosstalk.

novademux.sh in=reads.fq out=out_%.fq expected=barcodes.txt

CallVariants

Variant calling from SAM/BAM files with multisample VCF support.

callvariants.sh in=mapped.sam ref=genome.fa vcf=vars.vcf
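
The BBMap and CallVariants commands shown on this page can be chained, since the SAM file written by the first is the input of the second. The sketch below does this for one sample. The `run` and `map_and_call` helpers, the `DRY_RUN` switch, and the sample name are illustrative assumptions, while the tool parameters are the ones used above. With `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: map reads with BBMap, then call variants with CallVariants.
# Helper names and the sample name are hypothetical placeholders.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = reference FASTA, $2 = reads, $3 = sample name used for output files
map_and_call() {
  local ref="$1" reads="$2" sample="$3"
  run bbmap.sh ref="$ref" in="$reads" out="${sample}.sam"
  run callvariants.sh in="${sample}.sam" ref="$ref" vcf="${sample}.vcf"
}

map_and_call genome.fa reads.fq sample1
```

Because the script uses `set -e`, a failed mapping step stops it before variant calling when run with `DRY_RUN=0`.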

CallGenes

Gene prediction for prokaryotes, viruses, and mitochondria. Also finds 16S/18S/23S/5S/tRNAs.

callgenes.sh in=contigs.fa out=genes.gff outa=aminos.faa

SortByName

Sort reads by name, length, quality, position, taxonomy, or other keys. Handles any file size.

sortbyname.sh in=reads.fq out=sorted.fq

Why BBTools?

🚀 Performance

Multithreaded, memory-efficient algorithms with SIMD support

🔧 Comprehensive

Over 90 tools covering alignment, assembly, QC, and analysis

📊 Production Ready

BBMap was created by Brian Bushnell, and the suite was expanded at JGI. Cited in thousands of publications.

🔄 Format Flexible

Handles FASTQ, FASTA, SAM/BAM, and compressed files seamlessly

Documentation

Download BBTools

GitHub

Latest version repository

Download from GitHub

SourceForge

Latest version tar.gz

Download from SourceForge

Current Version: 39.34

License: Free for unlimited use

Citation

If you use BBTools in your research, please cite:

Bushnell, B. (2014) BBMap: A Fast, Accurate, Splice-Aware Aligner. 
Lawrence Berkeley National Laboratory. LBNL-7065E.

BBTools software package: https://bbmap.org
GitHub: https://github.com/bbushnell/BBTools

For BBMerge specifically:

Bushnell B, Rood J, Singer E (2017) BBMerge – Accurate paired 
shotgun read merging via overlap. PLoS ONE 12(10): e0185056.