BBTools Guides

Comprehensive documentation and tutorials for the BBTools suite

📚 23 Detailed Guides | ✍️ Written by Brian Bushnell | 🔗 Local Documentation

🎯 Core Tools Documentation

BBMap Guide

Complete guide to BBMap, the fast splice-aware read aligner. Learn about indexing, performance tuning, memory management, and advanced alignment options for DNA/RNA sequencing data.

BBMap | Core Alignment

BBDuk Guide

Master BBDuk for adapter trimming, quality filtering, and contamination removal. Covers kmer-based operations, memory usage, and parameter optimization for high-quality read preprocessing.

BBDuk | Quality Control

BBMerge Guide

Learn to merge paired-end reads using BBMerge's neural network approach. Understand overlap detection, false positive reduction, and optimal parameters for different sequencing platforms.

BBMerge | Read Processing

Tadpole Guide

Comprehensive tutorial for Tadpole assembler. Learn kmer-based assembly, error correction, read extension, and optimization for microbial genomes and organelles.

Tadpole | Assembly

Clumpify Guide

Optimize file compression and processing speed with Clumpify. Learn how to group overlapping reads into clumps for better compression and faster downstream analysis.

Clumpify | Optimization

BBNorm Guide

Master kmer-based coverage normalization with BBNorm. Learn to reduce coverage variation for better assemblies and remove sequencing errors based on kmer depth.

BBNorm | Normalization

📊 Quality Control & Analysis

Stats Guide

Generate comprehensive statistics for sequence files. Learn to calculate N50, GC content, length distributions, and quality metrics for sequence analysis and QC.

Stats.sh | Statistics

FilterByTile Guide

Filter Illumina reads by flowcell tile quality. Learn to identify and remove reads from problematic tiles to improve overall data quality in Illumina sequencing runs.

FilterByTile | Quality Filter

CalcUniqueness Guide

Calculate genome uniqueness and complexity using kmer analysis. Understand repetitive regions, mappability, and genome complexity for better experimental design.

CalcUniqueness | Genome Analysis

Preprocessing Guide

Complete preprocessing workflow for sequencing data. Learn best practices for quality trimming, adapter removal, contamination filtering, and data preparation pipelines.

Multiple Tools | Workflow

🧹 Deduplication & Error Correction

Dedupe Guide

Remove duplicate sequences and PCR artifacts with Dedupe. Learn about optical duplicate detection, cluster-based deduplication, and memory-efficient processing.

Dedupe | Deduplication

Repair Guide

Fix broken paired-end read files with Repair. Learn to restore proper pairing, convert between paired formats, and handle orphaned reads in sequencing data.

Repair | Format Repair

🧬 Variant Calling & Analysis

CallVariants Guide

Comprehensive variant calling with CallVariants. Learn multisample analysis, ploidy handling, indel realignment, and VCF output for genomic variation detection.

CallVariants | Variant Analysis

🔬 Taxonomy & Classification

Taxonomy Guide

Work with NCBI taxonomy data in BBTools. Learn to download taxonomy files, convert accessions to TaxIDs, and use taxonomic information for sequence classification.

Taxonomy Tools | Classification

BBSketch Guide

Rapid taxonomic identification using MinHash sketches. Learn BBSketch and SendSketch for fast species identification and contamination detection in sequencing data.

BBSketch/SendSketch | Identification

Seal Guide

Alignment-free quantification with Seal. Learn kmer-based abundance estimation for RNA-seq, metagenomics, and rapid sequence quantification without mapping.

Seal | Quantification

🔄 Format Conversion & Manipulation

Reformat Guide

Convert between sequence formats with Reformat. Learn format conversion, interleaving/deinterleaving, compression options, and sequence manipulation operations.

Reformat | Format Conversion

🛠️ Specialized Applications

BBMask Guide

Mask low-complexity and repetitive sequences with BBMask. Learn entropy-based masking, tandem repeat detection, and sequence complexity filtering techniques.

BBMask | Sequence Masking

AddAdapters Guide

Add synthetic adapter sequences to reads for testing and validation. Learn adapter simulation, contamination testing, and preprocessing pipeline validation.

AddAdapters | Testing Tool

SplitNextera Guide

Process Nextera mate-pair libraries with SplitNextera. Learn to handle jumping libraries, transposase sequences, and mate-pair specific preprocessing steps.

SplitNextera | Mate-Pair Processing

📜 Reference & Legacy

Sample Guide Template

Template and example structure for BBTools guides. Useful for understanding the guide format and as a reference when creating new documentation.

Template | Reference

BBMap Legacy Documentation

Historical BBMap documentation and README. Contains legacy information and early development notes for the BBMap aligner.

BBMap | Legacy

About BBTools Guides

These guides are written by Brian Bushnell, creator of BBTools, and document the complete suite of bioinformatics tools in detail. Each guide includes:

  • Detailed parameter explanations - Understand every option and flag
  • Usage examples - Real-world command examples and workflows
  • Best practices - Optimization tips and recommended approaches
  • Memory and performance notes - Resource requirements and tuning advice
  • Common troubleshooting - Solutions to frequently encountered issues

💡 Tips for Using These Guides

  • Start with the BBDuk Guide for quality control basics
  • Use the BBMap Guide for alignment fundamentals
  • Consult the Preprocessing Guide for complete workflow examples
  • Each tool's individual documentation page also links to its guide
  • Guides are regularly updated - check the "Last updated" date in each file