BBSketch
BBSKETCH is an alias for SKETCH. It generates MinHash sketches from genomic sequences using k-mer hashing, with configurable processing modes (single, per-taxa, per-sequence, per-IMG). Sketch generation is multi-threaded and is implemented in SketchMaker.java, which reads input through ConcurrentReadInputStream.
Automatic Redirect
The bbsketch.sh script is an alias that calls sketch.sh with the same parameters.
Usage
# These commands are equivalent:
bbsketch.sh in=sequences.fasta out=sketches.sk
sketch.sh in=sequences.fasta out=sketches.sk
About MinHash Sketching
MinHash sketching creates compact representations of genomic sequences by hashing their k-mers and retaining only a small subset of the resulting hashcodes, each stored as Long.MAX_VALUE-(raw hashcode). The SketchMaker.java implementation processes input in parallel through ConcurrentReadInputStream, using a configurable thread pool that is capped at 14 threads for memory efficiency.
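As a rough illustration of the encoding described above, the following Python sketch hashes every k-mer of a sequence, keeps a fixed number of hashcodes, and stores each one as LONG_MAX minus the raw value. It is not a port of SketchMaker.java. The hash function (BLAKE2b truncated to 63 bits), the default k and sketch size, the decision to retain the largest raw hashcodes, and the names kmer_hash and make_sketch are all assumptions made for this example, and reverse complements are ignored.

```python
import hashlib
import heapq

LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE
ACGT = frozenset("ACGT")


def kmer_hash(kmer: str) -> int:
    """Hash a k-mer to a non-negative 63-bit integer (illustrative hash choice)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big") & LONG_MAX


def make_sketch(seq: str, k: int = 31, size: int = 1000) -> list[int]:
    """Return a sorted sketch of at most `size` values encoded as LONG_MAX - hash."""
    seq = seq.upper()
    hashes = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if ACGT.issuperset(kmer):  # skip k-mers containing N or other symbols
            hashes.add(kmer_hash(kmer))
    # Assumption: retain the largest raw hashcodes, so the encoded values are
    # the smallest ones and sort in ascending order.
    kept = heapq.nlargest(size, hashes)
    return sorted(LONG_MAX - h for h in kept)
```

Calling make_sketch on two related sequences with the same k and size yields sorted lists that can be compared directly, since identical k-mers always produce identical encoded values.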
Technical Implementation
- Hash Storage: Uses specialized encoding with sorted hashcodes stored as Long.MAX_VALUE-(raw hashcode)
- Processing Modes: Supports ONE_SKETCH, PER_TAXA, PER_SEQUENCE, and PER_IMG modes via SketchMakerMini
- Multi-threading: Employs ProcessThread workers with HashMap-based sketch collection for PER_TAXA mode
- Memory Management: Uses SketchHeap data structure with flexible set/map storage based on k-mer occurrence counts
- Taxonomic Integration: Integrates TaxTree.parseNodeFromHeader() for taxonomic node assignment with level promotion
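One way to picture the processing modes listed above is as a choice of grouping key: each mode decides which sequences feed the same sketch. The Python sketch below shows that idea with a plain dictionary, loosely mirroring the map-based collection mentioned for PER_TAXA mode. The mode names come from the list above; the Record layout, the pre-parsed taxid field, and the group_records helper are hypothetical, and PER_IMG is left out because its key is not described here.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    header: str
    taxid: int      # hypothetical: taxonomic ID already parsed from the header
    sequence: str


def group_records(records, mode: str) -> dict:
    """Group sequences by the key that decides which sketch they feed."""
    groups = defaultdict(list)
    for rec in records:
        if mode == "ONE_SKETCH":
            key = "all"          # every sequence goes into a single sketch
        elif mode == "PER_TAXA":
            key = rec.taxid      # one sketch per taxonomic node
        elif mode == "PER_SEQUENCE":
            key = rec.header     # one sketch per input sequence
        else:
            raise ValueError(f"unsupported mode in this example: {mode}")
        groups[key].append(rec.sequence)
    return dict(groups)
```

Each resulting group would then be sketched independently, so the number of output sketches equals the number of distinct keys.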
Applications
- Genome Similarity: ANI estimation through dual k-mer comparison algorithms
- Taxonomic Classification: Hierarchical classification using TaxNode level filtering
- Contamination Detection: Multi-hit tracking through CompareBuffer hit analysis
- Database Searches: SketchIndex-based lookup with concurrent indexing via IndexThread
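The applications above all rest on comparing sketches. A generic way to do that, shown here as a minimal Python sketch, is to walk two sorted sketches in parallel and count the values they share. This is a textbook two-pointer merge, not the CompareBuffer or SketchIndex logic, and it does not compute ANI; the function names count_shared and shared_fraction are hypothetical, and both inputs are assumed to be sorted and free of duplicates.

```python
def count_shared(a: list[int], b: list[int]) -> int:
    """Count values present in both sorted, duplicate-free sketches."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def shared_fraction(a: list[int], b: list[int]) -> float:
    """Fraction of the smaller sketch found in the other (0.0 if either is empty)."""
    smaller = min(len(a), len(b))
    return count_shared(a, b) / smaller if smaller else 0.0
```

Because both lists are sorted, the comparison takes time linear in the combined sketch length, which is what makes searching a query against many reference sketches practical.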
For complete parameter documentation and usage examples, see the SKETCH documentation.