BandedPlusAligner
Aligns a query sequence to a reference using BandedPlusAligner2 traceback-free algorithm with mathematical constraint solving for exact Average Nucleotide Identity calculation. Sequences undergo long array encoding with N-base thresholds, output includes identity value and reference coordinates. Optional matrix visualization through state space exploration maps compatible with visualizealignment.sh.
Basic Usage
bandedplusaligner.sh <query> <ref>
bandedplusaligner.sh <query> <ref> <map>
bandedplusaligner.sh <query> <ref> <map> <iterations> <simd>
The tool accepts sequences directly as command-line arguments or as paths to FASTA files. For simple alignment identity calculation, only query and reference are required. Additional parameters enable visualization output, benchmarking, and SIMD acceleration.
Parameters
Parameters are provided as positional arguments in the order listed below. The tool uses a streamlined interface focused on essential alignment parameters.
Core Parameters
- query
- A literal nucleotide sequence (e.g., "ATCGATCG") or path to a FASTA file containing the query sequence. The query sequence will be aligned against the reference to determine identity and alignment coordinates.
- ref
- A literal nucleotide sequence (e.g., "ATCGATCG") or path to a FASTA file containing the reference sequence. This serves as the target for alignment. For optimal performance when position vectors are not needed, the shorter sequence should be provided as the query.
- map
- Optional output text file path for the alignment matrix score space visualization. This creates a detailed map of the dynamic programming matrix that can be fed to visualizealignment.sh to generate alignment images. Set to "null" when benchmarking to disable visualization output and maximize performance.
- iterations
- Optional positive integer specifying the number of alignment iterations to perform for benchmarking purposes. When specified, the alignment will be repeated the given number of times to measure average performance. Useful for timing and performance analysis.
- simd
- Enable SIMD vectorized alignment mode by setting
Shared.SIMD=true
. Activatesshared.SIMDAlign.alignBandVectorDel()
processing instead of scalar alignment loops. Requires wide alignment bands for effective hardware utilization and provides performance improvement for sequences with extensive indel patterns.
Examples
Basic Sequence Alignment
bandedplusaligner.sh "ATCGATCGATCG" "ATCGATCGATCG"
Aligns two identical sequences directly, returning perfect identity (1.0) with alignment coordinates.
FASTA File Alignment
bandedplusaligner.sh query.fasta reference.fasta
Aligns sequences from FASTA files. Each file should contain a single sequence for alignment.
Alignment with Visualization
bandedplusaligner.sh query.fasta ref.fasta alignment_matrix.txt
Performs alignment and outputs the dynamic programming matrix to alignment_matrix.txt for visualization with visualizealignment.sh.
Performance Benchmarking
bandedplusaligner.sh "ATCGATCG" "ATCGATCG" null 1000
Runs the alignment 1000 times with visualization disabled for accurate performance timing.
SIMD Acceleration
bandedplusaligner.sh long_query.fasta long_ref.fasta matrix.txt 1 true
Enables SIMD vectorization for accelerated alignment of long sequences that require wide alignment bands.
Algorithm Details
BandedPlusAligner2 implements traceback-free banded dynamic programming alignment using mathematical constraint solving for exact Average Nucleotide Identity calculation. The implementation uses dual rolling arrays and 64-bit bit-packed scoring to achieve O(n) space complexity.
Dynamic Bandwidth Calculation
The decideBandwidth()
method (lines 58-64) calculates optimal band width by analyzing substitution patterns in sequence prefixes. The formula Math.min(60+(int)Math.sqrt(rLen), 4+Math.max(qLen, rLen)/8)
provides initial bandwidth, then early mismatch detection counts substitutions using (query[i]!=ref[i] ? 1 : 0)
to return Math.min(subs+1, bandwidth)
for high-identity optimization.
Dual Rolling Array Implementation
Uses prev[rLen+1]
and curr[rLen+1]
arrays (line 96) with array swapping temp=prev; prev=curr; curr=temp
(lines 163-165) achieving O(n) space complexity instead of traditional O(mn). The BAD=Long.MIN_VALUE/2
sentinel value prevents out-of-band access.
64-Bit Bit-Packed Encoding
Each scoring cell packs three values into a single 64-bit long using bit field constants:
- POSITION_BITS=21: Reference start coordinates (bits 0-20, mask
POSITION_MASK=(1L<<21)-1
) - DEL_BITS=21: Deletion event counting (bits 21-41, mask
DEL_MASK=((1L<<21)-1)<<21
) - SCORE_SHIFT=42: Alignment scores (bits 42-63, mask
SCORE_MASK=~(POSITION_MASK|DEL_MASK)
)
Score constants include MATCH=1L<<42
, SUB=(-1L)<<42
, and DEL_INCREMENT=(1L<<21)+DEL
for deletion tracking.
Long Array Sequence Encoding
Sequences are converted using Factory.encodeLong()
with query N-threshold=15 and reference N-threshold=31. This encoding enables efficient bitwise comparisons using (q|r)>=15
for N-base detection (line 129) and q==r
for exact matches (line 128).
SIMD Vectorization Integration
When Shared.SIMD
is enabled, the algorithm calls shared.SIMDAlign.alignBandVectorDel(q, ref, bandStart, bandEnd, prev, curr)
(line 120) for hardware-accelerated alignment processing. Falls back to scalar branchless scoring loop (lines 124-144) with conditional expressions for CPU pipeline optimization.
Mathematical Constraint System
The postprocess()
method (lines 181-236) solves exact operation counts without traceback using three equations:
- M + S + I = qLen (query sequence consumption)
- M + S + D = refAlnLength (reference alignment span)
- Score = M - S - I - D (scoring relationship)
Implementation extracts values: deletions=(int)((maxScore&DEL_MASK)>>POSITION_BITS)
, calculates insertions=Math.max(0, qLen+deletions-refAlnLength)
, solves matches=(rawScore+qLen+deletions)/2f
, and derives final identity.
Sequence Length Optimization
When posVector==null
, sequences are swapped using if(query0.length>ref0.length)
(lines 75-79) ensuring query ≤ reference for memory optimization and reducing complexity from O(nm) to O(min(n,m)*max(n,m)).
Implementation Performance
- Memory Usage: 200MB base allocation (line 51 in shell script) plus linear sequence length scaling
- Space Complexity: O(n) using dual arrays vs O(nm) traditional dynamic programming
- Sequence Limit: 2,097,151 bp enforced by
assert(ref.length<=POSITION_MASK)
(line 84) - Band Processing:
bandStart=Math.max(1, Math.min(i-bandWidth, rLen-bandWidth))
restricts computation to diagonal bands
Output Format
The tool outputs alignment results to standard output in a structured format:
- Identity: Floating-point value from 0.0 to 1.0 representing sequence similarity
- Reference Start: Zero-based starting position of optimal alignment in reference
- Reference Stop: Zero-based ending position of optimal alignment in reference
- Matrix Map: Optional detailed visualization file showing dynamic programming matrix exploration
Technical Notes
N-Base Scoring Implementation
Ambiguous nucleotides are handled using bitwise detection hasN=((q|r)>=15)
(line 129). N-bases are scored using N_SCORE=0L
(line 286) providing neutral scoring, while standard mismatches use SUB=(-1L)<
Local Alignment Mode
The algorithm operates in local mode controlled by GLOBAL=false
(line 292). The postprocess()
method searches for maximum score using for(int j=bandStart; j<=bandEnd; j++)
(lines 188-194) rather than using only the final cell. Position extraction uses originPos=(int)(maxScore&POSITION_MASK)
enabling flexible alignment boundaries.
Visualizer Matrix Output
When the output
parameter is set, a Visualizer
object is created with new Visualizer(output, POSITION_BITS, DEL_BITS)
(line 88). The visualizer calls viz.print(curr, bandStart, bandEnd, rLen)
(line 159) for each processed row, generating matrix files readable by visualizealignment.sh.
Support
For questions and support:
- Email: bbushnell@lbl.gov
- Documentation: bbmap.org