BandedAligner

Script: bandedaligner.sh Package: aligner Class: BandedAligner.java

Aligns a query sequence to a reference using BandedAligner. The sequences can be any characters, but N is a special case. Outputs the identity, rstart, and rstop positions. Optionally prints a state space exploration map. This map can be fed to visualizealignment.sh to make an image.

Basic Usage

bandedaligner.sh <query> <ref>
bandedaligner.sh <query> <ref> <map>
bandedaligner.sh <query> <ref> <map> <iterations> <simd>

Parameters

Parameters control the alignment process, including input/output settings, visualization options, and performance optimizations.

Required Parameters

query
A literal nucleotide sequence or fasta file. The query sequence to be aligned to the reference.
ref
A literal nucleotide sequence or fasta file. The reference sequence against which the query will be aligned.

Optional Parameters

map
Optional output text file for matrix score space visualization. Set to null for benchmarking with no visualization. This map can be fed to visualizealignment.sh to make an image showing the alignment process.
iterations
Optional integer for benchmarking multiple iterations. Useful for performance testing and timing comparisons.
simd
Enable SIMD (Single Instruction, Multiple Data) mode. Needs a large band to be effective. Improves performance for longer sequences when bandwidth is sufficiently large.

Examples

Basic Alignment

bandedaligner.sh ACGTACGTACGT ACGTACCTACGT

Aligns two literal sequences and outputs identity score and alignment coordinates.

File-based Alignment

bandedaligner.sh query.fasta reference.fasta

Aligns sequences from FASTA files.

Alignment with Visualization

bandedaligner.sh query.fasta reference.fasta alignment_map.txt

Performs alignment and outputs a visualization map that can be processed with visualizealignment.sh.

Performance Benchmarking

bandedaligner.sh ACGTACGTACGT ACGTACCTACGT null 1000 simd

Runs 1000 iterations with SIMD optimization enabled for benchmarking purposes.

Algorithm Details

Core Algorithm

BandedAligner implements a space-efficient banded alignment algorithm that restricts computations to a diagonal band around the expected alignment path. This approach significantly reduces memory usage and computational complexity while maintaining accuracy for high-identity alignments.

Technical Implementation

Bandwidth Calculation

The algorithm automatically determines optimal bandwidth using the formula: min(60 + sqrt(refLength), 4 + max(queryLength, refLength)/8). For high-identity alignments with few indels, this adaptive approach ensures minimal bandwidth while maintaining accuracy.

Scoring System

Performance Characteristics

Output Format

The aligner outputs:

Bit-field Encoding

The algorithm uses efficient bit-field encoding in 64-bit integers:

Support

For questions and support: