WobbleAligner

Script: wobblealigner.sh Package: aligner Class: WobbleAligner.java

Aligns a query sequence to a reference using WobbleAligner. The sequences can be any characters, but N is a special case. Outputs the identity, rstart, and rstop positions. Optionally prints a state space exploration map. This map can be fed to visualizealignment.sh to make an image.

Basic Usage

wobblealigner.sh <query> <ref>
wobblealigner.sh <query> <ref> <map>
wobblealigner.sh <query> <ref> <map> <iterations>

WobbleAligner accepts positional arguments for query sequence, reference sequence, optional output file for matrix visualization, and optional iteration count for benchmarking.

Parameters

WobbleAligner uses positional parameters rather than named parameters. All parameters are provided as command line arguments in order.

Required Parameters

query
A literal nucleotide sequence or path to a fasta file containing the query sequence. This sequence will be aligned to the reference.
ref
A literal nucleotide sequence or path to a fasta file containing the reference sequence. The query will be aligned against this reference.

Optional Parameters

map
Optional output text file for matrix score space visualization. This file contains the state space exploration map that can be fed to visualizealignment.sh to create an image representation of the alignment process. Set to "null" for benchmarking runs with no visualization output.
iterations
Optional integer specifying the number of iterations for benchmarking multiple alignment runs. Useful for performance testing and timing measurements.

Examples

Basic Sequence Alignment

wobblealigner.sh ATCGATCGATCG ATCGATCGATCG

Aligns two identical sequences directly from command line. Returns perfect identity (1.0) with alignment coordinates.

File-based Alignment

wobblealigner.sh query.fasta reference.fasta

Aligns sequences from fasta files. The tool will read the first sequence from each file for alignment.

Alignment with Visualization

wobblealigner.sh query.fasta reference.fasta alignment_matrix.txt

Performs alignment and outputs the state space matrix to a text file for visualization. The matrix file can be processed with visualizealignment.sh.

Benchmarking Mode

wobblealigner.sh ATCGATCG ATCGATCG null 1000

Runs alignment 1000 times for performance benchmarking without generating visualization output.

Sequences with N Characters

wobblealigner.sh ATCNGATCG ATCAGATCG matrix.txt

Aligns sequences containing N characters, which receive special scoring treatment (neutral score of 0).

Algorithm Details

WobbleAligner Architecture

WobbleAligner is a traceback-free sequence alignment algorithm using 2-array dynamic programming with 64-bit bit-packed encoding. It implements RingBuffer-based adaptive banding and mathematical constraint solving for exact operation count recovery:

Memory Design

The algorithm uses only 2 arrays instead of a full dynamic programming matrix, avoiding traceback while still calculating exact alignment coordinates (rstart and rstop). This approach significantly reduces memory requirements compared to traditional alignment algorithms.

Adaptive Banding Strategy

WobbleAligner employs a RingBuffer-based adaptive banding mechanism with decideBandwidth() method that:

Scoring System

The algorithm uses fixed scoring constants with MATCH=1L<<SCORE_SHIFT, SUB=-1L<<SCORE_SHIFT, INS=-1L<<SCORE_SHIFT, DEL=-1L<<SCORE_SHIFT, N_SCORE=0L:

64-bit Bit-Packed Encoding

WobbleAligner uses 64-bit bit-packing with POSITION_BITS=21, DEL_BITS=21, SCORE_SHIFT=42 to store multiple pieces of information in a single long integer:

Performance Characteristics

The algorithm has the following performance characteristics:

RingBuffer Implementation

The algorithm uses RingBuffer with configurable ringSize=bandWidth0 to track recent alignment scores. scoreBonus calculation uses RING_MULT*min(ringSize, recentMissingScore) for bandwidth expansion when alignment quality degrades.

Sequence Length Limitations

WobbleAligner supports sequences up to 2 Megabases (2,097,151 bp) due to the 21-bit position encoding with POSITION_MASK=(1L << POSITION_BITS)-1. Length validation uses assert(ref.length<=POSITION_MASK).

Identity Calculation

The algorithm solves a system of equations to calculate exact operation counts:

This provides accurate identity calculation without requiring traceback through the alignment matrix.

Output Format

WobbleAligner outputs alignment results in a concise format:

Use Cases

Technical Notes

Support

For questions and support: