DriftingPlusAligner

Script: driftingplusaligner.sh
Package: aligner
Class: DriftingPlusAligner2.java

Aligns a query sequence to a reference using DriftingPlusAligner2. Sequences may contain any characters, but N receives special handling in scoring. Outputs the identity and the rstart and rstop positions. Optionally prints a state space exploration map, which can be fed to visualizealignment.sh to make an image.

Basic Usage

driftingplusaligner.sh <query> <ref>
driftingplusaligner.sh <query> <ref> <map>
driftingplusaligner.sh <query> <ref> <map> <iterations>

DriftingPlusAligner2 is a highly optimized pairwise sequence aligner designed for high-identity alignments. It can accept sequences as literal nucleotide strings or as fasta files.

Parameters

Parameters are positional rather than named and must be given in the order shown in the usage section. To set a later argument, supply every argument that precedes it.

query
A literal nucleotide sequence or fasta file. This is the sequence to align against the reference. Can contain any characters, though N nucleotides receive special handling in scoring.
ref
A literal nucleotide sequence or fasta file. This serves as the reference sequence for alignment. The aligner may internally swap query and reference if the query is longer than the reference to optimize performance.
map
Optional output text file for matrix score space visualization. This file contains the state space exploration data that can be fed to visualizealignment.sh to generate alignment visualization images. Set to "null" for benchmarking scenarios where visualization output is not needed.
iterations
Optional integer for benchmarking multiple alignment iterations. Useful for performance testing and timing measurements of the alignment algorithm.
simd
Optional flag. Add the literal word simd after iterations (as in the SIMD-Accelerated Alignment example) to enable SIMD (Single Instruction, Multiple Data) mode, which vectorizes alignment operations and can speed up alignment on supported hardware.

Examples

Basic Alignment

driftingplusaligner.sh ATCGATCG ATCGATCGATCG

Aligns a short query sequence against a longer reference sequence, reporting identity and alignment coordinates.
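
To make the three reported values concrete, here is a minimal Python sketch that computes an identity, rstart, and rstop for the same pair of sequences. It is only an illustration: it slides the query along the reference without gaps, which is not how DriftingPlusAligner2 works, and the function name, the 0-based inclusive coordinates, and the leftmost-hit tie-break are all assumptions rather than documented behavior of the tool.

```python
def best_ungapped_hit(query: str, ref: str):
    """Slide query along ref without gaps; return (identity, rstart, rstop).

    Illustration only. Coordinates are assumed to be 0-based and inclusive,
    which may differ from what the real tool reports.
    """
    if not query or len(query) > len(ref):
        raise ValueError("expects a non-empty query no longer than ref")
    best = (-1.0, 0, 0)
    for rstart in range(len(ref) - len(query) + 1):
        # Count matching positions for this placement of the query.
        matches = sum(q == r for q, r in zip(query, ref[rstart:]))
        identity = matches / len(query)
        if identity > best[0]:  # strict '>' keeps the leftmost best hit
            best = (identity, rstart, rstart + len(query) - 1)
    return best


identity, rstart, rstop = best_ungapped_hit("ATCGATCG", "ATCGATCGATCG")
print(f"identity={identity:.3f} rstart={rstart} rstop={rstop}")
# prints: identity=1.000 rstart=0 rstop=7
```

The query also matches perfectly at reference offset 4; which of the two placements the real aligner reports is not stated here.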

File-Based Alignment

driftingplusaligner.sh query.fasta reference.fasta

Aligns sequences from fasta files. Both single-sequence and multi-sequence fasta files are supported.
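
For readers unfamiliar with the input format, the following Python sketch shows the general shape of FASTA data: a header line starting with ">" followed by one or more sequence lines. The read_fasta helper is hypothetical and is not part of the aligner; how the tool itself treats multi-sequence files is not described here.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Hypothetical helper for illustration; not the aligner's own parser.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


records = list(read_fasta([">q1", "ATCG", "ATCG", ">q2", "GGNN"]))
print(records)
# prints: [('q1', 'ATCGATCG'), ('q2', 'GGNN')]
```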

Alignment with Visualization

driftingplusaligner.sh query.fasta reference.fasta alignment_map.txt

Performs alignment and saves the state space exploration matrix to a file for visualization with visualizealignment.sh.

SIMD-Accelerated Alignment

driftingplusaligner.sh ATCGATCG ATCGATCGATCG null 1 simd

Uses SIMD vectorization for faster alignment computation. Note that map is set to "null" and iterations to "1" to reach the simd flag.
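
The need for the placeholder values follows from positional parsing, which can be sketched in Python as below. This is an assumed model of the argument handling, not the script's actual code: the parse_args function, the dictionary keys, and the default of 1 iteration are all hypothetical.

```python
def parse_args(argv):
    """Map positional arguments to names (hypothetical model of the CLI).

    A later slot can only be reached by filling every earlier slot,
    which is why the SIMD example passes "null" and "1" first.
    """
    if len(argv) < 2:
        raise ValueError("usage: <query> <ref> [map] [iterations] [simd]")
    opts = {
        "query": argv[0],
        "ref": argv[1],
        "map": None,       # no visualization output by default
        "iterations": 1,   # assumed default
        "simd": False,
    }
    if len(argv) > 2 and argv[2].lower() != "null":
        opts["map"] = argv[2]
    if len(argv) > 3:
        opts["iterations"] = int(argv[3])
    if len(argv) > 4:
        opts["simd"] = argv[4].lower() == "simd"
    return opts


print(parse_args(["ATCGATCG", "ATCGATCGATCG", "null", "1", "simd"]))
```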

Benchmarking

driftingplusaligner.sh query.fasta reference.fasta null 1000

Runs alignment 1000 times for performance benchmarking, with visualization disabled for accurate timing measurements.
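
Repeating the alignment matters because a single run is often too short to time reliably. The generic Python sketch below shows the idea of averaging over many iterations; it does not invoke the aligner, and time_iterations and the stand-in workload are hypothetical names used only for illustration.

```python
import time


def time_iterations(align, iterations):
    """Call align() repeatedly; return (last result, mean seconds per call)."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    start = time.perf_counter()
    for _ in range(iterations):
        result = align()
    elapsed = time.perf_counter() - start
    return result, elapsed / iterations


# Stand-in workload; a real benchmark would call the alignment routine here.
result, mean_seconds = time_iterations(lambda: sum(range(1000)), 1000)
print(f"result={result} mean={mean_seconds:.2e} s")
```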

Algorithm Details

DriftingPlusAligner2 implements an advanced banded dynamic programming algorithm optimized for high-identity sequence alignments. The algorithm incorporates several key innovations:
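
As background, the general idea of banded dynamic programming can be sketched in Python as follows. This is a textbook fixed-width band applied to edit distance, not DriftingPlusAligner2's adaptive or drifting band and not its scoring scheme; the function name and the band parameter are assumptions made for illustration.

```python
def banded_edit_distance(a: str, b: str, band: int):
    """Edit distance computed only over cells with |i - j| <= band.

    Returns None when the final cell lies outside the band. The result is
    exact whenever the true edit distance is at most band. Rows are kept
    at full width for clarity; a real implementation would store only
    the cells inside the band.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    INF = float("inf")
    prev = [j if j <= band else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(0, i - band), min(m, i + band)
        if lo == 0:
            cur[0] = i  # column 0 is inside the band: i deletions
        for j in range(max(1, lo), hi + 1):
            sub = prev[j - 1] + (a[i - 1] != b[j - 1])  # match or mismatch
            cur[j] = min(sub, prev[j] + 1, cur[j - 1] + 1)  # or a gap
        prev = cur
    return prev[m]


print(banded_edit_distance("kitten", "sitting", 3))
# prints: 3
```

Restricting work to a diagonal band is what makes this approach attractive for high-identity pairs, where the optimal path stays close to the main diagonal.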

Adaptive Banding Strategy

The aligner uses a sophisticated adaptive banding approach that balances speed and accuracy:

Memory Optimization

The algorithm achieves exceptional memory efficiency through several design choices:

Scoring System

The alignment scoring incorporates nucleotide-specific handling:

SIMD Acceleration

When SIMD mode is enabled, the algorithm leverages vectorized operations:

Position Calculation

The algorithm calculates alignment coordinates using a mathematical approach:

Performance Characteristics

Output Format

DriftingPlusAligner2 outputs alignment results in a simple format:

Support

For questions and support: