Parallelogram

Script: parallelogram.sh Package: aligner Class: Parallelogram.java

Converts a parallelogram-shaped alignment visualization to a rectangle. This tool transforms the output from CrossCutAligner so it can be properly visualized by visualizealignment.sh. The transformation shifts coordinates to create a rectangular matrix from the parallelogram pattern.

Basic Usage

parallelogram.sh <input_map> <output_map>

Parallelogram accepts two positional arguments: an input text file containing parallelogram-shaped matrix data and an output file path for the rectangular matrix data.

Parameters

Parameters control the input and output file locations for the coordinate transformation.

Standard Parameters

input_map
Input text file containing parallelogram-shaped matrix data. This is typically the output from crosscutaligner.sh.
output_map
Output text file with rectangular matrix data. This file can then be processed by visualizealignment.sh to create an image.

Java Parameters

-Xmx
Set Java's maximum memory usage, overriding the script's default. -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs. The max is typically 85% of physical memory. Default: 200m (a fixed allocation; this script does not autodetect memory).
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions.

Examples

Complete Workflow

crosscutaligner.sh ATCGATCG GCATGCTA map1.txt
parallelogram.sh map1.txt map2.txt
visualizealignment.sh map2.txt alignment.png

This workflow generates an alignment between two sequences using CrossCutAligner, transforms the parallelogram-shaped output to rectangular format, then creates a PNG visualization.

Basic Transformation

parallelogram.sh parallelogram_data.txt rectangular_data.txt

Transforms a parallelogram-shaped matrix file to rectangular format for visualization.

Algorithm Details

Processing Pipeline

  1. Input Reading: Read all non-empty lines from the input file into memory
  2. Dimension Calculation: Determine the number of rows (inputRows) and maximum line width (maxWidth)
  3. Matrix Initialization: Create an output matrix of size inputRows × maxWidth, filled with spaces
  4. Coordinate Transformation: For each character at position (i, j) in the input:
    • Calculate new row position: newRow = i - j
    • Place character at (newRow, j) in output matrix if within bounds
  5. Content Detection: Identify rows containing non-whitespace characters
  6. Output Writing: Write only non-empty rows to the output file
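
The steps above can be sketched in a few lines of Python. This is an illustrative re-creation of the described pipeline, not the code in Parallelogram.java. The function names parallelogram_to_rectangle and convert_file are hypothetical, and two details are assumptions: "non-empty" is taken to mean a line with at least one character, and output rows are written at full width without trimming.

```python
from pathlib import Path


def parallelogram_to_rectangle(lines):
    """Move each character at (i, j) to (i - j, j), then drop blank rows."""
    # Step 1: keep only non-empty lines (assumption: length > 0).
    rows = [line for line in lines if line]

    # Step 2: dimensions of the input.
    input_rows = len(rows)
    max_width = max((len(row) for row in rows), default=0)

    # Step 3: output matrix of input_rows x max_width, filled with spaces.
    out = [[" "] * max_width for _ in range(input_rows)]

    # Step 4: shift every character up by its column index.
    for i, line in enumerate(rows):
        for j, ch in enumerate(line):
            new_row = i - j
            if 0 <= new_row < input_rows:  # skip out-of-bounds targets
                out[new_row][j] = ch

    # Steps 5 and 6: keep only rows with non-whitespace content.
    joined = ["".join(row) for row in out]
    return [row for row in joined if row.strip()]


def convert_file(input_map, output_map):
    """Hypothetical file wrapper: read input_map, write the result to output_map."""
    lines = Path(input_map).read_text().splitlines()
    result = parallelogram_to_rectangle(lines)
    Path(output_map).write_text("".join(row + "\n" for row in result))


# A three-cell diagonal collapses into a single row.
print(parallelogram_to_rectangle(["A", " B", "  C"]))  # ['ABC']
```

Each output cell (newRow, j) receives a character from exactly one input cell (newRow + j, j), so the shift never overwrites a previously placed character.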

Coordinate Transformation Mathematics

The core transformation uses a simple shift operation: newRow = i - j

This transformation shifts each column upward by an amount equal to its column index, converting the diagonal parallelogram pattern produced by CrossCutAligner into a rectangular alignment matrix suitable for visualization tools.
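
As a small worked example of the shift, the Python snippet below applies newRow = i - j to the cells of one diagonal. The helper name shift and the sample coordinates are made up for illustration. Cells whose row and column differ by the same amount all land in the same output row, which is why a diagonal band becomes a horizontal one.

```python
def shift(i, j):
    """Map an input position (row i, column j) to its output position."""
    return (i - j, j)


# Cells along one diagonal of the input, where row == column + 2.
diagonal = [(2, 0), (3, 1), (4, 2), (5, 3)]
shifted = [shift(i, j) for i, j in diagonal]
print(shifted)  # [(2, 0), (2, 1), (2, 2), (2, 3)]

# A cell above the main diagonal maps to a negative row, which is
# out of bounds and therefore skipped by the transformation step.
print(shift(0, 1))  # (-1, 1)
```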

Memory Requirements

Memory usage is modest; the script allocates a fixed 200 MB by default. The tool loads the entire input file into memory, copies each character into a separate output character matrix, and writes the result. The memory requirement scales linearly with the size of the alignment matrix (rows × columns), so very large inputs may need a higher -Xmx setting.
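
The linear scaling can be turned into a rough back-of-the-envelope check. The Python sketch below is only an estimate under stated assumptions: the output matrix is stored as 2-byte characters (the size of a Java char), and the memory held by the input lines and JVM overhead are ignored, so real usage will be higher. The function name estimate_matrix_bytes and the sample dimensions are hypothetical.

```python
def estimate_matrix_bytes(rows, cols, bytes_per_char=2):
    """Approximate size of a rows x cols character matrix in bytes.

    Assumption: 2 bytes per character, ignoring array headers,
    the input lines kept in memory, and other JVM overhead.
    """
    return rows * cols * bytes_per_char


default_heap = 200 * 1024 * 1024  # -Xmx200m expressed in bytes

small = estimate_matrix_bytes(5000, 5000)
large = estimate_matrix_bytes(20000, 20000)

print(small, small <= default_heap)  # 50000000 True
print(large, large <= default_heap)  # 800000000 False
```

When the estimate approaches or exceeds the default heap, passing a larger -Xmx value to the script is the likely remedy.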

Use in Alignment Visualization Pipeline

CrossCutAligner produces alignment scoring matrices in a parallelogram shape due to the nature of its diagonal-based alignment algorithm. This parallelogram format is not directly compatible with standard matrix visualization tools, which expect rectangular input. Parallelogram serves as a preprocessing step to reshape the data for downstream visualization by visualizealignment.sh.

Support

Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.