Parallelogram
Converts a parallelogram-shaped alignment visualization to a rectangle. This tool transforms the output from CrossCutAligner so it can be properly visualized by visualizealignment.sh. The transformation shifts coordinates to create a rectangular matrix from the parallelogram pattern.
Basic Usage
parallelogram.sh <input_map> <output_map>
Parallelogram accepts two positional arguments: an input text file containing parallelogram-shaped matrix data and an output file path for the rectangular matrix data.
Parameters
Parameters control the input and output file locations for the coordinate transformation.
Standard Parameters
- input_map
- Input text file containing parallelogram-shaped matrix data. This is typically the output from crosscutaligner.sh.
- output_map
- Output text file with rectangular matrix data. This file can then be processed by visualizealignment.sh to create an image.
Java Parameters
- -Xmx
- Set Java's memory usage, overriding the default. -Xmx20g specifies 20 GB of RAM, and -Xmx200m specifies 200 MB. The maximum is typically 85% of physical memory. Default: 200m (fixed allocation).
- -eoom
- This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
- -da
- Disable assertions.
Examples
Complete Workflow
crosscutaligner.sh ATCGATCG GCATGCTA map1.txt
parallelogram.sh map1.txt map2.txt
visualizealignment.sh map2.txt alignment.png
This workflow generates an alignment between two sequences using CrossCutAligner, transforms the parallelogram-shaped output to rectangular format, then creates a PNG visualization.
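When the three steps need to run unattended, they can be chained from a small driver script. The following is a minimal Python sketch, not part of the tool itself: the function names build_commands and run_workflow are hypothetical, and it assumes the three scripts are on PATH and accept exactly the positional arguments shown in the example above.

```python
import subprocess


def build_commands(seq1, seq2, raw_map, rect_map, image):
    """Return the three command lines of the workflow, in order.

    Argument order mirrors the example above; nothing beyond those
    positional arguments is assumed.
    """
    return [
        ["crosscutaligner.sh", seq1, seq2, raw_map],
        ["parallelogram.sh", raw_map, rect_map],
        ["visualizealignment.sh", rect_map, image],
    ]


def run_workflow(seq1, seq2, raw_map, rect_map, image):
    """Run each step in turn and stop at the first failure."""
    for cmd in build_commands(seq1, seq2, raw_map, rect_map, image):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failed alignment never reaches the later steps.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_workflow("ATCGATCG", "GCATGCTA", "map1.txt", "map2.txt", "alignment.png")
```

Stopping at the first failure keeps a stale or partial map file from being rendered as if it were a fresh result.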
Basic Transformation
parallelogram.sh parallelogram_data.txt rectangular_data.txt
Transforms a parallelogram-shaped matrix file to rectangular format for visualization.
Algorithm Details
Processing Pipeline
- Input Reading: Read all non-empty lines from the input file into memory
- Dimension Calculation: Determine the number of rows (inputRows) and maximum line width (maxWidth)
- Matrix Initialization: Create an output matrix of size inputRows × maxWidth, filled with spaces
- Coordinate Transformation: For each character at position (i, j) in the input, calculate the new row position newRow = i - j and place the character at (newRow, j) in the output matrix if it is within bounds
- Content Detection: Identify rows containing non-whitespace characters
- Output Writing: Write only non-empty rows to the output file
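The steps above can be sketched in Python. This is an illustration of the described pipeline under stated assumptions, not the tool's actual source: the function names are hypothetical, a line counts as "empty" only if it has zero length after its line ending is removed, and output rows are written at full matrix width without trimming.

```python
def parallelogram_to_rectangle(lines):
    """Apply newRow = i - j to every character and return the non-blank rows."""
    # Input Reading: drop line endings, keep only non-empty lines (assumption:
    # "non-empty" means length > 0 once the line ending is stripped).
    rows = [line.rstrip("\r\n") for line in lines]
    rows = [row for row in rows if row != ""]
    if not rows:
        return []

    # Dimension Calculation.
    input_rows = len(rows)
    max_width = max(len(row) for row in rows)

    # Matrix Initialization: inputRows x maxWidth, filled with spaces.
    out = [[" "] * max_width for _ in range(input_rows)]

    # Coordinate Transformation: (i, j) -> (i - j, j), skipping out-of-bounds cells.
    for i, row in enumerate(rows):
        for j, ch in enumerate(row):
            new_row = i - j
            if 0 <= new_row < input_rows:
                out[new_row][j] = ch

    # Content Detection and Output Writing: keep rows with non-whitespace content.
    joined = ["".join(cells) for cells in out]
    return [row for row in joined if row.strip() != ""]


def convert_file(input_path, output_path):
    """File-based wrapper around parallelogram_to_rectangle (hypothetical helper)."""
    with open(input_path, "r") as src:
        result = parallelogram_to_rectangle(src.readlines())
    with open(output_path, "w") as dst:
        for row in result:
            dst.write(row + "\n")
```

For example, the four input lines "a", "db", " ec", "  f" hold two diagonals, and the function returns the two rows "abc" and "def".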
Coordinate Transformation Mathematics
The core transformation uses a simple shift operation: newRow = i - j
This transformation shifts each column upward by an amount equal to its column index, converting the diagonal parallelogram pattern produced by CrossCutAligner into a rectangular alignment matrix suitable for visualization tools.
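A short worked example may make the shift concrete. In the Python sketch below (the helper name shifted_position is hypothetical), three cells that lie on one diagonal of the input all land in the same output row, which is how the slanted pattern becomes a rectangle.

```python
def shifted_position(i, j):
    """Return the output (row, column) for an input character at (i, j)."""
    return (i - j, j)


# Three cells on one input diagonal: row and column increase together.
diagonal = [(2, 0), (3, 1), (4, 2)]
print([shifted_position(i, j) for i, j in diagonal])
# prints [(2, 0), (2, 1), (2, 2)]
```

A cell with j greater than i maps to a negative row, which is outside the output matrix and is therefore not placed.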
Memory Requirements
Memory usage is modest: the default allocation is 200 MB. The tool loads the entire input file into memory, fills an output character matrix with the transformed characters, and writes the result. The memory requirement scales linearly with the size of the alignment matrix (rows × columns), so very large inputs may need a higher limit set with -Xmx.
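To judge whether the default is enough for a given input, the linear scaling can be turned into a rough estimate. The Python sketch below rests on assumptions that this document does not confirm: 2 bytes per cell (the size of a Java char) and two copies of the data in memory (the input lines plus the output matrix). It ignores JVM overhead, so treat the result as a lower bound rather than a measurement.

```python
def estimate_matrix_bytes(rows, max_width, bytes_per_cell=2, copies=2):
    """Rough size in bytes of the character data held in memory.

    bytes_per_cell and copies are assumptions, not documented values.
    """
    return rows * max_width * bytes_per_cell * copies


print(estimate_matrix_bytes(1000, 1000))    # prints 4000000
print(estimate_matrix_bytes(10000, 10000))  # prints 400000000
```

Under these assumptions a 1000 × 1000 matrix needs about 4 MB, while a 10000 × 10000 matrix needs about 400 MB and would exceed the 200 MB default.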
Use in Alignment Visualization Pipeline
CrossCutAligner produces alignment scoring matrices in a parallelogram shape due to the nature of its diagonal-based alignment algorithm. This parallelogram format is not directly compatible with standard matrix visualization tools, which expect rectangular input. Parallelogram serves as a preprocessing step to reshape the data for downstream visualization by visualizealignment.sh.
Support
Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.