SummarizeMerge

Script: summarizemerge.sh Package: driver Class: ProcessSpeed.java

Summarizes the output of GradeMerge for comparing read-merging performance.

Basic Usage

summarizemerge.sh in=<file>

This tool processes GradeMerge output files and extracts key performance metrics into a tab-delimited summary format for easy comparison of different read-merging tools and parameters.

Parameters

SummarizeMerge has a simple parameter set focused on processing GradeMerge output files.

in=<file>
A file containing GradeMerge output. This should be the output from running GradeMerge to evaluate the accuracy of read merging tools. The file must contain timing information (real, user, sys), accuracy metrics (correct, incorrect reads), and signal-to-noise ratio data.

Examples

Basic Summary Generation

summarizemerge.sh in=grademerge_results.txt

Processes a GradeMerge output file and generates a tab-delimited summary with timing and accuracy metrics.

Processing Multiple Test Results

# Generate summaries for different parameter sets
summarizemerge.sh in=bbmerge_test1.out > bbmerge_summary1.txt
summarizemerge.sh in=bbmerge_test2.out > bbmerge_summary2.txt
summarizemerge.sh in=flash_test.out > flash_summary.txt

Creates a separate summary file for each merging-tool evaluation so the results can be compared side by side.

Combined Analysis Pipeline

# Run GradeMerge evaluation and immediately summarize
grademerge.sh ref=reference.fa reads1=r1.fq reads2=r2.fq merged=merged.fq > evaluation.out
summarizemerge.sh in=evaluation.out > summary.txt

Runs the complete pipeline, from evaluation to summary generation, to assess read-merging performance.

Output Format

SummarizeMerge generates tab-delimited output with six columns: real, user, sys, correct, incorrect, and SNR.

Example Output

#real	user	sys	correct	incorrect	SNR
12.450	11.230	0.890	99.72	0.28	25.539
8.760	7.980	0.650	98.45	1.55	18.234

A header line is followed by one data row of performance metrics for each evaluated condition.

Algorithm Details

SummarizeMerge runs the ProcessSpeed.main() method, which uses TextFile to parse GradeMerge output line by line.

Input Processing Strategy

The implementation classifies each input line by its prefix, using the String.startsWith() method.
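
As a rough illustration of prefix-based classification, the sketch below sorts lines into categories with String.startsWith(). The class name, the Kind enum, and the exact prefixes are assumptions made for this example; the actual labels in GradeMerge output and the logic in ProcessSpeed.java may differ.

```java
/** Illustrative sketch only; not the ProcessSpeed source. */
final class LineClassifierSketch {

    /** Categories this sketch distinguishes (hypothetical). */
    enum Kind { REAL, USER, SYS, CORRECT, INCORRECT, SNR, OTHER }

    /** Returns the category whose assumed prefix the line starts with. */
    static Kind classify(String line) {
        // The prefixes below are assumed labels, chosen to mirror the summary columns.
        if (line.startsWith("real")) return Kind.REAL;
        if (line.startsWith("user")) return Kind.USER;
        if (line.startsWith("sys")) return Kind.SYS;
        if (line.startsWith("Correct")) return Kind.CORRECT;
        if (line.startsWith("Incorrect")) return Kind.INCORRECT;
        if (line.startsWith("SNR")) return Kind.SNR;
        return Kind.OTHER; // any other line carries no metric in this sketch
    }

    public static void main(String[] args) {
        System.out.println(classify("real\t0m12.450s"));
        System.out.println(classify("Some unrelated line"));
    }
}
```

Note that startsWith() is case-sensitive, so a line beginning with "correct" would fall through to OTHER in this sketch.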

Time Conversion Algorithm

The toSeconds() method parses a shell-format timing string and converts it to seconds.
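
Shells such as bash report timings in a minutes-and-seconds form like 0m12.450s. The following is a minimal sketch of converting that form to seconds. It assumes that input format, uses a hypothetical class name, and is not taken from ProcessSpeed.java.

```java
import java.util.Locale;

/** Illustrative sketch only; assumes bash-style "<minutes>m<seconds>s" timings. */
final class ShellTimeSketch {

    /** Converts a string such as "1m2.500s" to a number of seconds. */
    static double toSeconds(String time) {
        String t = time.trim();
        if (t.endsWith("s")) {
            t = t.substring(0, t.length() - 1); // drop the trailing unit
        }
        int m = t.indexOf('m');
        if (m < 0) {
            return Double.parseDouble(t); // no minutes part: plain seconds
        }
        double minutes = Double.parseDouble(t.substring(0, m));
        double seconds = Double.parseDouble(t.substring(m + 1));
        return 60 * minutes + seconds;
    }

    public static void main(String[] args) {
        // Three decimals, matching the precision shown in the example output.
        System.out.println(String.format(Locale.ROOT, "%.3f", toSeconds("1m2.500s"))); // prints 62.500
    }
}
```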

Data Flow Architecture

ProcessSpeed.main() streams the input one line at a time with TextFile.nextLine().
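
The streaming flow can be sketched roughly as follows. This example substitutes the standard BufferedReader for BBTools' TextFile, and it assumes each metric arrives as a label, a tab, and a value, with SNR closing each record. Those details, the labels, and the class name are assumptions for illustration rather than the documented GradeMerge format.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative sketch only; uses BufferedReader in place of BBTools' TextFile. */
final class StreamingSummarySketch {

    // Assumed line labels, in the order of the summary columns.
    private static final List<String> LABELS =
            Arrays.asList("real", "user", "sys", "Correct", "Incorrect", "SNR");

    /** Reads label<TAB>value lines on demand and returns the summary lines. */
    static List<String> summarize(BufferedReader reader) {
        List<String> out = new ArrayList<>();
        out.add("#real\tuser\tsys\tcorrect\tincorrect\tSNR");
        StringBuilder row = new StringBuilder();
        Iterator<String> lines = reader.lines().iterator(); // lazy: lines are read as requested
        while (lines.hasNext()) {
            String[] parts = lines.next().split("\t");
            if (parts.length < 2 || !LABELS.contains(parts[0])) {
                continue; // not a metric line
            }
            if (row.length() > 0) {
                row.append('\t');
            }
            row.append(parts[1]); // value copied as-is; this sketch does no time conversion
            if (parts[0].equals("SNR")) { // assumption: SNR is the last metric of a record
                out.add(row.toString());
                row.setLength(0);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "real\t12.450\nuser\t11.230\nsys\t0.890\n"
                + "Correct\t99.72\nIncorrect\t0.28\nSNR\t25.539\n";
        for (String line : summarize(new BufferedReader(new StringReader(input)))) {
            System.out.println(line);
        }
    }
}
```

Only the current line and the row being assembled are held while reading, which is the main appeal of a line-by-line design over loading the whole file first.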

Memory Efficiency

The implementation uses minimal memory allocation strategies:

Parsing Implementation Details

Use Cases

Read Merging Tool Comparison

Primary use case for comparing different read merging tools:

Pipeline Benchmarking

Integration into automated testing workflows:

Research Applications

Supporting bioinformatics research activities:

Technical Notes

Input Requirements

Performance Characteristics

Limitations

Support

For questions and support: