Consect

Script: consect.sh Package: jgi Class: Consect.java

Generates the conservative consensus of multiple error-correction tools. Corrections will be accepted only if all tools agree. This tool is designed for substitutions only, not indel corrections.

Basic Usage

consect.sh in=<file,file,file,...> out=<file>

Consect requires a minimum of 3 input files: the first file must contain the original uncorrected reads, followed by at least two files containing corrected reads from different error-correction tools. All files must have reads in the same order.
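Because Consect relies on every file listing reads in the same order, it can be worth checking the inputs before a run. The following is a minimal sketch of such a check, not part of Consect itself. It assumes uncompressed four-line FASTQ records and assumes that each correction tool preserves the read name up to the first whitespace; the function names are hypothetical.

```python
from contextlib import ExitStack
from itertools import zip_longest


def read_names(lines):
    """Yield the read name of each FASTQ record found in an iterable of lines.

    Assumes four lines per record; the name is the header text after '@'
    up to the first whitespace.
    """
    for line_number, line in enumerate(lines):
        if line_number % 4 == 0:
            header = line.strip()
            if header:  # ignore a trailing blank line
                fields = header[1:].split()
                yield fields[0] if fields else ""


def first_order_mismatch(line_sources):
    """Return (record_index, names) for the first record whose names differ.

    Returns None when every source lists the same names in the same order.
    A source that ends early contributes None for its missing records.
    """
    name_streams = [read_names(lines) for lines in line_sources]
    for index, names in enumerate(zip_longest(*name_streams)):
        if len(set(names)) != 1:
            return index, names
    return None


def first_order_mismatch_in_files(paths):
    """Convenience wrapper that opens the given FASTQ paths as text files."""
    with ExitStack() as stack:
        handles = [stack.enter_context(open(path)) for path in paths]
        return first_order_mismatch(handles)
```

Calling first_order_mismatch_in_files with the same file list that will be passed to in= returns None when the read names line up, or the index and names of the first record that does not match.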

Parameters

Consect accepts standard BBTools parameters and Java memory options. Parameters are organized by their function in the consensus generation process.

Standard Parameters

in=<file,file,file,...>
A comma-delimited list of files; minimum of 3. All files must have reads in the same order. The first file must contain the uncorrected reads. All additional files must contain corrected reads.
out=<file>
Output of consensus reads.
overwrite=f
(ow) Set to false to force the program to abort rather than overwrite an existing file.
cq=f
(changequality) Set to true to update quality scores based on the consensus corrections. When enabled, quality scores are set to the maximum quality from the corrected reads at positions where consensus is achieved.
verbose=f
Set to true to print verbose messages during processing, useful for debugging and monitoring progress.
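The cq behavior described above can be illustrated with a short sketch. This is not Consect's source code. It assumes Phred quality strings in which a higher ASCII character means a higher quality, assumes all reads have equal length, and interprets "positions where consensus is achieved" as positions where every corrected read agrees on a base that differs from the original. The function name is hypothetical.

```python
def update_qualities(original_seq, original_qual, corrected):
    """Return a new quality string for the consensus read.

    corrected is a list of (sequence, quality) pairs, one per correction tool.
    At each position where all corrected sequences agree on a base that
    differs from the original, the quality becomes the maximum quality
    reported by the corrected reads. Other positions keep the original quality.
    """
    new_qual = list(original_qual)
    for position, original_base in enumerate(original_seq):
        bases = {seq[position] for seq, _ in corrected}
        if len(bases) == 1 and original_base not in bases:
            # Comparing characters is enough: ASCII order matches Phred order.
            new_qual[position] = max(qual[position] for _, qual in corrected)
    return "".join(new_qual)
```

With an original read ACGT and qualities II#I, two corrected reads that both report ACCT with qualities II5I and II?I would yield II?I, since ? encodes the higher score.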

Java Parameters

-Xmx
Sets Java's maximum memory usage, overriding autodetection. -Xmx20g specifies 20 GB of RAM, and -Xmx200m specifies 200 MB. The maximum is typically 85% of physical memory.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions.
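As a worked example of the 85% guideline for -Xmx, the sketch below turns an amount of physical memory into a flag value. It is only an illustration of the arithmetic: the helper name is hypothetical, and it assumes the 85% figure applies to total physical memory and that rounding down to a whole unit is acceptable.

```python
def suggested_xmx(physical_bytes, fraction=0.85):
    """Return an -Xmx flag using the given fraction of physical memory.

    Rounds down to whole gigabytes, or to whole megabytes when less than
    one gigabyte is available.
    """
    usable = int(physical_bytes * fraction)
    gigabytes = usable // (1024 ** 3)
    if gigabytes >= 1:
        return f"-Xmx{gigabytes}g"
    megabytes = usable // (1024 ** 2)
    return f"-Xmx{megabytes}m"
```

For a machine with 64 GB of physical memory this gives -Xmx54g, since 64 x 0.85 = 54.4.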

Examples

Basic Consensus Generation

consect.sh in=original.fq,tadpole_corrected.fq,bless_corrected.fq out=consensus.fq

Generate a consensus from the original reads and two error-correction tools (Tadpole and BLESS). Only corrections on which both tools agree are applied.

Consensus with Quality Score Updates

consect.sh in=raw.fq,corrected1.fq,corrected2.fq,corrected3.fq out=consensus.fq cq=t

Generate consensus from three correction tools and update quality scores at corrected positions.

Verbose Mode for Monitoring

consect.sh in=original.fq,spades_ec.fq,karect.fq out=consensus.fq verbose=t

Run with verbose output to monitor processing progress and see detailed statistics.
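When Consect is driven from a pipeline, the command lines above can be assembled programmatically. The helper below is a hypothetical wrapper, not part of BBTools. It uses only the parameters documented on this page, enforces the minimum of one original file plus two corrected files, and assumes consect.sh is on the PATH.

```python
def build_consect_command(original, corrected, out,
                          change_quality=False, verbose=False, overwrite=False):
    """Build the argument list for a consect.sh invocation.

    original is the uncorrected reads file and must come first in in=.
    corrected is a list of at least two corrected reads files.
    """
    if len(corrected) < 2:
        raise ValueError("Consect needs the original file plus at least two corrected files")
    command = ["consect.sh", "in=" + ",".join([original, *corrected]), "out=" + out]
    if overwrite:
        command.append("overwrite=t")
    if change_quality:
        command.append("cq=t")
    if verbose:
        command.append("verbose=t")
    return command
```

The returned list can be passed to subprocess.run(command, check=True) from the Python standard library.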

Algorithm Details

Conservative Consensus Strategy

Consect implements a conservative consensus algorithm designed specifically for substitution corrections. The algorithm processes reads position by position, applying corrections only when all error-correction tools agree on the same base change.
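The strategy described above can be sketched for a single read as follows. This is a minimal illustration, not Consect's actual implementation. The function name is hypothetical, and leaving a read untouched when any corrected copy differs in length (which would indicate an indel) is an assumption made for this sketch.

```python
def conservative_consensus(original, corrected_reads):
    """Return the original read with only unanimously agreed substitutions applied.

    original is the uncorrected sequence; corrected_reads holds the same read
    as emitted by each error-correction tool.
    """
    if len(corrected_reads) < 2:
        raise ValueError("at least two corrected reads are required")
    # Assumption: a length mismatch implies an indel, so keep the original read.
    if any(len(read) != len(original) for read in corrected_reads):
        return original
    consensus = []
    for position, original_base in enumerate(original):
        candidates = {read[position] for read in corrected_reads}
        if len(candidates) == 1:
            # Every tool reports the same base (possibly the original one).
            consensus.append(candidates.pop())
        else:
            # The tools disagree, so the original base is kept.
            consensus.append(original_base)
    return "".join(consensus)
```

For example, with the original read ACGTACGT and corrected reads ACCTACGA and ACCTACGT, only the change at the third base is shared by both tools, so the consensus is ACCTACGT.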

Position-by-Position Analysis

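For each position in a read, the algorithm compares the base reported by every corrected file with the base in the original read. A change is accepted only if all corrected files report the same replacement base; otherwise the original base is kept.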

Quality Score Management

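When the cq=t parameter is enabled, quality scores are set to the maximum quality from the corrected reads at positions where consensus is achieved.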

Performance Characteristics

Statistical Reporting

Consect provides comprehensive statistics including:

Design Limitations

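Important constraints of the consensus algorithm: it handles substitutions only, not indel corrections; all input files must have reads in the same order; and at least three input files are required (the original reads plus two corrected versions).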

Output Statistics

Consect provides detailed statistics about the consensus process:

Correction Metrics

Read-Level Statistics

Best Practices

Input File Preparation

Tool Selection

Parameter Tuning

Support

For questions and support: