TileDump

Script: tiledump.sh Package: hiseq Class: TileDump.java

Processes a tile dump from FilterByTile. This tool can modify tile dimensions, apply quality filters, and write processed tile data for downstream analysis.

Basic Usage

tiledump.sh in=<input file> out=<output file>

Reads a tile dump file, processes it according to the specified parameters, and writes the result to an output file.

Parameters

Parameters are organized by their function in the tile processing workflow.

Standard Parameters

in=<file>
Input dump file containing tile data to be processed.
out=<file>
Output dump file where processed tile data will be written.
overwrite=t
(ow) Set to false to force the program to abort rather than overwrite an existing file.

Processing Parameters

x=-1
Widen tiles to at least this X width. Default -1 (no widening).
y=-1
Widen tiles to at least this Y width. Default -1 (no widening).
reads=-1
Widen tiles to at least this average number of reads. Default -1 (no widening based on reads).
alignedreads=250
Average aligned reads per tile for error rate calibration. Used for statistical calculations.
verbose=f
Set to true to print verbose output during processing.
blur=f
(blurtiles, smoothtiles) Set to true to blur/smooth tiles during processing.

Quality Threshold Parameters

qdeviations=2.4
(qd) Number of standard deviations for quality thresholds. Default 2.4.
udeviations=1.5
(ud) Number of standard deviations for uniqueness thresholds. Default 1.5.
edeviations=3.0
(ed) Number of standard deviations for error-free thresholds. Default 3.0.
pgdeviations=1.4
(pgd) Number of standard deviations for poly-G thresholds. Default 1.4.

Fraction Threshold Parameters

qfraction=0.08
(qf) Quality fraction threshold for tile filtering. Default 0.08.
ufraction=0.01
(uf) Uniqueness fraction threshold for tile filtering. Default 0.01.
efraction=0.2
(ef) Error-free fraction threshold for tile filtering. Default 0.2.
pgfraction=0.2
(pgf) Poly-G fraction threshold for tile filtering. Default 0.2.

Absolute Threshold Parameters

qabsolute=2.0
(qa) Absolute quality threshold for tile filtering. Default 2.0.
uabsolute=1.0
(ua) Absolute uniqueness threshold for tile filtering. Default 1.0.
eabsolute=6.0
(ea) Absolute error-free threshold for tile filtering. Default 6.0.
pgabsolute=0.2
(pga) Absolute poly-G threshold for tile filtering. Default 0.2.

Filtering Parameters

maxbadfraction=0.4
(mbf, mdf, maxdiscardfraction) Maximum fraction of tiles to discard. Default 0.4.
impliederrorrate=0.012
(inferrederrorrate, ier, maxier) Maximum implied error rate threshold. Default 0.012.
inferredquality=
(impliedquality, miniq) Minimum inferred quality score. Converted to error rate using Phred scale.
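
The Phred conversion mentioned for inferredquality follows the standard relation p = 10^(-Q/10). The sketch below shows that relation with awk; the function names phred_to_error and error_to_phred are hypothetical helpers for illustration, not part of tiledump.sh, and the tool's own rounding is not shown in this document.

```shell
# Convert a Phred quality score Q to an error probability: p = 10^(-Q/10).
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.6f\n", exp(-q / 10 * log(10)) }'
}

# Convert an error probability back to a Phred score: Q = -10 * log10(p).
error_to_phred() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", -10 * log(p) / log(10) }'
}

phred_to_error 20      # prints 0.010000
error_to_phred 0.012   # Phred equivalent of the default impliederrorrate
```

Under this relation, inferredquality=20 would correspond to an implied error rate of 0.01, slightly below the default impliederrorrate of 0.012.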

Java Parameters

-Xmx
This will set Java's memory usage, overriding autodetection. -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs. The max is typically 85% of physical memory.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions.
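
As a rough illustration of the 85% guideline for -Xmx, the sketch below derives a flag from a physical memory size given in megabytes. The helper name xmx_for_mb is hypothetical, and the caller is assumed to supply the machine's memory size; the autodetection that tiledump.sh performs by default is not shown in this document.

```shell
# Print an -Xmx flag set to 85% of the given physical memory (in megabytes).
xmx_for_mb() {
    echo "-Xmx$(( $1 * 85 / 100 ))m"
}

xmx_for_mb 16000   # prints -Xmx13600m
```

The resulting flag can be passed alongside the other arguments, for example: tiledump.sh -Xmx13600m in=tiles.dump out=processed_tiles.dump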

Examples

Basic tile dump processing

tiledump.sh in=tiles.dump out=processed_tiles.dump

Process a tile dump file with default parameters.

Widen tiles by dimensions

tiledump.sh in=tiles.dump out=widened_tiles.dump x=100 y=100

Widen tiles to at least 100 in both the X and Y dimensions.

Filter tiles with custom thresholds

tiledump.sh in=tiles.dump out=filtered_tiles.dump \
    qdeviations=3.0 udeviations=2.0 maxbadfraction=0.3

Raise the quality and uniqueness deviation thresholds above their defaults (2.4 and 1.5) and limit discarded tiles to 30%.

Process with verbose output and blurring

tiledump.sh in=tiles.dump out=smooth_tiles.dump \
    verbose=t blur=t alignedreads=500

Process tiles with verbose output, apply smoothing, and calibrate the error rate using an average of 500 aligned reads per tile instead of the default 250.

Algorithm Details

Tile Processing Workflow

TileDump implements a multi-stage tile processing and filtering system using FlowCell and MicroTile data structures for Illumina sequencing data:

1. Tile Widening Strategy

2. Statistical Analysis

The tool calculates statistics for each micro-tile, including the quality, uniqueness, error-free, and poly-G metrics that the threshold parameters act on.

3. Multi-criteria Filtering

Tiles are evaluated against multiple independent criteria: quality, uniqueness, error-free rate, and poly-G content, each with its own set of thresholds.

4. Adaptive Threshold Management

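The filtering system uses a three-tier threshold approach for each metric: a standard-deviation threshold (qdeviations, udeviations, edeviations, pgdeviations), a fraction threshold (qfraction, ufraction, efraction, pgfraction), and an absolute threshold (qabsolute, uabsolute, eabsolute, pgabsolute).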
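
The way the deviation, fraction, and absolute thresholds combine is not spelled out here. The sketch below assumes one plausible rule: a tile's metric is flagged only when its shortfall below the mean exceeds all three thresholds at once, for a metric where lower values are worse. The helper name exceeds_all_tiers, the sample means and standard deviations, and the combination rule itself are assumptions for illustration, not the tool's confirmed behavior.

```shell
# Hypothetical helper: prints "flag" when a tile's metric falls short of the
# mean by more than all three thresholds, otherwise "keep".
# Args: value mean stdev deviations fraction absolute
exceeds_all_tiers() {
    awk -v v="$1" -v mean="$2" -v sd="$3" -v dev="$4" -v frac="$5" -v absmin="$6" '
    BEGIN {
        delta = mean - v    # shortfall below the mean (assumes lower is worse)
        bad = (delta > dev * sd) && (delta > frac * mean) && (delta > absmin)
        print (bad ? "flag" : "keep")
    }'
}

# Quality defaults from this page: qdeviations=2.4 qfraction=0.08 qabsolute=2.0
exceeds_all_tiers 30 35 1.5 2.4 0.08 2.0   # shortfall 5.0 exceeds 3.6, 2.8, and 2.0
exceeds_all_tiers 33 35 1.5 2.4 0.08 2.0   # shortfall 2.0 does not exceed 3.6
```

Requiring all three tiers to agree would keep a tile from being flagged on a statistically large but practically tiny difference. For a metric where higher values are worse, the sign of the shortfall would presumably be reversed.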

5. Maximum Discard Protection

To prevent over-filtering, the system limits the fraction of tiles that can be discarded to maxbadfraction (default 0.4).
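
How the discard limit is enforced is not described here. The sketch below shows one possible interpretation using maxbadfraction: compute a cap as the whole number of tiles allowed by the fraction, then keep only the worst-scoring flagged tiles up to that cap. The helper names discard_cap and apply_cap, the "tile_id score" input format, and the worst-first ordering are all assumptions for illustration.

```shell
# Hypothetical sketch: cap the number of discarded tiles at
# floor(total_tiles * maxbadfraction).
discard_cap() {
    awk -v n="$1" -v f="$2" 'BEGIN { printf "%d\n", int(n * f) }'
}

# Reads "tile_id score" lines for flagged tiles on stdin (higher score = worse)
# and prints the IDs that would still be discarded after applying the cap.
# Args: total_tiles maxbadfraction
apply_cap() {
    cap=$(discard_cap "$1" "$2")
    sort -k2,2nr | head -n "$cap" | awk '{ print $1 }'
}

discard_cap 10 0.4                                        # prints 4
printf '%s\n' 'A 0.9' 'B 0.5' 'C 0.7' | apply_cap 5 0.4   # cap is 2: prints A, then C
```

With 5 tiles and maxbadfraction=0.4, at most 2 tiles are discarded, so the lowest-scoring flagged tile (B) is retained in this sketch.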

6. Error Rate Modeling

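Error rate prediction is calibrated using the average aligned reads per tile (alignedreads), with impliederrorrate or inferredquality setting the maximum implied error rate.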

Performance Characteristics
