LoadReads

Script: loadreads.sh
Package: driver
Class: LoadReads.java

Measures how much memory a sequence file occupies once loaded, by reading its reads into memory and recording memory consumption.

Basic Usage

loadreads.sh in=<file>

Loads all reads from a sequence file into memory and reports detailed memory usage statistics including initial/final memory consumption, memory ratios, and per-read memory overhead.
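The statistics described above can be illustrated with a minimal sketch. It is not the LoadReads source: the class name, the helper methods, and the exact definitions of "memory ratio" and "per-read overhead" used here are assumptions made for illustration, and byte arrays stand in for reads.

```java
import java.util.ArrayList;
import java.util.List;

public class MemoryRatioSketch {

    /** Heap currently in use as reported by the JVM (an approximation). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Assumed definition: memory growth divided by raw payload size. */
    static double memoryRatio(long initialBytes, long finalBytes, long payloadBytes) {
        return (finalBytes - initialBytes) / (double) payloadBytes;
    }

    /** Assumed definition: growth beyond the raw payload, averaged per item. */
    static double perItemOverhead(long initialBytes, long finalBytes,
                                  long payloadBytes, long items) {
        return (finalBytes - initialBytes - payloadBytes) / (double) items;
    }

    public static void main(String[] args) {
        long initial = usedBytes();

        // Stand-in for reads: 100,000 arrays of 150 bases each, all retained.
        List<byte[]> reads = new ArrayList<>();
        long payload = 0;
        for (int i = 0; i < 100_000; i++) {
            byte[] bases = new byte[150];
            reads.add(bases);
            payload += bases.length;
        }

        long fin = usedBytes();
        System.out.println("Initial memory:  " + initial + " bytes");
        System.out.println("Final memory:    " + fin + " bytes");
        System.out.println("Memory ratio:    " + memoryRatio(initial, fin, payload));
        System.out.println("Overhead / read: "
                + perItemOverhead(initial, fin, payload, reads.size()) + " bytes");
    }
}
```

The numbers vary from run to run because the JVM may collect garbage or grow the heap during loading, so the output should be read as an estimate.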

Parameters

Input Parameters

in=file
Input file. Can be in any sequence format supported by BBTools, such as FASTA or FASTQ, optionally gzipped. Required parameter.
lowcomplexity=f
Assume input library is low-complexity. This affects memory estimation calculations by assuming sequences have fewer unique k-mers and may compress better in memory. Default: false.
verbose=f
Print detailed processing information including read batches fetched and returned. Default: false.
earlyexit=f
Exit early during memory estimation calculations. Default: false.
gc=f
Perform garbage collection at the end and report memory usage after GC. Default: false.
overhead=0
Override the overhead parameter in memory estimation calculations. Default: 0 (auto-detect).
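The gc=t option reports memory usage after garbage collection. A minimal sketch of that kind of measurement follows, assuming the standard Runtime API is used for sampling; the class and method names are hypothetical and do not come from LoadReads.

```java
public class GcReportSketch {

    /** Heap currently in use as reported by the JVM. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection, then samples used heap. System.gc() is only a hint. */
    static long usedBytesAfterGc() {
        System.gc();
        return usedBytes();
    }

    public static void main(String[] args) {
        // Roughly 10 MB of short-lived data that becomes garbage below.
        byte[][] temp = new byte[1000][10_000];
        long before = usedBytes();
        temp = null; // drop the only reference so the data is collectable

        long after = usedBytesAfterGc();
        System.out.println("Used before GC: " + before + " bytes");
        System.out.println("Used after GC:  " + after + " bytes");
    }
}
```

Because System.gc() is advisory, the post-GC figure is a best-effort lower estimate of retained memory rather than an exact value.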

Java Parameters

-Xmx
Sets Java's maximum memory usage, overriding autodetection. Examples: -Xmx20g specifies 20 gigabytes of RAM, -Xmx200m specifies 200 megabytes. The maximum is typically 85% of physical memory.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions for potentially faster execution.
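To see how an -Xmx setting surfaces inside the JVM, a small standalone program can print the heap ceiling. This is an illustrative sketch, not part of BBTools; the class name and the toMegabytes helper are hypothetical.

```java
public class HeapLimitSketch {

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Formats a byte count in whole mebibytes using the -Xmx suffix style. */
    static String toMegabytes(long bytes) {
        return (bytes / (1024L * 1024L)) + "m";
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + toMegabytes(maxHeapBytes()));
    }
}
```

Running it as java -Xmx200m HeapLimitSketch should report a value at or slightly below 200m, since some JVMs exclude part of the heap from the reported maximum.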

Examples

Basic Memory Testing

loadreads.sh in=reads.fastq

Tests memory usage of a FASTQ file by loading all reads and reporting memory consumption statistics.

Low-Complexity Library Testing

loadreads.sh in=amplicon_reads.fq lowcomplexity=t

Tests memory usage assuming the input is a low-complexity library (such as amplicon data) which may have different memory characteristics.

With Custom Memory Settings

loadreads.sh -Xmx8g in=large_dataset.fq.gz

Tests memory usage of a large gzipped dataset with an 8 GB maximum heap size.

Algorithm Details

Memory Testing Strategy

LoadReads keeps the loaded reads in an ArrayList<ArrayList<Read>> storage structure and uses ConcurrentReadInputStream for I/O operations.
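The nested-list storage layout can be sketched as follows. This is a simplified illustration under stated assumptions: String stands in for the Read class, a plain Iterator stands in for ConcurrentReadInputStream (whose API is not shown here), and the batch size is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;

public class BatchStorageSketch {

    /** Drains the source in batches and retains every batch so nothing is collectable. */
    static ArrayList<ArrayList<String>> loadAll(Iterator<String> source, int batchSize) {
        ArrayList<ArrayList<String>> storage = new ArrayList<>();
        ArrayList<String> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                storage.add(batch);
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            storage.add(batch); // keep the final, partially filled batch
        }
        return storage;
    }

    /** Total number of stored items across all batches. */
    static long countReads(ArrayList<ArrayList<String>> storage) {
        long total = 0;
        for (ArrayList<String> batch : storage) {
            total += batch.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Iterator<String> source =
                Arrays.asList("ACGT", "GGCA", "TTAG", "CCGA", "ATAT").iterator();
        ArrayList<ArrayList<String>> storage = loadAll(source, 2);
        // prints "3 batches, 5 reads"
        System.out.println(storage.size() + " batches, " + countReads(storage) + " reads");
    }
}
```

Holding a reference to every batch is what makes the measurement meaningful: if batches were discarded after being read, the garbage collector could reclaim them and the reported memory would understate the cost of keeping the file in memory.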

Data Loading Process

Memory Measurement Methodology

Estimation Algorithms

Output Metrics

LoadReads outputs specific memory metrics calculated from tracked variables:

Low-Complexity Optimization

When lowcomplexity=true is specified, the tool adjusts its estimation algorithms to account for:

Performance Characteristics

Technical Notes

Memory Management

Input File Support

Output Interpretation

Support

For questions and support: