Profile

Script: profile.sh Package: utilities

Runs any BBTools Java class with Java Flight Recorder profiling enabled for performance analysis and optimization.

Description

Profile is a wrapper script that executes any BBTools program with Java Flight Recorder (JFR) profiling enabled. JFR is Java's built-in, low-overhead profiling tool; it records detailed performance data while the program runs.

The output .jfr file can be analyzed using Java Mission Control or other JFR-compatible tools to identify performance bottlenecks and optimize BBTools workflows.

Basic Usage

# Profile BBMap alignment
profile.sh align2.BBMap in=reads.fq ref=genome.fa profile=mapping.jfr -Xmx8g

# Profile SAM streaming
profile.sh stream.StreamerWrapper in=alignments.bam profile=streaming.jfr

# Profile dedupe operation
profile.sh jgi.Dedupe in=reads.fq out=deduped.fq profile=dedupe.jfr

Two things are required: the profile=<filename.jfr> parameter and the fully qualified Java class name, which must be the first non-flag argument. Everything else is optional.
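
The argument rules above can be illustrated with a small shell sketch. This is not the actual profile.sh source: the function name classify_args and its output format are hypothetical, and the sketch only shows one way a wrapper could tell the profile= parameter, JVM flags, the class name, and the tool's own arguments apart.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: classify wrapper arguments as the usage rules describe.
# Not taken from profile.sh; names and output format are illustrative only.
classify_args() {
    local profile="" class="" arg
    local jvm_flags=() tool_args=()
    for arg in "$@"; do
        case "$arg" in
            profile=*) profile="${arg#profile=}" ;;   # JFR output file
            maxsize=*) ;;                              # profiling option, ignored in this sketch
            -X*)       jvm_flags+=("$arg") ;;          # JVM flag such as -Xmx8g
            *)
                if [ -z "$class" ]; then
                    class="$arg"                       # first non-flag argument
                else
                    tool_args+=("$arg")                # passed through to the target class
                fi
                ;;
        esac
    done
    if [ -z "$profile" ] || [ -z "$class" ]; then
        echo "error: need profile=<file.jfr> and a class name" >&2
        return 1
    fi
    echo "class=$class"
    echo "profile=$profile"
    echo "jvm=${jvm_flags[*]}"
    echo "args=${tool_args[*]}"
}

classify_args align2.BBMap in=reads.fq ref=genome.fa profile=mapping.jfr -Xmx8g
```

Because the class name is taken from the first argument that is neither a profiling parameter nor a JVM flag, placing a tool argument such as in=reads.fq before the class name would be misread as the class in this sketch.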

Parameters

Profiling parameters

profile=<file>
Output JFR file (required). This file contains the profiling data and can be opened in Java Mission Control for analysis.
maxsize=<size>
Maximum size of the JFR recording file (default: 2g). When this limit is reached, older events are discarded. Use larger values for long-running analyses.
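
For context, the JVM's standard option for starting a recording at launch is -XX:StartFlightRecording, which accepts filename and maxsize sub-options. The sketch below shows how the profile= and maxsize= values could map onto that option. Whether profile.sh builds exactly this string is an assumption, and the helper name jfr_flag is hypothetical.

```shell
# Hypothetical helper: map profile=/maxsize= values onto the JVM's
# -XX:StartFlightRecording option. Assumption: profile.sh may build
# its flag differently; this only illustrates the mapping.
jfr_flag() {
    local file="$1"
    local maxsize="${2:-2g}"   # 2g mirrors the documented default
    echo "-XX:StartFlightRecording=filename=${file},maxsize=${maxsize}"
}

jfr_flag mapping.jfr       # uses the default size limit
jfr_flag mapping.jfr 8g    # larger limit for a long-running analysis
```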

Java parameters

-Xmx<size>
Java heap size (optional, default: 2g). Example: -Xmx8g for 8 gigabytes. This is passed to the JVM that runs the target BBTools program.
<classname>
Fully qualified Java class to run (required, first non-flag argument). Examples: align2.BBMap, jgi.Dedupe, stream.StreamerWrapper
<arguments>
All other parameters are passed directly to the target class. Use the same parameters you would normally use with that tool.

Examples

Profile read mapping

profile.sh align2.BBMap in=reads.fq ref=genome.fa out=mapped.sam profile=bbmap_profile.jfr -Xmx16g

Profiles BBMap alignment with a 16 GB heap, capturing CPU and memory usage patterns during mapping.

Profile k-mer counting

profile.sh jgi.KmerCountExact in=reads.fq khist=histogram.txt profile=kmer_profile.jfr

Profiles exact k-mer counting to identify performance bottlenecks in hash table operations.

Profile deduplication

profile.sh jgi.Dedupe in=reads.fq out=deduped.fq profile=dedupe_profile.jfr -Xmx12g

Profiles duplicate sequence removal to help tune memory and thread usage.

Analyzing Results

The output .jfr file can be analyzed using several tools:

View summary with jfr tool

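# Print a summary of the recorded event types
jfr summary profile.jfr

# Print CPU execution samples
jfr print --events jdk.ExecutionSample profile.jfr

# Export CPU and allocation samples as JSON (jdk.ObjectAllocationSample needs JDK 16 or later)
jfr print --json --events jdk.ExecutionSample,jdk.ObjectAllocationSample profile.jfr > report.json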
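
Before inspecting a recording, it can help to confirm that the file was actually written and that the jfr tool is available. The following is a minimal sketch of such a guard. The function name summarize_recording and its return codes are hypothetical and not part of BBTools.

```shell
# Hypothetical guard around `jfr summary`; name and return codes are illustrative.
summarize_recording() {
    local file="$1"
    if [ ! -s "$file" ]; then                   # missing or empty recording
        echo "no usable recording at: $file" >&2
        return 1
    fi
    if ! command -v jfr >/dev/null 2>&1; then   # jfr is bundled with recent JDKs
        echo "jfr tool not found on PATH" >&2
        return 2
    fi
    jfr summary "$file"                         # list event types and counts
}
```

A call such as summarize_recording bbmap_profile.jfr then either prints the summary or reports which precondition failed.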

Notes