JavaSetup
Parses Java command-line arguments and sets up paths for the BBTools execution environment. This script configures JVM parameters, memory allocation, and environment-specific paths for BBTools.
Basic Usage
source javasetup.sh [parameters]
# Or get Java command directly:
javasetup.sh [parameters]
This script is typically sourced by other BBTools scripts to configure the Java environment. It can also be run directly to output the complete Java command with all configured parameters.
Parameters
Parameters control Java Virtual Machine settings, memory allocation, and environment-specific configurations.
Memory Configuration
- --mem=4g
- Default memory size for automatic memory detection. Used when no explicit Xmx is set. Default: 4g
- --percent=84
- Percentage of available system memory to use for automatic memory detection. Default: 84
- --mode=auto
- Memory allocation mode. Options: auto (automatic detection), partial, fixed. Default: auto
- Xmx=size
- Maximum heap memory size. Accepts formats like 8g, 4096m, etc. Sets -Xmx flag
- xmx=size
- Alternative format for maximum heap memory size (case-insensitive)
- -Xmx=size
- Standard Java format for maximum heap memory size
- -xmx=size
- Alternative standard format for maximum heap memory size
- Xms=size
- Initial heap memory size. Sets -Xms flag. If only Xmx is set, Xms automatically matches Xmx
- -Xms=size
- Standard Java format for initial heap memory size
JVM Configuration
- -ea
- Enable assertions in the JVM. This is the default.
- ea
- Enable assertions (alternative format)
- -da
- Disable assertions in the JVM
- da
- Disable assertions (alternative format)
- ExitOnOutOfMemoryError
- Enable JVM to exit when out of memory occurs. Sets -XX:+ExitOnOutOfMemoryError
- exitonoutofmemoryerror
- Alternative format for enabling exit on out of memory (case-insensitive)
- eoom
- Short form for ExitOnOutOfMemoryError
- -ExitOnOutOfMemoryError
- Standard Java format for enabling exit on out of memory
- -exitonoutofmemoryerror
- Alternative standard format for exit on out of memory
- -eoom
- Standard format short form for ExitOnOutOfMemoryError
Performance Parameters
- simd
- Enable SIMD (Single Instruction, Multiple Data) vector operations. Adds --add-modules jdk.incubator.vector
- SIMD
- Enable SIMD vector operations (case-insensitive)
- simd=t
- Enable SIMD vector operations (explicit true)
- simd=true
- Enable SIMD vector operations (explicit true)
Output Control
- json
- Enable JSON output format. Sets internal json flag to 1
- json=t
- Enable JSON output format (explicit true)
- json=true
- Enable JSON output format (explicit true)
- format=json
- Set output format to JSON
- silent
- Enable silent mode, suppressing non-essential output. Sets internal silent flag to 1
- silent=t
- Enable silent mode (explicit true)
- silent=true
- Enable silent mode (explicit true)
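The spellings listed above all map onto two on/off flags. A minimal sketch of that mapping follows, assuming the flags are plain shell variables that start at 0; the function name parse_output_flags is hypothetical and is not taken from the script.

```shell
# Hypothetical sketch: recognise the json and silent spellings listed above.
# Starting both flags at 0 is an assumption; the document only states they are set to 1.
json=0
silent=0

parse_output_flags() {
    for arg in "$@"; do
        case "$arg" in
            json|json=t|json=true|format=json) json=1 ;;
            silent|silent=t|silent=true)       silent=1 ;;
            *) ;;   # anything else is left for other parsers
        esac
    done
}

parse_output_flags in=reads.fq format=json
echo "json=$json silent=$silent"
```

Arguments that are not output flags are simply ignored here, so the same argument list can be passed on to other parsing steps.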
Environment Detection
The script automatically detects and configures paths for specific execution environments:
Supported Environments
- Shifter Container
- Detected via SHIFTER_RUNTIME=1 environment variable. Configures container-specific settings
- AWS Environment
- Detected via EC2_HOME environment variable. Adds paths for:
- /test1/binaries/bgzip
- /test1/binaries/lbzip2/bin
- /test1/binaries/sambamba
- /test1/binaries/pigz2/pigz-2.4
- NERSC Environment
- Detected via NERSC_HOST environment variable. Adds paths for:
- /global/cfs/cdirs/bbtools/bgzip
- /global/cfs/cdirs/bbtools/lbzip2/bin
- /global/cfs/cdirs/bbtools/samtools116/samtools-1.16.1
- /global/cfs/cdirs/bbtools/java/jdk-17/bin
- /global/cfs/cdirs/bbtools/pigz2/pigz-2.4
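The detection rules above can be sketched as a chain of environment-variable checks. This is an illustration only: the function name detect_environment, the labels it prints, and the order of the checks are assumptions, not the script's actual code.

```shell
# Hypothetical sketch of environment detection; the real script may check
# these variables in a different order or combine them differently.
detect_environment() {
    if [ "${SHIFTER_RUNTIME:-}" = "1" ]; then
        echo "shifter"
    elif [ -n "${EC2_HOME:-}" ]; then
        echo "aws"
    elif [ -n "${NERSC_HOST:-}" ]; then
        echo "nersc"
    else
        echo "generic"
    fi
}

echo "Detected environment: $(detect_environment)"
```

The ${VAR:-} form keeps the checks safe even when a variable is unset and the calling shell runs with set -u.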
Examples
Basic Memory Configuration
# Set 8GB maximum heap memory
javasetup.sh Xmx=8g
# Set both initial and maximum heap memory
javasetup.sh Xms=2g Xmx=8g
Configures JVM with specific memory allocation.
Performance Optimization
# Enable SIMD operations for better performance
javasetup.sh simd Xmx=8g
# Configure for high-memory analysis with error handling
javasetup.sh Xmx=32g eoom simd
Optimizes JVM settings for high-performance computing.
Automatic Memory Detection
# Use 90% of available system memory
javasetup.sh --percent=90
# Set default memory size for automatic detection
javasetup.sh --mem=8g --percent=80
Configures automatic memory detection based on system resources.
Sourcing in Other Scripts
# Source in BBTools script for environment setup
source "$SCRIPT_DIR/javasetup.sh"
# Get Java command for execution
JAVA_CMD=$(javasetup.sh Xmx=8g simd)
$JAVA_CMD -cp "$CP" my.package.MainClass "$@"
Integration pattern used by other BBTools scripts.
Algorithm Details
Memory Configuration Strategy
The script manages JVM memory settings through these mechanisms:
- Automatic Xms/Xmx Synchronization: When only one memory parameter is set, the script sets the other to the same value, so the heap starts at its maximum size and is not resized at runtime
- System Memory Detection: Uses memdetect.sh to determine available system memory when no explicit Xmx is provided
- Percentage-based Allocation: Calculates the heap size as a percentage of available system memory
- Mode-specific Behavior: Supports auto, partial, and fixed memory allocation modes for different use cases
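The Xms/Xmx synchronization described in this list might look roughly like the following. The helper sync_heap_flags is hypothetical; it only illustrates the "fill in the missing value" rule and is not the script's own code.

```shell
# Hypothetical helper: if only one heap size is given, reuse it for the other.
# $1 = maximum heap size (e.g. 8g) or empty, $2 = initial heap size or empty.
sync_heap_flags() {
    max="$1"
    init="$2"
    if [ -z "$max" ] && [ -z "$init" ]; then
        return 1                      # nothing to synchronise
    fi
    if [ -z "$init" ]; then init="$max"; fi
    if [ -z "$max" ]; then max="$init"; fi
    printf '%s %s\n' "-Xmx$max" "-Xms$init"
}

sync_heap_flags 8g ""     # only Xmx given
sync_heap_flags 8g 2g     # both given, left unchanged
```

When both sizes are supplied they are passed through untouched, which matches the "Set both initial and maximum heap memory" example earlier in this document.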
Environment Path Management
The script provides environment-specific PATH modifications to ensure optimal tool availability:
- Container Detection: Identifies containerized environments (Shifter) and adjusts configuration accordingly
- HPC Integration: Configures paths for High Performance Computing environments like NERSC with pre-installed optimized tools
- Cloud Platform Support: Provides AWS-specific path configurations for cloud-based bioinformatics workflows
- Tool Chain Coordination: Ensures compatibility with external tools like bgzip, lbzip2, sambamba, and pigz for efficient data processing
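As an illustration of environment-specific PATH changes, the sketch below adds a directory only when it exists and is not already listed. The helper add_to_path is hypothetical, and appending (rather than prepending) is an assumption; the document does not say which the script does.

```shell
# Hypothetical helper: extend PATH with a directory if it exists and is not
# already present. Appending is an assumption; the real script may prepend.
add_to_path() {
    dir="$1"
    [ -d "$dir" ] || return 0          # skip directories that do not exist
    case ":$PATH:" in
        *":$dir:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
}

# Example with one of the NERSC paths listed earlier; a no-op on other systems.
add_to_path /global/cfs/cdirs/bbtools/bgzip
```

Guarding on the directory's existence keeps the call harmless on machines where the listed locations are absent.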
Parameter Processing Logic
The parseJavaArgs() function loops over the arguments one at a time, taking each key with shell parameter expansion (${arg%%=*}) and each value with cut:
- Format Flexibility: Accepts multiple formats for the same parameter (Xmx, -Xmx, xmx, -xmx)
- Case Insensitivity: Handles various case combinations for user convenience
- Precedence Rules: Tool-specific memory settings are processed before general JVM settings
- State Tracking: Maintains flags to prevent conflicting configurations and ensure consistent setup
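The key/value splitting described here can be sketched as follows. This is not the real parseJavaArgs(): the function name parse_args_sketch is hypothetical, the values stored in the globals are assumed, and lowercasing every key is a simplification that accepts more spellings than the parameter list documents.

```shell
# Hypothetical sketch of key=value parsing; not the real parseJavaArgs().
parse_args_sketch() {
    XMX=""; XMS=""; EA="-ea"; EOOM=""; SIMD=""
    for arg in "$@"; do
        key="${arg%%=*}"                              # text before the first '='
        case "$arg" in
            *=*) val="$(printf '%s' "$arg" | cut -d= -f2-)" ;;   # text after the first '='
            *)   val="" ;;
        esac
        # Normalise the key: drop one leading '-' and lowercase it (simplification).
        key="$(printf '%s' "${key#-}" | tr '[:upper:]' '[:lower:]')"
        case "$key" in
            xmx) XMX="-Xmx$val" ;;
            xms) XMS="-Xms$val" ;;
            ea)  EA="-ea" ;;
            da)  EA="-da" ;;
            eoom|exitonoutofmemoryerror) EOOM="-XX:+ExitOnOutOfMemoryError" ;;
            simd)
                case "$val" in
                    ""|t|true) SIMD="--add-modules jdk.incubator.vector" ;;
                esac
                ;;
        esac
    done
}

parse_args_sketch Xmx=8g simd
echo "$XMX $SIMD"    # prints: -Xmx8g --add-modules jdk.incubator.vector
```

Unrecognised arguments fall through both case statements untouched, so tool-specific parameters can share the same command line.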
Integration Architecture
The script is designed as a foundational component for the BBTools ecosystem:
- Source vs Execute: Can be sourced for environment setup or executed directly for command generation
- Memory Detection Dependency: Leverages memdetect.sh for accurate system memory assessment
- Global Variable Management: Uses well-defined global variables (XMX, XMS, EA, EOOM, SIMD) for state sharing
- Portable Design: Written for maximum shell compatibility across different Unix-like systems
Performance Considerations
The script optimizes Java performance through:
- SIMD Support: Enables Java vector operations for computational bioinformatics algorithms
- Memory Efficiency: Matches initial and maximum heap sizes to reduce GC overhead
- Error Handling: Configures JVM to exit cleanly on memory errors rather than hanging
- Assertion Control: Provides fine-grained control over Java assertion behavior for debugging vs production
Implementation Notes
Script Structure
The script follows a modular design pattern:
- parseJavaArgs(): Processes command-line arguments and sets global variables
- setEnvironment(): Configures environment-specific PATH modifications
- getJavaCommand(): Combines parsed arguments into complete Java command
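To show how the last step could combine the parsed settings, here is a rough sketch. The function build_java_command is hypothetical, and both the flag order and the assumption that each global already holds a complete JVM flag are illustrative only.

```shell
# Hypothetical sketch of assembling a Java command from parsed settings.
# Assumes each variable is either empty or holds a complete JVM flag.
build_java_command() {
    cmd="java"
    for flag in "$EA" "$EOOM" "$XMX" "$XMS" "$SIMD"; do
        if [ -n "$flag" ]; then
            cmd="$cmd $flag"           # skip settings that were never set
        fi
    done
    printf '%s\n' "$cmd"
}

EA="-ea"; EOOM=""; XMX="-Xmx8g"; XMS="-Xms8g"; SIMD=""
build_java_command    # prints: java -ea -Xmx8g -Xms8g
```

Because the result is a single string, a caller has to expand it unquoted (as in the sourcing example earlier) so the shell splits it back into separate arguments.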
Memory Detection Integration
Depends on memdetect.sh for:
- System memory detection across different operating systems
- Available memory calculation considering system reserves
- Percentage-based memory allocation
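The percentage step is plain integer arithmetic. The sketch below shows only that arithmetic; how memdetect.sh actually measures available memory is not described in this document, so the helper heap_from_percent and its megabyte input are assumptions.

```shell
# Hypothetical arithmetic only; memdetect.sh's real detection logic is not shown here.
# $1 = available memory in megabytes (assumed to be known already), $2 = percent to use.
heap_from_percent() {
    avail_mb="$1"
    percent="$2"
    heap_mb=$(( avail_mb * percent / 100 ))   # integer division rounds down
    printf '%sm\n' "$heap_mb"
}

heap_from_percent 16384 84    # prints: 13762m
```

With the default of 84 percent, 16384 MB of available memory would yield a 13762 MB heap under this sketch.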
Global Variables
The script maintains these global state variables:
- XMX: Maximum heap memory setting
- XMS: Initial heap memory setting
- EA: Assertion enable/disable flag
- EOOM: Exit on out of memory error setting
- SIMD: Vector operations module setting
- json: JSON output format flag
- silent: Silent mode flag
Version History
Current version 1.0.3 includes contributions from:
- Brian Bushnell (primary author)
- Doug Jacobsen (environment integration)
- Alex Copeland (memory management)
- Bryce Foster (performance optimization)
- Claude AI (documentation and refinement)
Support
For questions and support:
- Email: bbushnell@lbl.gov
- Documentation: bbmap.org
- GitHub: BBTools Repository