CalcMem
Calculates available system memory in megabytes using /proc/meminfo parsing and ulimit checks. Parses Java memory parameters (Xmx/Xms) and runtime flags. Used internally by BBTools scripts to determine memory allocation via freeRam() and parseXmx() functions.
Basic Usage
calcmem.sh [parameters]
Other BBTools scripts source calcmem.sh and call its functions rather than running it as a standalone tool. When executed directly, it runs the freeRam() function to calculate available memory and parseXmx() to process Java flags.
Parameters
The parseXmx() function processes command-line arguments using pattern matching to extract Java memory flags and runtime parameters.
Memory Parameters
- Xmx=<size>
- Sets maximum Java heap memory size. Accepts values like '8g', '4096m', or '4194304k'. Can be specified with or without leading dash (-Xmx).
- xmx=<size>
- Alternative lowercase format for maximum heap memory. Equivalent to Xmx parameter.
- Xms=<size>
- Sets initial Java heap memory size. If only Xmx is specified, Xms is automatically set to the same value for consistent performance.
- xms=<size>
- Alternative lowercase format for initial heap memory. Equivalent to Xms parameter.
Java Runtime Parameters
- ea
- Enable Java assertions. Useful for debugging and development but may impact performance in production.
- da
- Disable Java assertions. Removes assertion overhead, which suits production runs where performance matters.
- ExitOnOutOfMemoryError
- Enables -XX:+ExitOnOutOfMemoryError JVM flag. Forces JVM to exit immediately when OutOfMemoryError occurs instead of attempting recovery. Can also be specified as 'exitonoutofmemoryerror' or 'eoom'.
Output and Performance Parameters
- json
- Output memory information in JSON format. Can be specified as 'json=t', 'json=true', or 'format=json'.
- silent
- Suppress informational output during memory calculation. Can be specified as 'silent=t' or 'silent=true'.
- simd
- Enable SIMD (Single Instruction, Multiple Data) vectorization support by adding '--add-modules jdk.incubator.vector' to Java command line. Improves performance on compatible processors. Can be specified as 'simd=t' or 'simd=true'.
Environment Variables
- RQCMEM
- Manual memory override in megabytes. When set to a value greater than 0, forces memory calculation to use this value instead of system detection.
- SLURM_MEM_PER_NODE
- Automatically detected SLURM memory allocation per node in megabytes. Used on SLURM-managed clusters to respect job memory limits.
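A minimal sketch of how the RQCMEM override could take precedence over system detection. The helper name pick_memory_kb and its argument are hypothetical and are not part of calcmem.sh; the megabyte-to-kilobyte conversion is an assumption based on the rest of the calculation working in kilobytes.

```shell
# Hypothetical helper, not part of calcmem.sh.
# $1 = memory found by system detection, in KB.
pick_memory_kb() {
    local detected_kb=$1
    local manual_mb="${RQCMEM:-0}"          # unset counts as 0 (no override)
    if [ "$manual_mb" -gt 0 ]; then
        echo $(( manual_mb * 1024 ))        # assumption: MB converted to KB
    else
        echo "$detected_kb"
    fi
}
```

With RQCMEM unset the detected figure passes through unchanged; with RQCMEM=16000 the function prints 16384000 regardless of what was detected.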
Examples
Memory Calculation
# Calculate available memory (typically called by other scripts)
calcmem.sh
# With specific heap size
calcmem.sh Xmx=8g
# Enable assertions and JSON output
calcmem.sh ea json
These examples show direct usage, though calcmem.sh is usually invoked internally by other BBTools scripts.
Integration with BBTools
# BBTools scripts automatically use calcmem.sh
bbduk.sh in=reads.fq out=clean.fq # calcmem.sh calculates available memory
# Override memory manually
bbduk.sh in=reads.fq out=clean.fq Xmx=16g # calcmem.sh processes this setting
Most BBTools scripts source calcmem.sh to calculate memory allocation via freeRam() and parse Java flags via parseXmx().
Algorithm Details
Memory Detection Strategy
The freeRam() function estimates available memory from the following sources:
System Memory Sources
- Virtual Memory (vfree): Parses /proc/meminfo using awk pattern matching for CommitLimit and Committed_AS fields, calculates vfree = (CommitLimit - Committed_AS)
- Physical Memory (pfree): Parses /proc/meminfo for MemFree, Cached, and Buffers fields, calculates pfree = (MemFree + Cached + Buffers)
- Process Limits (ulimit): Executes ulimit -v command to retrieve process virtual memory limit, handles "unlimited" string by setting to 0
- SLURM Integration (slurm_x): Reads SLURM_MEM_PER_NODE environment variable, converts to kilobytes via slurm_x = (SLURM_MEM_PER_NODE * 1024)
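The two /proc/meminfo calculations above can be sketched as follows. The function names and the optional file argument are illustrative and do not appear in calcmem.sh; the file argument exists only so the parsing can be tried against a saved copy of /proc/meminfo.

```shell
# Sketch only: names are illustrative. Both functions print kilobytes,
# the unit /proc/meminfo reports.

# vfree = CommitLimit - Committed_AS
virtual_free_kb() {
    awk -F: '/^CommitLimit:/  { total = $2 + 0 }
             /^Committed_AS:/ { used = $2 + 0 }
             END { printf "%.0f\n", total - used }' "${1:-/proc/meminfo}"
}

# pfree = MemFree + Cached + Buffers
physical_free_kb() {
    awk -F: '/^MemFree:/ { free = $2 + 0 }
             /^Cached:/  { cached = $2 + 0 }
             /^Buffers:/ { buffers = $2 + 0 }
             END { printf "%.0f\n", free + cached + buffers }' "${1:-/proc/meminfo}"
}
```

Adding 0 to the field forces awk to read the leading number and ignore the trailing "kB". The ^ anchor keeps the Cached pattern from also matching the SwapCached line.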
Memory Calculation Logic
The script implements memory selection using conditional logic in lines 159-191:
- Dual-source comparison: If both vfree > 0 and pfree > 0, selects x2 = min(vfree, pfree)
- Scheduler override: If SLURM_MEM_PER_NODE > 0 and (x2 > slurm_x or x2 == 0), forces x = slurm_x
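- Limit enforcement: x starts as the ulimit -v value. If that limit is "unlimited" (stored as 0) or larger than x2, sets x = x2, so the smaller of the process limit and the detected memory is used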
- Final calculation: RAM = ((x - 500000) * mult / 100) / 1024, where mult = 84% by default
- Fallback value: When x < 1, uses defaultMem = 3200000 KB (3.2 GB)
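The selection steps can be sketched as one function. This is an illustration under stated assumptions, not the code of freeRam(): the name select_ram_mb and its argument order are hypothetical, the process limit is assumed to act as a cap on the detected value, and the fallback is assumed to be reported as defaultMem converted to megabytes.

```shell
# Sketch only: select_ram_mb is a hypothetical name.
# Arguments: vfree KB, pfree KB, ulimit KB (0 = unlimited), SLURM MB, [mult]
select_ram_mb() {
    local vfree=$1 pfree=$2 limit=$3 slurm_mb=$4 mult=${5:-84}
    local default_kb=3200000
    local x=$limit x2=0

    # Dual-source comparison: use the smaller estimate when both are valid.
    if [ "$vfree" -gt 0 ] && [ "$pfree" -gt 0 ]; then
        x2=$(( vfree < pfree ? vfree : pfree ))
    fi

    # Scheduler override: the SLURM allocation replaces a larger or missing estimate.
    local slurm_x=$(( slurm_mb * 1024 ))
    if [ "$slurm_x" -gt 0 ] && { [ "$x2" -gt "$slurm_x" ] || [ "$x2" -eq 0 ]; }; then
        x=$slurm_x
        x2=$slurm_x
    fi

    # Assumption: an unlimited or larger process limit yields to the estimate.
    if [ "$x" -lt 1 ] || [ "$x" -gt "$x2" ]; then
        x=$x2
    fi

    if [ "$x" -lt 1 ]; then
        echo $(( default_kb / 1024 ))                  # assumption: fallback in MB
    else
        echo $(( (x - 500000) * mult / 100 / 1024 ))   # reserve, scale, KB to MB
    fi
}

select_ram_mb 9000000 7000000 0 0        # prints 5332
select_ram_mb 9000000 7000000 0 4000     # prints 2949 (capped by SLURM)
```

Keeping every quantity in kilobytes until the last division means the comparisons never mix units.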
Environment-Specific Configuration
The setEnvironment() function (lines 73-102) configures PATH variables based on environment detection:
- NERSC Systems: When NERSC_HOST is set, prepends /global/cfs/cdirs/bbtools/[tool]/bin paths for bgzip, lbzip2, samtools-1.16.1, java/jdk-17, and pigz-2.4
- AWS Taxonomy Server: When EC2_HOME is set, adds /test1/binaries/[tool] paths for bgzip, lbzip2, sambamba, and pigz2
- Shifter Containers: When SHIFTER_RUNTIME=1, sets shifter=1 flag and ignores NERSC_HOST detection
- Default Systems: No PATH modifications when none of the above conditions match
Parameter Processing
The parseXmx() function (lines 11-71) uses pattern matching to handle argument variations:
- Memory Format Parsing: Recognizes Xmx/xmx patterns with optional leading dash, extracts value using ${arg:N} substring operations
- Automatic Xms/Xmx Synchronization: When setxmx=1 and setxms=0, extracts substring after 'x' and creates z2="-Xms$substring" (lines 59-62)
- Assertion Flag Handling: Converts "da"/"ea" to "-da"/"-ea", stores in EA variable for Java command line
- SIMD Flag Processing: Maps "simd"/"SIMD" patterns to SIMD="--add-modules jdk.incubator.vector" string
- OutOfMemoryError Handling: Converts "eoom"/"exitonoutofmemoryerror" to EOOM="-XX:+ExitOnOutOfMemoryError"
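A simplified sketch of this kind of argument handling. It is not the code of parseXmx(): the function name parse_java_flags and the XMX and XMS variables are hypothetical, and prefix-stripping expansions stand in for the ${arg:N} substring operations. EA, EOOM, and SIMD follow the variable names described above.

```shell
# Sketch only: parse_java_flags, XMX and XMS are hypothetical names.
parse_java_flags() {
    XMX="" XMS="" EA="" EOOM="" SIMD=""
    local arg
    for arg in "$@"; do
        arg="${arg#-}"                           # tolerate one leading dash
        case "$arg" in
            [Xx]mx=*) XMX="-Xmx${arg#*=}" ;;     # Xmx=8g
            [Xx]mx*)  XMX="-Xmx${arg#?mx}" ;;    # Xmx8g
            [Xx]ms=*) XMS="-Xms${arg#*=}" ;;
            [Xx]ms*)  XMS="-Xms${arg#?ms}" ;;
            ea|da)    EA="-$arg" ;;
            ExitOnOutOfMemoryError|exitonoutofmemoryerror|eoom)
                      EOOM="-XX:+ExitOnOutOfMemoryError" ;;
            simd|SIMD|simd=t|simd=true)
                      SIMD="--add-modules jdk.incubator.vector" ;;
        esac
    done
    # Xmx given without Xms: reuse the same size for the initial heap.
    if [ -n "$XMX" ] && [ -z "$XMS" ]; then
        XMS="-Xms${XMX#-Xmx}"
    fi
}
```

After parse_java_flags Xmx=8g ea, XMX holds -Xmx8g, XMS holds -Xms8g, and EA holds -ea. The more specific patterns with "=" are listed first because case stops at the first match.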
Technical Notes
Memory Unit Conversion
The freeRam() function (lines 112-125) reads the unit suffix of its defaultMem value with a case statement and normalizes the value to kilobytes:
- Gigabyte conversion: Pattern "*g" triggers defaultMem=$(( $defaultMem * $(( 1024 * 1024 )) )) for KB conversion
- Megabyte conversion: Pattern "*m" triggers defaultMem=$(( $defaultMem * 1024 )) for KB conversion
- Kilobyte handling: Pattern "*k" uses cut -d'k' -f 1 to extract numeric value, no multiplication needed
- Raw numbers: Values without suffix are treated as kilobytes by default
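The suffix handling can be sketched as a small standalone function. The name to_kb is illustrative and does not exist in calcmem.sh, and the ${value%g} suffix removal is a substitute for the cut-based extraction mentioned above.

```shell
# Sketch only: to_kb is an illustrative name.
# Prints the given size in kilobytes.
to_kb() {
    local value=$1
    case "$value" in
        *g) echo $(( ${value%g} * 1024 * 1024 )) ;;   # gigabytes to KB
        *m) echo $(( ${value%m} * 1024 )) ;;          # megabytes to KB
        *k) echo "${value%k}" ;;                      # already KB, drop suffix
        *)  echo "$value" ;;                          # no suffix: treated as KB
    esac
}

to_kb 8g          # prints 8388608
to_kb 4096m       # prints 4194304
to_kb 3200000     # prints 3200000
```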
Platform Compatibility
- Linux Systems: Reads /proc/meminfo using cat and awk for CommitLimit, Committed_AS, MemFree, Cached, Buffers
- SLURM Clusters: Checks SLURM_MEM_PER_NODE environment variable, which replaces the detected value when the allocation is smaller or when detection returns nothing
- Shifter Containers: Detects SHIFTER_RUNTIME=1, bypasses NERSC_HOST path configuration
- Genepool Systems: Special case when HOSTNAME matches "genepool*" pattern, forces 3.2GB default
- Systems without /proc/meminfo: Falls back to 3.2GB default when memory detection fails
Memory Safety Parameters
The final calculation RAM = ((x - 500000) * 84 / 100) / 1024 combines two safety mechanisms with a unit conversion, all in integer arithmetic:
- 500MB reservation (500000 KB): Subtracts 500MB before percentage calculation to preserve system memory
- 84% utilization factor: Multiplies the remainder by mult / 100, where mult defaults to 84 and can be overridden by the second argument to freeRam()
- KB to MB conversion: Final division by 1024 converts result from kilobytes to megabytes
- Integer arithmetic: Uses $(( )) bash arithmetic for all calculations, avoiding floating point
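A worked example of the formula for x = 16000000 KB. The function name ram_mb is hypothetical; only the arithmetic comes from the description above.

```shell
# Sketch only: ram_mb is a hypothetical name.
# $1 = available memory in KB, $2 = optional percentage (default 84).
ram_mb() {
    local x=$1 mult=${2:-84}
    echo $(( (x - 500000) * mult / 100 / 1024 ))
}

# 16000000 - 500000 = 15500000; * 84 / 100 = 13020000; / 1024 = 12714 (truncated)
ram_mb 16000000       # prints 12714
# Same input at 50%: 15500000 * 50 / 100 = 7750000; / 1024 = 7568 (truncated)
ram_mb 16000000 50    # prints 7568
```

Because each division truncates, the result is always rounded down, which errs on the side of requesting slightly less memory.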
Support
For questions and support:
- Email: bbushnell@lbl.gov
- Documentation: bbmap.org