MuxByName

Script: muxbyname.sh
Package: driver
Class: RenameAndMux.java

Renames reads according to the file they came from, then multiplexes the reads from multiple files into a single output. The opposite of demuxbyname.

Basic Usage

muxbyname.sh in=<file,file,file...> out=<output file>

Input files may also be given without an in= prefix, allowing wildcards:

muxbyname.sh *.fastq out=muxed.fastq
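The shell expands the wildcard before muxbyname.sh starts, so the tool receives one argument per matching file. As a rough illustration, the sketch below builds the comma-separated in= list that should correspond to such a call. join_inputs is a hypothetical helper, not part of BBTools, the file names are placeholders, and the command is only printed rather than run.

```shell
# join_inputs: hypothetical helper (not part of BBTools).
# Joins its arguments with commas, the form the in= parameter expects.
join_inputs() {
    list=""
    for f in "$@"; do
        if [ -z "$list" ]; then
            list="$f"
        else
            list="$list,$f"
        fi
    done
    printf '%s\n' "$list"
}

# Dry run: print the explicit form of a wildcard call.
# a.fastq, b.fastq and c.fastq are placeholder names.
echo "muxbyname.sh in=$(join_inputs a.fastq b.fastq c.fastq) out=muxed.fastq"
```

One thing to watch with wildcards: an output name such as muxed.fastq also matches *.fastq, so a second run in the same directory would pass the previous output as an input. Writing the output to another directory or giving it a different extension avoids this.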

Parameters

Parameters are organized into logical groups based on their function in the multiplexing process. MuxByName supports both single-end and paired-end reads with automatic format detection.

Standard parameters

in=<file,file>
A list of input files separated by commas. Can specify multiple FASTQ or FASTA files to be multiplexed together. Files can be gzipped.
in2=<file,file>
Read 2 input if reads are in paired files. Specify corresponding mate files for paired-end sequencing data.
out=<file>
Primary output, or read 1 output. All multiplexed reads will be written to this file with renamed headers.
out2=<file>
Read 2 output if reads are in paired files. Required when processing paired-end data with separate mate files.
overwrite=f
(ow) Set to false to force the program to abort rather than overwrite an existing file. Default: false (will not overwrite).
showspeed=t
(ss) Set to 'f' to suppress display of processing speed. Shows reads processed per second during execution. Default: true.
ziplevel=2
(zl) Set to 1 (lowest) through 9 (max) to change compression level; lower compression is faster. Only applies to gzipped output files. Default: 2.

Java Parameters

-Xmx
This will set Java's memory usage, overriding autodetection. -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs. The max is typically 85% of physical memory. Default: 400m.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions. Can slightly improve performance in production environments.

Examples

Basic Multiplexing

muxbyname.sh in=sample1.fq,sample2.fq,sample3.fq out=multiplexed.fq

Multiplexes three single-end FASTQ files into one output file. Each read will be renamed with its source filename as a prefix.
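To make the renaming concrete, here is a minimal sketch of prefixing read headers with a source name. It is not the tool's implementation: prefix_fastq_headers is a hypothetical helper, the "_" separator is an assumption, and the header format actually written by muxbyname.sh may differ. The sketch also assumes uncompressed FASTQ with exactly four lines per record.

```shell
# prefix_fastq_headers: hypothetical helper (not part of BBTools).
# Reads FASTQ on stdin and prepends "<prefix>_" to every header line,
# i.e. the first line of each 4-line record.
# The "_" separator is an assumption made for illustration.
prefix_fastq_headers() {
    prefix="$1"
    awk -v p="$prefix" '
        NR % 4 == 1 { print "@" p "_" substr($0, 2); next }  # header line
        { print }                                            # bases, "+", qualities
    '
}

# Example: two records that came from sample1.fq
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' | prefix_fastq_headers sample1
```

In this sketch the headers become @sample1_read1 and @sample1_read2, while the sequence and quality lines pass through unchanged.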

Using Wildcards

muxbyname.sh *.fastq out=all_samples.fastq

Multiplexes every file in the current directory whose name ends in .fastq. When using wildcards, input files are given without the in= prefix.

Paired-End Reads

muxbyname.sh in=sample1_R1.fq,sample2_R1.fq in2=sample1_R2.fq,sample2_R2.fq out=mux_R1.fq out2=mux_R2.fq

Multiplexes paired-end reads from multiple samples while maintaining mate pair relationships.
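A quick sanity check after a paired run is to confirm that both outputs hold the same number of records. The sketch below is one way to do that. fastq_records and check_pairs are hypothetical helpers, not part of BBTools, and they assume uncompressed FASTQ with exactly four lines per record.

```shell
# fastq_records: hypothetical helper (not part of BBTools).
# Counts records in an uncompressed FASTQ file, assuming four lines per record.
fastq_records() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# check_pairs: hypothetical helper.
# Returns 0 if both files contain the same number of records, 1 otherwise.
check_pairs() {
    [ "$(fastq_records "$1")" -eq "$(fastq_records "$2")" ]
}

# Example usage after the paired-end command above:
# check_pairs mux_R1.fq mux_R2.fq && echo "pair counts match"
```

Equal counts do not prove that mates are in the same order, but a mismatch is a clear sign that the in= and in2= lists did not line up.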

Hash Symbol Notation for Paired Files

muxbyname.sh sample1_#.fq,sample2_#.fq out=multiplexed_#.fq

Uses hash symbol (#) notation as shorthand for paired-end files: the # in each name is automatically replaced with 1 for the read 1 file and 2 for the read 2 file.
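The substitution can be sketched in a few lines of shell. expand_hash is a hypothetical helper, not part of BBTools. It only shows which two file names a # pattern stands for, and it replaces only the first # in the pattern.

```shell
# expand_hash: hypothetical helper (not part of BBTools).
# Prints the read 1 and read 2 file names implied by a "#" pattern,
# replacing the first "#" with 1 and with 2 respectively.
expand_hash() {
    r1=$(printf '%s\n' "$1" | sed 's/#/1/')
    r2=$(printf '%s\n' "$1" | sed 's/#/2/')
    printf '%s %s\n' "$r1" "$r2"
}

# Prints: sample1_1.fq sample1_2.fq
expand_hash 'sample1_#.fq'
```

Quoting the pattern is a safe habit, because the shell treats a # at the start of a word as the beginning of a comment.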

Compressed Files with Custom Settings

muxbyname.sh in=sample1.fq.gz,sample2.fq.gz out=muxed.fq.gz ziplevel=6 overwrite=t

Processes gzipped input files and writes gzipped output at a higher compression level (6 instead of the default 2), overwriting the output file if it already exists.

Algorithm Details

MuxByName implements read multiplexing and renaming through the RenameAndMux.java class, using multithreaded file processing.

Core Implementation

Multithreading Architecture

File Format Processing

Memory and Performance

Validation and Error Handling

Relationship to DemuxByName: This tool performs the inverse operation of demuxbyname.sh. Whereas demuxbyname splits a multiplexed file into separate files based on read names, muxbyname combines separate files into one and adds an identifying prefix to each read name. The prefix is produced by the core filename extraction and numeric ID preservation strategy implemented in the renameAndMergeOneFile() method.
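As an approximation of what "core filename extraction" could look like, the sketch below reduces a path to a bare sample name. core_name is a hypothetical helper, not part of BBTools, and the exact rules applied by RenameAndMux.java are not documented here, so treat the stripping steps as assumptions.

```shell
# core_name: hypothetical helper (not part of BBTools).
# Approximates core filename extraction by dropping the directory,
# a .gz suffix if present, and the remaining format extension.
core_name() {
    name=${1##*/}      # strip leading directories
    name=${name%.gz}   # strip gzip suffix if present
    name=${name%.*}    # strip the last extension (.fq, .fastq, .fa, ...)
    printf '%s\n' "$name"
}

# Prints: sample1
core_name /data/run1/sample1.fq.gz
```

A name derived this way is only unique if the input files have distinct base names, so two inputs called sample1.fq in different directories would end up with the same prefix in this sketch.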

Support

For questions and support: