SynthMDA

Script: synthmda.sh Package: synth Class: SynthMDA.java

Generates synthetic reads by simulating Multiple Displacement Amplification (MDA) through iterative stochastic fragment selection and amplification, producing characteristic uneven coverage distributions found in single-cell genomics.

Basic Usage

synthmda.sh in=<reference> out=<reads out file>

Input may be fasta or fastq, compressed or uncompressed.

Parameters

Parameters control the MDA amplification simulation and read generation process.

Parameters

reads=12000000
Generate this many reads.
paired=t
Generate paired reads.
length=150
Reads should be this long.
minlen=4000
Min length of MDA contig.
maxlen=150000
Max length of MDA contig.
cycles=9
Number of MDA cycles; higher is more spiky. Each cycle calls amplify() method with the specified ratio parameter.
initialratio=1.3
Fraction of genome initially replicated in first amplification round; lower values create more spiky coverage patterns.
ratio=1.7
Fraction of genome replicated per cycle after initial round. Default differs from shell (1.7) vs Java code (2.0).
refout=null
Write MDA'd genome to this file. If null, generates temporary filename using ReadWrite.stripToCore() + random hex suffix.
perfect=0
Fraction of reads that will be error-free. Passed as 'perfect' parameter to RandomReads3.main().
amp=200
'amp' flag sent to RandomReads3 for coverage distribution skewing (higher values create more uneven coverage).
build=7
Build index location for MDA'd genome. Parsed by Parser.build and passed to RandomReads3.
prefix=null
Generated reads will start with this prefix.
overwrite=t
(ow) Set to false to force the program to abort rather than overwrite an existing file.

Java Parameters

-Xmx
This will set Java's memory usage, overriding autodetection. -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs. The max is typically 85% of physical memory.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions.

Examples

Basic MDA Simulation

synthmda.sh in=genome.fa out=mda_reads.fq reads=10000000

Generate 10 million synthetic reads from a reference genome with default MDA parameters.

High Coverage Spiky Distribution

synthmda.sh in=reference.fa out=reads.fq cycles=12 initialratio=1.1 amp=400

Create a more uneven coverage distribution with higher cycles and amp values.

Single-End Reads with Custom Length

synthmda.sh in=genome.fa out=reads.fq paired=f length=250 reads=5000000

Generate 5 million single-end reads of 250bp length.

Algorithm Details

SynthMDA simulates Multiple Displacement Amplification (MDA) through the amplify() method using ByteBuilder data structures and Shared.threadLocalRandom() for stochastic fragment generation.

Reference Processing

Amplification Algorithm

The amplify() method implements the core MDA simulation:

Read Generation

After amplification cycles complete, synthetic reads are generated via RandomReads3.main():

Memory Management

The resulting coverage distribution exhibits the characteristic uneven, spiky pattern of MDA-amplified single-cell genomes through the stochastic fragment selection and variable amplification ratios per cycle.

Support

For questions and support: