🧬 K-mer Cardinality Estimation

Estimate unique k-mer counts in FASTA/FASTQ files using the LogLog probabilistic algorithm.

Perfect for quality assessment, coverage estimation, contamination detection, and assembly planning.

🔒 Privacy-Preserving: All processing happens entirely in your browser using WebAssembly. Your files never leave your computer, and nothing is uploaded to any server.

Parameters

K-mer Size:

Common values: 15 (short reads), 21 (assembly), 31 (standard). Maximum: 31, the limit of the 64-bit k-mer encoding.
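The 31-base limit is consistent with packing each base into 2 bits, so that a 31-mer occupies 62 of the 64 available bits. The sketch below illustrates that idea in Python; the base-to-code mapping, the function name encode_kmers, and the choice to skip windows containing N are assumptions made for illustration, not a description of this tool's internals.

```python
# 2-bit codes for the four bases (an assumed, conventional mapping).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_kmers(seq, k):
    """Yield each k-mer of seq as an integer using 2 bits per base.

    Windows containing a non-ACGT character (e.g. N) are skipped.
    """
    if not 1 <= k <= 31:
        raise ValueError("k must be between 1 and 31")
    mask = (1 << (2 * k)) - 1    # keeps only the lowest 2*k bits
    value = 0
    valid = 0                    # consecutive valid bases seen so far
    for ch in seq.upper():
        code = BASE_CODE.get(ch)
        if code is None:
            value, valid = 0, 0  # reset the window on ambiguous bases
            continue
        value = ((value << 2) | code) & mask  # slide the window by one base
        valid += 1
        if valid >= k:
            yield value

print(list(encode_kmers("ACGT", 2)))
```

Because the window slides one base at a time, each new k-mer costs a shift, an OR, and a mask rather than a full re-encoding.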

Accuracy (buckets):

Higher bucket counts give more accurate estimates (LogLog's standard error is roughly 1.30/√m for m buckets) at slightly lower speed. Memory usage stays tiny regardless (about 16 KB at most).
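To make the trade-off concrete, the relative standard error of the LogLog estimator is approximately 1.30/√m for m buckets. The snippet below tabulates that figure for a few bucket counts. The specific counts and the one-byte-per-bucket memory figure are assumptions chosen for illustration; the options this tool actually offers may differ.

```python
import math

def loglog_standard_error(buckets):
    """Approximate relative standard error of LogLog for a bucket count."""
    return 1.30 / math.sqrt(buckets)

# Hypothetical bucket counts; memory assumes one byte per bucket.
for m in (256, 1024, 4096, 16384):
    error_pct = loglog_standard_error(m) * 100
    print(f"{m:>6} buckets: ~{error_pct:.1f}% error, {m} bytes")
```

Under these assumptions, quadrupling the bucket count halves the expected error, and 16384 one-byte buckets come to 16 KB.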


Supports files of any size, compressed (.gz) or uncompressed; processing is memory-efficient.
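One way to handle both compressed and uncompressed input without loading a whole file is to check for the gzip magic bytes and then stream records. The Python sketch below shows that approach. The helper names are hypothetical, the FASTQ branch assumes strict 4-line records, and it holds one record in memory at a time, so it is an illustration rather than this tool's WebAssembly implementation.

```python
import gzip

def open_maybe_gzip(path):
    """Open a text file, transparently decompressing it if it is gzipped."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":           # gzip magic number
        return gzip.open(path, "rt")
    return open(path, "rt")

def read_sequences(path):
    """Yield sequence strings from a FASTA or FASTQ file, one record at a time."""
    with open_maybe_gzip(path) as fh:
        first = fh.readline()
        if not first:
            return                      # empty file: nothing to yield
        if first.startswith(">"):       # FASTA: a sequence may span many lines
            chunks = []
            for line in fh:
                if line.startswith(">"):
                    yield "".join(chunks)
                    chunks = []
                else:
                    chunks.append(line.strip())
            yield "".join(chunks)
        elif first.startswith("@"):     # FASTQ: assumes 4-line records
            for i, line in enumerate(fh):
                if i % 4 == 0:          # the line right after each header
                    yield line.strip()
        else:
            raise ValueError("not a FASTA or FASTQ file")
```

Detecting the format from the file contents rather than the extension means a misnamed file is still read correctly.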

About LogLog Algorithm

LogLog is a probabilistic cardinality estimation algorithm that uses fixed memory regardless of input size. It estimates unique k-mer counts by analyzing hash value distributions, providing fast approximate counts with predictable error bounds.
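The idea can be sketched in a few lines of Python. Each value is hashed, the top bits of the hash select a bucket, and the bucket remembers the largest "rank" (position of the first 1-bit) seen in the remaining bits. The estimate is then α · m · 2^(mean rank). This is a minimal illustration of the textbook LogLog estimator, not this tool's code: the blake2b hash, the class and parameter names, and the use of the asymptotic constant 0.39701 for every bucket count are assumptions.

```python
import hashlib

ALPHA_INF = 0.39701  # asymptotic LogLog bias-correction constant

def hash64(value):
    """Hash a non-negative integer (< 2**64) to a 64-bit integer."""
    data = value.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little")

class LogLog:
    def __init__(self, bucket_bits=10):
        self.b = bucket_bits
        self.m = 1 << bucket_bits           # number of buckets
        self.registers = bytearray(self.m)  # one byte per bucket

    def add(self, value):
        h = hash64(value)
        width = 64 - self.b
        bucket = h >> width                 # top b bits pick the bucket
        rest = h & ((1 << width) - 1)       # remaining bits
        # 1-based position of the first 1-bit in rest, counted from the left
        rank = width - rest.bit_length() + 1
        if rank > self.registers[bucket]:
            self.registers[bucket] = rank

    def estimate(self):
        mean_rank = sum(self.registers) / self.m
        return ALPHA_INF * self.m * 2 ** mean_rank

sketch = LogLog(bucket_bits=10)
for x in range(50000):
    sketch.add(x)       # values could be integer-encoded k-mers
print(round(sketch.estimate()))
```

Adding the same value again never changes a register, which is why duplicates do not inflate the estimate and memory stays fixed at one small array.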

Use cases: Quality control, genome size estimation, contamination detection, coverage analysis, assembly planning.

Results:

Estimated Unique K-mers:	-
K-mer Size:	-
Bucket Count:	-
File Size:	-
Processing Time:	-
