Start RefSeq Server VM

Script: startRefseqServerVM.sh
Source Directory: pipelines/server/
Author: Brian Bushnell

Server startup script for launching a RefSeq taxonomy server with sketch-based taxonomic identification capabilities. This script is designed to run on JGI's web infrastructure (jgi-web-4) and provides taxonomic classification services via HTTP API.

Overview

The startRefseqServerVM.sh script launches a taxonomy server with a 28 GB Java heap, configured for the RefSeq database. It runs BBTools' taxserver.sh for sketch-based taxonomic identification at two k-mer lengths (k=32,24) and sets a kill code so the server can be shut down remotely.

Important: This script is designed for JGI's internal infrastructure and runs on jgi-web-4. It requires specific file paths and network configurations that may not be available in other environments.

Server Configuration

Hardware Requirements

Service Parameters

Parameter          Value                              Description
Port               3072                               HTTP service port for API access
Domain             https://refseq-sketch.jgi.doe.gov  Public domain for service access
Database           RefSeq                             RefSeq taxonomic database
K-mer lengths      k=32,24                            Dual k-mer strategy for sensitivity/specificity
Memory allocation  28GB                               Java heap size with 90% preallocation

Security Features

Remote Management

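The server includes security features for remote administration, configured through the killcode, oldcode, and oldaddress parameters of the launch command. Remote shutdown goes through the /kill/ path on the service domain and requires the password.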

Security Note: The actual password is redacted in the script (shown as "xxxxx"). In production, this should be a secure, randomly generated password.
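One way to produce such a password is to read random bytes from the operating system. The sketch below is an illustration only and is not part of startRefseqServerVM.sh: the function name generate_killcode and the variable KILLCODE are hypothetical, and it assumes /dev/urandom, head -c, od, and tr are available on the host.

```shell
# Hypothetical helper: print a 32-character lowercase hex string.
generate_killcode() {
    # Read 16 random bytes, dump them as hex, and keep only the hex digits.
    head -c 16 /dev/urandom | od -An -tx1 | tr -dc '0-9a-f'
    echo
}

KILLCODE="$(generate_killcode)"
echo "Generated a ${#KILLCODE}-character kill code"
```

The generated value could then take the place of the redacted xxxxx in the killcode= argument.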

Server Launch Command

Production Configuration

nohup /global/projectb/sandbox/gaag/bbtools/jgi-bbtools/taxserver.sh \
    -da -Xmx28g \
    prealloc=0.9 \
    port=3072 \
    verbose \
    tree=auto \
    sizemult=2 \
    sketchonly \
    index \
    domain=https://refseq-sketch.jgi.doe.gov \
    killcode=xxxxx \
    oldcode=xxxxx \
    oldaddress=https://refseq-sketch.jgi.doe.gov/kill/ \
    RefSeq \
    k=32,24 \
    1>>refseqlogVM_32.txt 2>&1 &
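Because the command requests a 28 GB heap (-Xmx28g), it can help to confirm that the host has at least that much RAM before launching. The following is a minimal sketch under stated assumptions, not part of the original script: the function name has_enough_memory is hypothetical, and it assumes a Linux host that exposes MemTotal in /proc/meminfo. Passing the check is necessary but not sufficient, since the JVM and the operating system need memory beyond the heap.

```shell
# Hypothetical pre-flight check, not part of startRefseqServerVM.sh.
# Usage: has_enough_memory REQUIRED_GB [MEMINFO_FILE]
has_enough_memory() {
    required_gb="$1"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB; treat an unreadable file as 0 kB.
    total_kb="$(awk '/^MemTotal:/ {print $2}' "$meminfo" 2>/dev/null)"
    required_kb=$((required_gb * 1024 * 1024))
    [ "${total_kb:-0}" -ge "$required_kb" ]
}

if has_enough_memory 28; then
    echo "Total RAM is at least 28 GB"
else
    echo "Less than 28 GB of RAM detected" >&2
fi
```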

Parameter Explanation

Testing Configuration

The script includes a simplified testing configuration (commented out by default):

# Simple mode for testing:
nohup /global/projectb/sandbox/gaag/bbtools/jgi-bbtools/taxserver.sh \
    -ea -Xmx28g \
    port=3072 \
    verbose \
    tree=auto \
    sizemult=2 \
    sketchonly \
    RefSeq \
    k=32,24 \
    index=t

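Testing vs Production Differences

Compared with the production command, the testing configuration enables Java assertions (-ea instead of -da), omits prealloc, domain, killcode, oldcode, and oldaddress, passes index=t rather than the bare index flag, and neither redirects output to refseqlogVM_32.txt nor runs in the background.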

Service Capabilities

Sketch-Based Taxonomic Identification

The server provides taxonomic classification by comparing query sketches against the RefSeq database at two k-mer lengths (k=32,24).

Expected Use Cases

Monitoring and Maintenance

Log Files

Server Management

Prerequisites

System Requirements

File Dependencies

Usage Examples

Starting the Server

# Navigate to the script directory
cd /path/to/pipelines/server/

# Launch the RefSeq server
bash startRefseqServerVM.sh

# Verify server is running (pgrep avoids matching the grep process itself)
pgrep -af taxserver

# Check the log for startup messages
tail -f refseqlogVM_32.txt

Testing Server Response

# Basic health check
curl https://refseq-sketch.jgi.doe.gov

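# Test taxonomic query (illustrative only; the /query path and the
# sequence= field are placeholders and may not match the taxserver API)
curl -X POST https://refseq-sketch.jgi.doe.gov/query \
     -d "sequence=ATCGATCGATCG..."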
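For scripted monitoring, the basic health check can be reduced to an HTTP status code. The helpers below are a sketch, not part of BBTools: the names http_status and is_success_status are hypothetical, and whether the service root answers with a 2xx code is an assumption that should be checked against the running server.

```shell
# Hypothetical helpers for scripted health checks.

# Print the HTTP status code for a URL; curl prints 000 if the request fails.
# Usage: http_status URL [TIMEOUT_SECONDS]
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time "${2:-10}" "$1"
}

# Succeed only for 2xx status codes.
is_success_status() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (performs a network request):
#   status="$(http_status https://refseq-sketch.jgi.doe.gov)"
#   is_success_status "$status" && echo "Server responded with $status"
```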

Server Shutdown

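# Remote shutdown (requires password; KILLCODE is a placeholder variable,
# and the URL is quoted so curl does not treat brackets as a glob range)
curl "https://refseq-sketch.jgi.doe.gov/kill/${KILLCODE}"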

# Or kill the process directly
pkill -f "taxserver.sh.*RefSeq"
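By default pkill sends SIGTERM and returns immediately, without confirming that the process has exited. The sketch below shows one way to wait for a given PID and escalate to SIGKILL after a timeout. The function name stop_gracefully and the timeout default are hypothetical, and how taxserver responds to SIGTERM is not documented here, so treat this as an illustration rather than the recommended shutdown procedure.

```shell
# Hypothetical helper: stop a process by PID, escalating from TERM to KILL.
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
stop_gracefully() {
    pid="$1"
    timeout_s="${2:-10}"
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    i=0
    while [ "$i" -lt "$timeout_s" ]; do
        # kill -0 only tests whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    # Still running after the timeout: force termination.
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Example usage, with SERVER_PID set to the PID reported by pgrep:
#   stop_gracefully "$SERVER_PID" 30
```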

Troubleshooting

Common Issues

Performance Monitoring

Related Tools