startTaxServerVM

Script: startTaxServerVM.sh
Source Directory: pipelines/server/
Author: Brian Bushnell

Virtual machine startup script for launching the BBTools taxonomy server with pre-configured settings optimized for JGI infrastructure deployment.

Purpose

This script starts the BBTools taxonomy server in a virtual machine environment, specifically for deployment on jgi-web-1, using a production-ready configuration.

Prerequisites

System Requirements

Required Files

Configuration

The script uses hardcoded configuration values optimized for the JGI production environment:

Server Configuration

LOG=taxlogVM_55.txt
Log file for all server output, including startup messages, query logs, and error information.
PASS=xxxxx
Password for server management operations (masked in source). The same value is passed as both killcode (to stop this server) and oldcode (to stop a previous instance).
DOMAIN=https://taxonomy.jgi.doe.gov
Base domain URL displayed in server help messages and used for client redirects.
KILL=https://taxonomy.jgi.doe.gov/kill/
Endpoint URL for remotely terminating previous server instances during startup.
PORT=3068
HTTP port number for the taxonomy server. Standard port for JGI taxonomy services.
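
Taken together, the settings above would correspond to a variable block along the following lines. This is a sketch reconstructed from the documented values, not a copy of the script; the password remains a masked placeholder, as in the source.

```shell
#!/bin/sh
# Sketch of the configuration block, using the values documented above.
LOG=taxlogVM_55.txt                       # server output log
PASS=xxxxx                                # masked in source; placeholder only
DOMAIN=https://taxonomy.jgi.doe.gov       # shown in server help messages
KILL=https://taxonomy.jgi.doe.gov/kill/   # endpoint for stopping an old instance
PORT=3068                                 # HTTP port

# Print a short summary so the values can be reviewed before launch.
echo "port=$PORT domain=$DOMAIN log=$LOG"
```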

Java Configuration

-da
Disables Java assertions for improved production performance.
-Xmx31g
Sets maximum Java heap size to 31GB, optimized for VM memory constraints while leaving system memory for OS operations.

TaxServer Parameters

port=3068
HTTP server port number.
verbose
Enable detailed logging of server operations and query processing.
accession=auto
Automatically detect and load accession-to-taxonomy mapping files from default JGI locations.
tree=auto
Automatically detect and load NCBI taxonomy tree from default JGI location.
table=auto
Automatically detect and load GI-to-taxonomy table from default JGI location.
size=auto
Automatically detect and load genome size information from default JGI location.
img=auto
Automatically detect and load IMG database integration files from default JGI location.
pattern=auto
Automatically detect and load pattern files for efficient accession storage.
prealloc
Enable preallocation of data structures for faster server initialization and reduced memory fragmentation.
domain=https://taxonomy.jgi.doe.gov
Domain name displayed in server help messages and API responses.
killcode=xxxxx
Password for secure remote server termination via /kill/ endpoint.
oldcode=xxxxx
Password for terminating previous server instances during startup.
oldaddress=https://taxonomy.jgi.doe.gov/kill/
URL endpoint for sending termination commands to previous server instances.
html
Enable HTML formatting in server responses for web browser compatibility.
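
To show how the Java flags and TaxServer parameters listed above fit into one command line, the sketch below assembles them from the configuration variables and prints the result instead of running it. The launcher path is taken from the commented testing line in the Usage section; that production uses the same launcher is an assumption.

```shell
#!/bin/sh
# Values from the Server Configuration section (password masked in source).
PASS=xxxxx
DOMAIN=https://taxonomy.jgi.doe.gov
KILL=https://taxonomy.jgi.doe.gov/kill/
PORT=3068

# ASSUMPTION: same launcher as in the commented testing line.
LAUNCHER=/global/projectb/sandbox/gaag/bbtools/jgi-bbtools/taxserver.sh

# None of the values contain spaces, so a plain string is enough here.
ARGS="-da -Xmx31g"
ARGS="$ARGS port=$PORT verbose"
ARGS="$ARGS accession=auto tree=auto table=auto size=auto img=auto pattern=auto"
ARGS="$ARGS prealloc domain=$DOMAIN"
ARGS="$ARGS killcode=$PASS oldcode=$PASS oldaddress=$KILL html"

# Print the command for review; nothing is launched.
echo "$LAUNCHER $ARGS"
```

Keeping port, domain, and password in variables means each value is set once, even though the password is passed twice (killcode and oldcode).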

Usage

Production Deployment

# On jgi-web-1 system:
./startTaxServerVM.sh

Starts the taxonomy server in background with production configuration. The server will:

  • Launch in background using nohup for persistence
  • Load all taxonomy data automatically from JGI standard locations
  • Listen on port 3068 for HTTP requests
  • Log all activity to taxlogVM_55.txt
  • Attempt to terminate any existing server instances

Testing Mode

The script includes a commented-out testing configuration:

# For testing purposes only (commented out in production):
/global/projectb/sandbox/gaag/bbtools/jgi-bbtools/taxserver.sh -ea -Xmx8g port=3068 verbose accession=null tree=auto table=null

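This testing mode is intended for development and debugging. It uses a smaller heap (8GB), enables Java assertions (-ea), and sets accession=null and table=null so that only the taxonomy tree is loaded.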

Process Management

Background Execution

The script uses nohup so that the server keeps running after the terminal session ends.
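
A minimal sketch of this pattern follows. The helper name launch_in_background is hypothetical, and a harmless echo stands in for the real server command; the script's actual redirection is not shown in this document, so appending both output streams to the log is an assumption.

```shell
#!/bin/sh
# launch_in_background is a hypothetical helper for illustration.
# Usage: launch_in_background LOGFILE COMMAND [ARGS...]
launch_in_background() {
    log_file=$1
    shift
    # nohup ignores the hangup signal sent when the terminal closes;
    # stdout and stderr are both appended to the log; & detaches the job.
    nohup "$@" >>"$log_file" 2>&1 &
}

# Harmless stand-in for the real server command:
demo_log=$(mktemp)
launch_in_background "$demo_log" echo "server would start here"
wait    # only needed in this demo, so the log is complete before reading it
cat "$demo_log"
```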

Server Lifecycle

  1. Cleanup: Attempts to terminate previous server instances using oldcode/oldaddress
  2. Initialization: Loads taxonomy data files (may take several minutes)
  3. Service: Begins accepting HTTP requests on configured port
  4. Logging: All operations logged continuously to taxlogVM_55.txt
  5. Termination: Can be stopped remotely using /kill/ endpoint with proper credentials
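
The termination step can be illustrated by building the kill URL from the configured endpoint and password. The source does not spell out the request format; appending the code directly to the /kill/ path is an assumption suggested by the trailing slash in the endpoint. The request is printed here, not sent.

```shell
#!/bin/sh
KILL=https://taxonomy.jgi.doe.gov/kill/
PASS=xxxxx   # masked placeholder, as in the source

# ASSUMPTION: the kill code is appended directly to the /kill/ path.
kill_url="${KILL}${PASS}"

# Print the request that would be made; it is deliberately not executed.
echo "curl -s $kill_url"
```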

Monitoring and Troubleshooting

Log Monitoring

# Monitor server startup and activity:
tail -f taxlogVM_55.txt

# Check for errors during initialization:
grep -i error taxlogVM_55.txt

# Monitor server performance:
grep -i "memory\|heap\|gc" taxlogVM_55.txt

Process Status

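# Check if server is running (the brackets keep grep from matching itself):
ps aux | grep '[T]axServer'

# Check whether anything is listening on the port (the colon avoids matching longer port numbers):
netstat -ln | grep ':3068 '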

# Test server response:
curl http://localhost:3068/help
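
Because data loading may take several minutes, a single curl call right after startup is likely to fail. The sketch below polls the help page until it responds or a retry limit is reached. The helper name wait_for_server and its arguments are hypothetical; the /help URL is the one used above.

```shell
#!/bin/sh
# wait_for_server is a hypothetical helper for illustration.
# Usage: wait_for_server URL ATTEMPTS DELAY_SECONDS
wait_for_server() {
    url=$1
    attempts=$2
    delay=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s hides progress output, -f treats HTTP errors as failures,
        # --max-time bounds each attempt to 5 seconds.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example call (up to 60 attempts, 10 seconds apart):
#   wait_for_server http://localhost:3068/help 60 10 && echo "server is up"
```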

Common Issues

Security Considerations

Access Control

Production Hardening

Related Tools

Algorithm Details

Startup Process

The script implements a robust server startup procedure:

  1. Environment Check: Validates system requirements and file paths
  2. Previous Instance Cleanup: Sends kill command to any existing server using oldcode/oldaddress
  3. Data Loading: Automatically detects and loads all required taxonomy data files:
    • NCBI taxonomy tree structure
    • GI number to taxonomy ID mappings
    • Accession number to taxonomy ID mappings
    • Genome size information
    • IMG database integration files
    • Compressed pattern files for efficiency
  4. Memory Optimization: Preallocates data structures to minimize memory fragmentation
  5. Service Activation: Binds to port 3068 and begins accepting HTTP requests
  6. Background Execution: Runs under nohup from launch, so the steps above continue after the terminal session ends
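
This document does not show how the environment check in step 1 is carried out. As one possible illustration, a pre-launch check could confirm that the launcher exists and is executable before anything else runs. The helper name check_launcher is hypothetical and is not taken from the script.

```shell
#!/bin/sh
# check_launcher is a hypothetical helper for illustration.
# Succeeds only if the path is a regular file the current user can execute.
check_launcher() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        return 0
    fi
    echo "launcher not found or not executable: $1" >&2
    return 1
}

# Example call, using the launcher path from the testing line:
#   check_launcher /global/projectb/sandbox/gaag/bbtools/jgi-bbtools/taxserver.sh || exit 1
```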

Performance Characteristics

Fault Tolerance

Notes

Support

For questions and support: