WebCheck

Script: webcheck.sh
Package: driver
Class: ProcessWebcheck.java

Parses a webcheck log and reports statistics on web server status codes, failure patterns, and response latency.

Basic Usage

webcheck.sh <input files>

Webcheck processes log files containing web server monitoring data. Input files should contain pipe-delimited records with timestamp, URL, status code, and latency information.

Expected Input Format

Input is expected to look like this:

Tue Apr 26 16:40:09 2016|https://rqc.jgi-psf.org/control/|200 OK|0.61

Each line contains four pipe-separated fields: timestamp, URL, status code with message, and latency in seconds.
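To make the record layout concrete, here is a minimal Python sketch that splits one line into the four fields described above. It is not the ProcessWebcheck implementation; the names WebcheckRecord and parse_line are hypothetical, and treating a non-numeric latency as a malformed line is an assumption.

```python
from typing import NamedTuple, Optional


class WebcheckRecord(NamedTuple):
    """Hypothetical container for one parsed log line."""
    timestamp: str
    url: str
    status: str             # status code with message, e.g. "200 OK"
    latency_seconds: float


def parse_line(line: str) -> Optional[WebcheckRecord]:
    """Return a record, or None if the line is not four pipe-separated fields."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 4:
        return None
    timestamp, url, status, latency = fields
    try:
        return WebcheckRecord(timestamp, url, status, float(latency))
    except ValueError:
        # Assumption: a latency that is not a number makes the line malformed.
        return None


record = parse_line(
    "Tue Apr 26 16:40:09 2016|https://rqc.jgi-psf.org/control/|200 OK|0.61"
)
```

Here record.status holds "200 OK" and record.latency_seconds holds 0.61, while a line with the wrong number of fields yields None.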

Parameters

Parameters are organized by their function in the webcheck analysis process.

Standard parameters

in=<file>
Primary input file containing webcheck log data. Can use a wildcard (*) if 'in=' is omitted. Multiple files can be processed sequentially.
out=<file>
Summary output file for statistics; optional. If not specified, results are written to stdout. Contains aggregated statistics for all processed log entries.
fail=<file>
Output file for failing lines (non-200 status codes); optional. Records all entries that indicate server errors, timeouts, or other failure conditions.
invalid=<file>
Output file for misformatted lines; optional. Captures log entries that don't match the expected four-field pipe-delimited format.
extendedstats=f
(es) Print more detailed statistics, including line counts, latency averages, and observed failure codes. Default: false.
overwrite=f
(ow) Controls whether existing output files may be overwritten. When false, the program aborts rather than overwrite an existing file, which prevents accidental data loss. Default: false.
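The roles of fail= and invalid= can be pictured with a short Python sketch that routes each line to one of the two optional outputs. This is an illustration under stated assumptions, not the tool's code: the function classify is hypothetical, "passing" is taken to mean a status field whose code is exactly 200, and the second and third sample lines are invented for the example.

```python
import io


def classify(line: str) -> str:
    """Label a raw log line as 'pass', 'fail', or 'invalid'."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 4:
        return "invalid"                      # would go to the invalid= file
    code = fields[2].strip().split(" ", 1)[0]
    return "pass" if code == "200" else "fail"  # non-200 goes to the fail= file


# In-memory stand-ins for the files named by fail= and invalid=.
fail_out, invalid_out = io.StringIO(), io.StringIO()
sinks = {"fail": fail_out, "invalid": invalid_out}

sample = [
    "Tue Apr 26 16:40:09 2016|https://rqc.jgi-psf.org/control/|200 OK|0.61",
    # Hypothetical failing and malformed lines, made up for this sketch.
    "Tue Apr 26 16:45:09 2016|https://rqc.jgi-psf.org/control/|500 Internal Server Error|2.10",
    "not a webcheck record",
]
for line in sample:
    sink = sinks.get(classify(line))  # passing lines have no sink
    if sink is not None:
        sink.write(line + "\n")
```

After the loop, fail_out holds the 500 line and invalid_out holds the malformed line; the passing line is only counted in the summary.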

Processing Parameters

lines=unlimited
Maximum number of lines to process from input files. Set to a positive number to limit processing for testing or sampling. Default: unlimited (Long.MAX_VALUE).
ms=t
(millis) Controls whether latency values in the output carry an 'ms' suffix. Set to false to omit the suffix from timing statistics. Default: true (displays 'ms' suffix).
verbose=f
Enable verbose output for debugging and detailed processing information. Affects file I/O operations and stream processing. Default: false.
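The effect of the lines= limit described above can be sketched in Python as a cap on how many lines are pulled from the input before processing stops. The helper limited_lines is hypothetical, and the sample log content is invented.

```python
import io
from itertools import islice


def limited_lines(stream, max_lines=None):
    """Yield at most max_lines lines from stream; None means no limit."""
    return stream if max_lines is None else islice(stream, max_lines)


# Five made-up records standing in for a log file.
log = io.StringIO("Tue Apr 26 16:40:09 2016|https://example.org/|200 OK|0.10\n" * 5)
first_three = list(limited_lines(log, 3))
```

Because islice stops pulling from the stream once the limit is reached, the rest of the file is never read.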

Java Parameters

-Xmx
This will set Java's memory usage, overriding autodetection. -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will specify 200 megs. The max is typically 85% of physical memory.
-eoom
This flag will cause the process to exit if an out-of-memory exception occurs. Requires Java 8u92+.
-da
Disable assertions.

Examples

Basic Log Analysis

webcheck.sh webserver.log

Process a single webcheck log file and display summary statistics to stdout.

Comprehensive Analysis with Output Files

webcheck.sh in=webserver.log out=summary.txt fail=failures.log invalid=bad_entries.log extendedstats=t

Process webcheck log with full output: summary statistics to file, failed requests captured separately, malformed entries logged, and extended statistics enabled.

Multiple File Processing

webcheck.sh *.log out=combined_stats.txt extendedstats=t

Process all log files in the current directory, generating combined statistics with extended reporting.

Limited Processing for Testing

webcheck.sh webserver.log lines=1000 extendedstats=t

Process only the first 1000 lines of a log file for quick analysis and testing.

Algorithm Details

ProcessWebcheck implements a ByteFile-based streaming parser, using the process2() method for line-by-line log analysis. It reads pipe-delimited entries one at a time through ByteFile.nextLine() instead of loading the file into memory, so arbitrarily large web server monitoring files can be analyzed within constant memory bounds.
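The streaming idea can be illustrated with a single-pass Python loop that keeps only running totals and never stores the lines themselves. This is a sketch of the general technique, not a port of process2(); the function summarize, its return layout, and the sample records are assumptions made for the example.

```python
import io
from collections import Counter


def summarize(stream):
    """Single pass over an iterable of lines, keeping only running totals."""
    status_counts = Counter()
    invalid = 0
    pass_latency_sum = 0.0
    pass_latency_max = 0.0
    for line in stream:                       # one line in memory at a time
        fields = line.rstrip("\n").split("|")
        if len(fields) != 4:
            invalid += 1
            continue
        status = fields[2].strip()
        try:
            latency = float(fields[3])
        except ValueError:
            invalid += 1                      # assumption: bad latency = invalid
            continue
        status_counts[status] += 1
        if status.split(" ", 1)[0] == "200":  # assumption: only 200 passes
            pass_latency_sum += latency
            pass_latency_max = max(pass_latency_max, latency)
    return status_counts, invalid, pass_latency_sum, pass_latency_max


# Made-up records standing in for a log file on disk.
log = io.StringIO(
    "t1|https://example.org/|200 OK|0.61\n"
    "t2|https://example.org/|200 OK|0.25\n"
    "t3|https://example.org/|404 Not Found|0.05\n"
    "garbage line\n"
)
counts, invalid, latency_sum, latency_max = summarize(log)
```

Memory use depends on the number of distinct status strings, not on the number of lines, which is what makes this style of loop suitable for very large logs.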

Data Processing Strategy

ProcessWebcheck.process2() implements a three-phase approach.

Performance Characteristics

ProcessWebcheck implements memory-efficient processing using ByteFile.makeByteFile() for streaming I/O.

Statistical Analysis

ProcessWebcheck implements precision latency tracking with status code classification.

Output Format

ProcessWebcheck.process() builds tab-delimited output in a StringBuilder, using Shared.sort(list) to order status codes alphabetically. When extendedStats=true, extended statistics are appended: Lines_Processed, Invalid_Lines, passing and failing counts, Avg_Pass_Latency, Max_Pass_Latency, and the Observed_Fail_Codes collected by iterating over the failCode IntList.
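As a rough picture of that layout, the following Python sketch emits one tab-delimited row per status code in sorted order and then appends any extended rows. The function format_report is hypothetical, the exact row layout of the extended section is an assumption, and the extended values shown are illustrative rather than real tool output.

```python
def format_report(status_counts, extended=None):
    """Build tab-delimited text: sorted status rows, then optional extras."""
    rows = [f"{status}\t{count}" for status, count in sorted(status_counts.items())]
    if extended is not None:
        # Assumption: extended statistics follow as name<TAB>value rows.
        rows.extend(f"{name}\t{value}" for name, value in extended.items())
    return "\n".join(rows) + "\n"


report = format_report(
    {"404 Not Found": 23, "200 OK": 1247, "500 Internal Server Error": 5},
    extended={"Lines_Processed": 1275, "Invalid_Lines": 0},  # illustrative values
)
```

Sorting the status strings puts "200 OK" ahead of "404 Not Found" and "500 Internal Server Error", matching the alphabetical ordering the text describes.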

Output Interpretation

Standard Output

Basic output shows status code counts in tab-separated format:

200 OK    1247
404 Not Found    23
500 Internal Server Error    5
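Output in this shape is easy to post-process. The Python sketch below reads status rows like the ones above and derives an overall failure rate. It assumes the fields are separated by a single tab and that only the basic status rows are present; read_summary is a hypothetical helper, not part of the tool.

```python
def read_summary(text):
    """Parse 'status<TAB>count' rows into a dict of integer counts."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        status, count = line.rsplit("\t", 1)
        counts[status] = int(count)
    return counts


summary = "200 OK\t1247\n404 Not Found\t23\n500 Internal Server Error\t5\n"
counts = read_summary(summary)
total = sum(counts.values())
# Assumption: anything other than a 200 status counts as a failure.
failures = sum(n for status, n in counts.items() if not status.startswith("200"))
failure_rate = failures / total
```

For the sample above this gives 28 failures out of 1275 entries.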

Extended Statistics

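When extendedstats=t, additional metrics are appended: Lines_Processed, Invalid_Lines, passing and failing line counts, Avg_Pass_Latency, Max_Pass_Latency, and Observed_Fail_Codes.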

Support

For questions and support: