Skip to main content

Server Health Check

A practical server monitoring script that performs comprehensive health checks, logs issues, and executes automatic recovery procedures. This example demonstrates Utah's error handling, environment configuration, and system monitoring capabilities.

๐ŸŽฌ Interactive Demoโ€‹

Watch this script in action! The demo shows environment detection, privilege checking, resource monitoring, and intelligent auto-recovery procedures:

Features Demonstratedโ€‹

  • Environment-based configuration with sensible defaults
  • Privilege checking with sudo validation
  • Resource monitoring with dynamic thresholds
  • Auto-recovery procedures with environment-specific logic
  • File system operations with path manipulation
  • Comprehensive logging with timestamped entries
  • Error handling with graceful degradation

Complete Scriptโ€‹

// Server Health Check and Auto-Recovery Script
// This script monitors server health, logs issues, and performs automatic recovery

script.enableDebug();
script.exitOnError(); // Enable strict error handling for recovery

console.log("๐Ÿ” Starting comprehensive server health check...");

// Load configuration from environment file
const configFile: string = ".env";
const defaultLogLevel: string = env.get("LOG_LEVEL") ? env.get("LOG_LEVEL") : "INFO";
const maxRetries: number = env.get("MAX_RETRIES") ? env.get("MAX_RETRIES") : 3;
const alertEmail: string = env.get("ALERT_EMAIL") ? env.get("ALERT_EMAIL") : "admin@localhost";

console.log(`โš™๏ธ Configuration loaded - Log Level: ${defaultLogLevel}, Max Retries: ${maxRetries}`);

// Check if running with proper privileges
let hasAdminRights: boolean = console.isSudo();
if (!hasAdminRights) {
console.log("โŒ Error: This script requires sudo privileges for system operations");
console.log("Please run: sudo utah compile server-health-check.shx && sudo ./server-health-check.sh");
exit(1);
}

console.log("โœ… Running with administrator privileges");

// System information gathering
const serverOS: string = os.getOS();
const processId: number = process.id();
let currentMemory: number = process.memory();
let currentCPU: number = process.cpu();

// Use ternary operators for status evaluation
let memoryStatus: string = currentMemory > 80 ? "CRITICAL" : currentMemory > 60 ? "WARNING" : "OK";
let cpuStatus: string = currentCPU > 90 ? "CRITICAL" : currentCPU > 70 ? "WARNING" : "OK";

console.log(`๐Ÿ“Š Server OS: ${serverOS}`);
console.log(`๐Ÿ”ง Process ID: ${processId}`);
console.log(`๐Ÿ’พ Memory Usage: ${currentMemory}% (${memoryStatus})`);
console.log(`โšก CPU Usage: ${currentCPU}% (${cpuStatus})`);

// Determine server environment using switch statement
let environment: string = "unknown";
switch (serverOS) {
case "linux":
environment = "production";
break;
case "darwin":
environment = "development";
break;
case "windows":
environment = "testing";
break;
default:
environment = "unknown";
break;
}

console.log(`๐ŸŒ Environment detected: ${environment}`);

// Check if critical applications are installed
const applications: string[] = ["docker", "nginx", "git", "curl"];
let missingApps: string[] = [];
let appStatuses: string[] = [];

for (let app: string in applications) {
let isInstalled: boolean = os.isInstalled(app);
let status: string = isInstalled ? "โœ… INSTALLED" : "โŒ MISSING";

console.log(`${status}: ${app}`);

if (!isInstalled) {
array.push(missingApps, app);
}

array.push(appStatuses, `${app}: ${status}`);
}

// Application health assessment using switch
let appHealthLevel: string = "";
switch (missingApps.length) {
case 0:
appHealthLevel = "EXCELLENT";
break;
case 1:
appHealthLevel = "GOOD";
break;
case 2:
appHealthLevel = "FAIR";
break;
default:
appHealthLevel = "POOR";
break;
}

console.log(`๐Ÿ“ฆ Application Health: ${appHealthLevel} (${missingApps.length} missing)`);

// Resource monitoring and alerting with environment-based thresholds
const memoryThreshold: number = environment == "production" ? 80 : 90;
const cpuThreshold: number = environment == "production" ? 70 : 85;
let issuesFound: boolean = false;
let alertLevel: string = "INFO";

timer.start();

// Memory check with ternary operator for severity
if (currentMemory > memoryThreshold) {
alertLevel = currentMemory > 95 ? "CRITICAL" : "WARNING";
console.log(`โš ๏ธ HIGH MEMORY USAGE: ${currentMemory}% (threshold: ${memoryThreshold}%) - ${alertLevel}`);
issuesFound = true;

// Log the issue with timestamp
let timestamp: string = process.elapsedTime();
let logEntry: string = `[${timestamp}] ${alertLevel} MEMORY: ${currentMemory}%`;
fs.writeFile("/var/log/health-check.log", logEntry);
}

// CPU check with switch-based response
if (currentCPU > cpuThreshold) {
let cpuAction: string = "";

if (currentCPU > 95) {
cpuAction = "EMERGENCY_SHUTDOWN";
alertLevel = "CRITICAL";
}
else if (currentCPU > 85) {
cpuAction = "THROTTLE_SERVICES";
alertLevel = "WARNING";
}
else {
cpuAction = "MONITOR_CLOSELY";
alertLevel = "INFO";
}

console.log(`โš ๏ธ HIGH CPU USAGE: ${currentCPU}% (threshold: ${cpuThreshold}%) - Action: ${cpuAction}`);
issuesFound = true;

// Log the issue
let timestamp: string = process.elapsedTime();
let logEntry: string = `[${timestamp}] ${alertLevel} CPU: ${currentCPU}% - ${cpuAction}`;
fs.writeFile("/var/log/health-check.log", logEntry);
}

// Auto-recovery procedures with environment-specific logic
if (issuesFound) {
let recoveryMode: string = environment == "production" ? "CONSERVATIVE" : "AGGRESSIVE";
let shouldRecover: boolean = console.promptYesNo(`๐Ÿ”ง Issues detected (${alertLevel}). Attempt ${recoveryMode} recovery?`);

if (shouldRecover) {
console.log(`๐Ÿ”„ Starting ${recoveryMode} recovery procedures...`);

// Environment-based recovery strategy using switch
switch (environment) {
case "production":
console.log("๐Ÿ›ก๏ธ Production mode: Conservative recovery");
console.log("๐Ÿ“Š Monitoring system load...");
console.log("๐Ÿ”„ Graceful service restart...");
break;
case "development":
console.log("๐Ÿš€ Development mode: Aggressive optimization");
console.log("๐Ÿ’พ Clearing all caches...");
console.log("๐Ÿ”„ Hard service restart...");
break;
case "testing":
console.log("๐Ÿงช Testing mode: Diagnostic recovery");
console.log("๐Ÿ“‹ Collecting debug information...");
console.log("๐Ÿ” Running system diagnostics...");
break;
default:
console.log("โ“ Unknown environment: Basic recovery");
break;
}

// Generate recovery delay based on alert level
let recoveryDelay: number = alertLevel == "CRITICAL" ? utility.random(500, 1000) : utility.random(1000, 3000);
console.log(`โณ Waiting ${recoveryDelay}ms before recovery...`);

let recoverySteps: string = alertLevel == "CRITICAL" ? "emergency" : "standard";
console.log(`๐Ÿ› ๏ธ Executing ${recoverySteps} recovery steps...`);

console.log("โœ… Recovery procedures completed");
} else {
console.log("โญ๏ธ Recovery skipped by user");
}
}

// File system health check
let logDir: string = "/var/log";
let logDirName: string = fs.dirname("/var/log/health-check.log");
let logFileName: string = fs.fileName("/var/log/health-check.log");

console.log(`๐Ÿ“ Log directory: ${logDirName}`);
console.log(`๐Ÿ“„ Log file: ${logFileName}`);

// Configuration validation
let configPaths: string[] = ["/etc/nginx/nginx.conf", "/etc/docker/daemon.json"];
for (let configPath: string in configPaths) {
let configDir: string = fs.dirname(configPath);
let configFileName: string = fs.fileName(configPath);
let configExt: string = fs.extension(configPath);

console.log(`โš™๏ธ Config: ${configFileName} (${configExt}) in ${configDir}`);
}

// Generate comprehensive report
let reportData: string[] = [];
array.push(reportData, `Server Health Report`);
array.push(reportData, `===================`);
array.push(reportData, `OS: ${serverOS}`);
array.push(reportData, `Memory: ${currentMemory}%`);
array.push(reportData, `CPU: ${currentCPU}%`);
array.push(reportData, `Issues Found: ${issuesFound ? "Yes" : "No"}`);

let reportContent: string = array.join(reportData, "\n");
fs.writeFile("/tmp/health-report.txt", reportContent);

let executionTime: number = timer.stop();
console.log(`โฑ๏ธ Health check completed in ${executionTime}ms`);

if (issuesFound) {
console.log("โš ๏ธ Health check completed with issues - see logs for details");
exit(1);
} else {
console.log("๐ŸŽ‰ All systems healthy!");
exit(0);
}

Key Features Explainedโ€‹

Environment-Based Configurationโ€‹

The script adapts its behavior based on the detected environment:

let environment: string = "unknown";
switch (serverOS) {
case "linux":
environment = "production";
break;
case "darwin":
environment = "development";
break;
case "windows":
environment = "testing";
break;
}

// Environment-specific thresholds
const memoryThreshold: number = environment == "production" ? 80 : 90;
const cpuThreshold: number = environment == "production" ? 70 : 85;

Privilege Checkingโ€‹

Utah provides built-in privilege detection:

let hasAdminRights: boolean = console.isSudo();
if (!hasAdminRights) {
console.log("โŒ Error: This script requires sudo privileges for system operations");
exit(1);
}

Smart Status Evaluationโ€‹

Using ternary operators for concise status logic:

let memoryStatus: string = currentMemory > 80 ? "CRITICAL" : currentMemory > 60 ? "WARNING" : "OK";
let cpuStatus: string = currentCPU > 90 ? "CRITICAL" : currentCPU > 70 ? "WARNING" : "OK";

Application Health Assessmentโ€‹

Automated checking of critical applications:

const applications: string[] = ["docker", "nginx", "git", "curl"];
let missingApps: string[] = [];

for (let app: string in applications) {
let isInstalled: boolean = os.isInstalled(app);
if (!isInstalled) {
array.push(missingApps, app);
}
}

// Assessment based on missing applications
switch (missingApps.length) {
case 0:
appHealthLevel = "EXCELLENT";
break;
case 1:
appHealthLevel = "GOOD";
break;
default:
appHealthLevel = "POOR";
break;
}

Auto-Recovery Logicโ€‹

Environment-specific recovery procedures:

switch (environment) {
case "production":
console.log("๐Ÿ›ก๏ธ Production mode: Conservative recovery");
console.log("๐Ÿ”„ Graceful service restart...");
break;
case "development":
console.log("๐Ÿš€ Development mode: Aggressive optimization");
console.log("๐Ÿ’พ Clearing all caches...");
break;
case "testing":
console.log("๐Ÿงช Testing mode: Diagnostic recovery");
console.log("๐Ÿ” Running system diagnostics...");
break;
}

File System Operationsโ€‹

Built-in path manipulation functions:

let logDirName: string = fs.dirname("/var/log/health-check.log");
let logFileName: string = fs.fileName("/var/log/health-check.log");
let configExt: string = fs.extension(configPath);

Usage Examplesโ€‹

Basic Health Checkโ€‹

utah compile server-health-check.shx
sudo ./server-health-check.sh

With Environment Variablesโ€‹

export LOG_LEVEL="DEBUG"
export MAX_RETRIES="5"
export ALERT_EMAIL="ops@company.com"
sudo ./server-health-check.sh

Automated Monitoringโ€‹

# Add to crontab for regular monitoring
*/5 * * * * /usr/local/bin/utah run /opt/scripts/server-health-check.shx

Environment Configurationโ€‹

Create a .env file for customization:

LOG_LEVEL=INFO
MAX_RETRIES=3
ALERT_EMAIL=admin@company.com

Output Examplesโ€‹

Healthy Systemโ€‹

๐Ÿ” Starting comprehensive server health check...
โš™๏ธ Configuration loaded - Log Level: INFO, Max Retries: 3
โœ… Running with administrator privileges
๐Ÿ“Š Server OS: linux
๐Ÿ”ง Process ID: 12345
๐Ÿ’พ Memory Usage: 45% (OK)
โšก CPU Usage: 25% (OK)
๐ŸŒ Environment detected: production
โœ… INSTALLED: docker
โœ… INSTALLED: nginx
โœ… INSTALLED: git
โœ… INSTALLED: curl
๐Ÿ“ฆ Application Health: EXCELLENT (0 missing)
โฑ๏ธ Health check completed in 245ms
๐ŸŽ‰ All systems healthy!

System with Issuesโ€‹

๐Ÿ” Starting comprehensive server health check...
๐Ÿ“Š Server OS: linux
๐Ÿ’พ Memory Usage: 85% (CRITICAL)
โšก CPU Usage: 75% (WARNING)
โš ๏ธ HIGH MEMORY USAGE: 85% (threshold: 80%) - CRITICAL
โš ๏ธ HIGH CPU USAGE: 75% (threshold: 70%) - Action: THROTTLE_SERVICES
๐Ÿ”ง Issues detected (CRITICAL). Attempt CONSERVATIVE recovery? (y/n)

Benefits Over Traditional Bashโ€‹

  • Type Safety: All variables are strongly typed with clear interfaces
  • Error Handling: Built-in error handling with script.exitOnError()
  • Environment Detection: Automatic OS and environment detection
  • File Operations: Clean path manipulation without external tools
  • Interactive Prompts: Built-in user interaction functions
  • Comprehensive Logging: Structured logging with timestamps
  • Auto-Recovery: Intelligent recovery procedures based on severity