Set Up Server Uptime Monitoring with Bash and Cron Jobs

What Uptime Monitoring Actually Means and Why It Matters

A website that goes down without anyone noticing is still a problem. The longer an outage lasts, the more it affects visitors, transactions, and trust. Uptime monitoring runs checks against your server and services at regular intervals, alerting you the moment something stops responding correctly.

This differs from general server monitoring with tools like htop and Netdata, which focus on resource usage such as CPU load, memory consumption, and active processes. Uptime monitoring specifically answers the question: is my service reachable and returning the correct response? Both approaches are useful, and many production setups use them together. If you are looking to understand the broader picture of server monitoring, there is a practical guide on setting up htop and Netdata in the technical portfolio.

The core question uptime monitoring answers is straightforward. Your server can be running perfectly fine internally while something prevents visitors from reaching it. Monitoring closes that gap.

Why Waiting for Users to Report Problems Is a Poor Strategy

Most visitors who encounter a broken website do not report it. Some assume the problem is on their end. Some leave immediately and do not return. Some vent on social media rather than reaching out directly. By the time you hear about an outage, it has typically been running long enough to affect a meaningful portion of your audience.

Automated monitoring catches problems that users do not report. Intermittent failures that resolve before a visitor decides to complain, degraded performance that slows pages without making them completely unavailable, and regional issues that affect only certain network providers or geographic areas are all visible through active monitoring but invisible to passive observation.

For a production website, this is not optional. It is a basic operational requirement, much like keeping regular backups or applying security updates. If you are running a business website without monitoring in place, you are relying on chance rather than awareness.

A Simple Cron-Based Uptime Monitor

For a single server or a small number of services, a bash script run by cron provides effective monitoring without installing additional software. The script checks whether each service responds correctly and sends an alert only when something fails.

#!/bin/bash

# uptime_monitor.sh - checks if services are responding

check_service() {
    local url="$1"
    local name="$2"
    local expected_code="${3:-200}"
    local response

    # curl reports 000 when it cannot connect or the request times out
    response=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$url")

    if [ "$response" != "$expected_code" ]; then
        echo "ALERT: $name (expected $expected_code, got $response)"
        return 1
    else
        echo "OK: $name ($response)"
        return 0
    fi
}

# Check main website and API endpoints
check_service "https://example.com" "Main Website" "200"
check_service "https://api.example.com/health" "API Health Endpoint" "200"
check_service "https://shop.example.com" "Shop" "200"

Run this script every five minutes from cron:

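*/5 * * * * out=$(/root/scripts/uptime_monitor.sh 2>&1 | grep ALERT) && echo "$out" | mail -s "Uptime Alert on $(hostname)" admin@example.com

The grep ALERT filter drops the OK lines, and because grep exits non-zero when nothing matches, the mail command after && runs only when at least one check has failed. Piping grep straight into mail is less reliable, since some mail implementations send an empty message when they receive no input. When all services are healthy, nothing is sent. This is the key to effective monitoring: quiet when things are working, explicit when they are not.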
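The "mail only when something is wrong" behaviour can also be packaged as a small helper, which is easier to test than a crontab line. The following is a minimal sketch, not part of the original script: the function name send_if_alerts and the file name alert_filter.sh are hypothetical, and the helper assumes the monitor prints lines beginning with ALERT as shown above.

```shell
#!/bin/bash
# alert_filter.sh (hypothetical name): forward ALERT lines only when there are any

# send_if_alerts COMMAND [ARGS...]
# Reads monitor output on stdin. If it contains ALERT lines, they are piped
# into COMMAND. If it contains none, COMMAND is never started.
send_if_alerts() {
    local alerts
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # aborting callers that run under "set -e".
    alerts=$(grep '^ALERT' || true)

    if [ -z "$alerts" ]; then
        return 0
    fi

    printf '%s\n' "$alerts" | "$@"
}

# Example, assuming the script path and address used in this guide:
# /root/scripts/uptime_monitor.sh 2>&1 | send_if_alerts mail -s "Uptime Alert on $(hostname)" admin@example.com
```

Because the delivery command is passed as arguments, you can substitute cat while testing and confirm the filter behaves as expected before any email is involved.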

Adding Retry Logic to Reduce False Positives

Network glitches occasionally cause a single check to fail even when the service is running fine. This creates noise and can desensitise you to real alerts over time. A script that retries before alerting prevents most transient network issues from triggering unnecessary notifications.

#!/bin/bash

check_service() {
    local url="$1"
    local name="$2"
    local max_attempts=3
    local attempt=1

    while [ $attempt -le $max_attempts ]; do
        http_code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 15 "$url")
        curl_exit=$?

        if [ $curl_exit -eq 0 ] && [ "$http_code" = "200" ]; then
            return 0
        fi

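        echo "Attempt $attempt/$max_attempts failed for $name (HTTP $http_code)"
        attempt=$((attempt + 1))
        # Pause before the next attempt, but not after the final one
        [ $attempt -le $max_attempts ] && sleep 5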
    done

    echo "ALERT: $name is DOWN after $max_attempts attempts"
    return 1
}

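log_status() {
    local status="$1"
    local message="$2"
    # Include the status so the log can be filtered with grep
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$status] $message" >> /var/log/uptime_monitor.log
}

# Main checks
if ! check_service "https://example.com" "Main Website"; then
    log_status "DOWN" "Main Website failed"
    # mail reads the message body from stdin, so pipe one in
    echo "Main Website failed 3 consecutive checks" | mail -s "Main Website Down on $(hostname)" admin@example.com
else
    log_status "UP" "Main Website OK"
fi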

The script waits five seconds between attempts. Most network hiccups resolve before the second or third attempt. Only genuine outages trigger an alert, which keeps the signal-to-noise ratio high. Logging both successful and failed checks gives you a historical record you can review later to identify patterns or confirm when an issue was resolved.
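The retry pattern can be separated from the HTTP check so that the same loop serves any probe. The sketch below is one way to do that under stated assumptions: retry and http_ok are hypothetical helper names, not part of the script above, and the delay is passed as a parameter so it can be shortened during testing.

```shell
#!/bin/bash

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Sleeps DELAY_SECONDS between attempts, but not after the final one.
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1

    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical probe: succeeds only when the URL answers with HTTP 200.
http_ok() {
    local code
    code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 15 "$1") || return 1
    [ "$code" = "200" ]
}

# Example usage:
# retry 3 5 http_ok "https://example.com" || echo "ALERT: Main Website is DOWN after 3 attempts"
```

Keeping the loop generic means a DNS or disk probe can reuse it without duplicating the counter and sleep logic.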

Managing Multiple Services with a Configuration File

As the number of monitored services grows, hardcoding URLs directly into the script becomes difficult to maintain. A simple configuration file lets you add, remove, or modify monitored services without touching the script logic.

Create a configuration file at /root/scripts/services.conf:

# Format: URL|EXPECTED_CODE|ALERT_EMAIL
# Lines starting with # and empty lines are ignored

https://example.com|200|admin@example.com
https://api.example.com/health|200|admin@example.com
https://shop.example.com|200|admin@example.com

The monitoring script reads this file and processes each service:

#!/bin/bash

CONFIG="/root/scripts/services.conf"
LOG_FILE="/var/log/uptime_monitor.log"

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $1" >> "$LOG_FILE"
}

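send_alert() {
    local service="$1"
    local email="$2"
    # Pipe the body in explicitly: inside check_all, stdin is the config file,
    # and mail would otherwise consume the remaining entries as the message body
    echo "$service failed its check at $(date '+%Y-%m-%d %H:%M:%S')" | \
        mail -s "Alert: $service on $(hostname)" "$email"
}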

check_all() {
    local failed=0

    while IFS='|' read -r url expected_code alert_email; do
        # Skip comments and empty lines
        [[ "$url" =~ ^# ]] && continue
        [[ -z "$url" ]] && continue

        http_code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 15 "$url")
        curl_exit=$?

        if [ $curl_exit -ne 0 ] || [ "$http_code" != "$expected_code" ]; then
            log "DOWN: $url (expected $expected_code, got $http_code, curl exit $curl_exit)"
            send_alert "$url" "$alert_email"
            failed=$((failed + 1))
        else
            log "UP: $url"
        fi
    done < "$CONFIG"

    return $failed
}

check_all

Adding a new service means adding a single line to the configuration file. No script changes are required. This separation between configuration and logic makes it straightforward to hand over monitoring to someone else or rebuild the setup on a new server.
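Because anyone can now edit the configuration file, a malformed line is the most likely way for a check to break silently. The following sketch shows one possible validation step. The function name validate_entry is hypothetical, and the checks are deliberately loose: they catch missing or misplaced fields, not every invalid URL or address.

```shell
#!/bin/bash

# validate_entry URL EXPECTED_CODE ALERT_EMAIL
# Returns 0 when the three fields look plausible. Otherwise prints the
# reason and returns 1.
validate_entry() {
    local url="$1" code="$2" email="$3"

    case "$url" in
        http://*|https://*) ;;
        *) echo "invalid URL: '$url'"; return 1 ;;
    esac

    case "$code" in
        [0-9][0-9][0-9]) ;;
        *) echo "invalid status code: '$code'"; return 1 ;;
    esac

    case "$email" in
        *'|'*) echo "too many fields: '$email'"; return 1 ;;
        ?*@?*) ;;
        *) echo "invalid alert address: '$email'"; return 1 ;;
    esac
}

# Example, inside the existing read loop and before the curl call:
# validate_entry "$url" "$expected_code" "$alert_email" || { log "SKIPPED: $url"; continue; }
```

Skipping and logging a bad entry, rather than aborting, keeps one typo from disabling the checks for every other service in the file.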

Monitoring DNS Resolution Separately

HTTP checks alone do not cover every failure mode. DNS issues can make a site completely unreachable even when the web server is functioning correctly. If DNS resolution fails or points to the wrong IP address, visitors cannot reach your site regardless of how well your web server responds.

check_dns() {
    local domain="$1"
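    local expected_ip="${2:-}"
    local resolved_ip

    # Keep only IPv4 lines: for a CNAME, dig +short lists the target hostname first
    resolved_ip=$(dig +short "$domain" A | grep -E '^[0-9.]+$' | head -1)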

    if [ -z "$resolved_ip" ]; then
        echo "ALERT: DNS resolution failed for $domain"
        return 1
    fi

    if [ -n "$expected_ip" ] && [ "$resolved_ip" != "$expected_ip" ]; then
        echo "ALERT: DNS mismatch for $domain (expected $expected_ip, got $resolved_ip)"
        return 1
    fi

    echo "OK: $domain resolves to $resolved_ip"
    return 0
}

check_dns "example.com" "93.184.216.34"

DNS changes are rare compared to web server issues, so running DNS checks every 15 to 30 minutes is usually sufficient. Checking less frequently keeps the monitoring lightweight while still catching DNS problems before they cause extended outages. The optional expected IP parameter lets you verify that your domain still points to the correct server after a hosting change or migration.
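One caveat with comparing a single resolved address: a domain with several A records may return them in a different order on each query, so taking only the first line can produce a mismatch even when nothing has changed. The sketch below checks whether the expected address appears anywhere in the answer. The function name ip_in_answers is hypothetical, and the sample addresses in the usage line are placeholders.

```shell
#!/bin/bash

# ip_in_answers EXPECTED_IP
# Reads "dig +short" output on stdin and succeeds if EXPECTED_IP appears
# as one of the IPv4 lines. CNAME targets (hostnames) are ignored.
ip_in_answers() {
    awk -v want="$1" '
        /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $0 == want { found = 1 }
        END { exit !found }
    '
}

# Example usage, with the placeholder address from this guide:
# dig +short example.com A | ip_in_answers "93.184.216.34" \
#     || echo "ALERT: example.com no longer resolves to 93.184.216.34"
```

Reading the answers from stdin keeps the comparison independent of dig, so the logic can be exercised with canned input.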

Tracking SSL Certificate Expiry

An expired SSL certificate blocks visitors from accessing your site in modern browsers. Rather than discovering an expired certificate when visitors start complaining, monitor expiry proactively and renew before the deadline arrives. This is a common oversight that can be prevented with a simple automated check.

#!/bin/bash

check_ssl_expiry() {
    local domain="$1"
    local warn_days="${2:-30}"

    expiry_date=$(echo | openssl s_client -servername "$domain" -connect "$domain":443 2>/dev/null | \
        openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)

    if [ -z "$expiry_date" ]; then
        echo "ERROR: Could not retrieve SSL certificate for $domain"
        return 1
    fi

    # Whole days between now and the expiry date (date -d requires GNU date)
    days_until_expiry=$(( ( $(date -d "$expiry_date" +%s) - $(date +%s) ) / 86400 ))

    if [ "$days_until_expiry" -lt "$warn_days" ]; then
        echo "ALERT: SSL certificate for $domain expires in $days_until_expiry days ($expiry_date)"
        return 1
    else
        echo "OK: $domain SSL certificate valid for $days_until_expiry days"
        return 0
    fi
}

check_ssl_expiry "example.com" 30

Run this script daily via cron. A 30-day warning threshold gives adequate time to investigate renewal issues and complete the process before the certificate lapses. Certificate authorities such as Let's Encrypt support automated renewal through clients like Certbot, which can handle renewals without manual intervention. If you are already using automated renewal, monitoring still serves as a backup check in case the renewal process fails silently.
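The day calculation is the part of this check most worth testing in isolation, since an off-by-one or parsing error would go unnoticed until a certificate is close to expiry. The sketch below isolates it in a function that accepts an optional "current time", so it can be verified against fixed dates. The name days_until is hypothetical, and the function assumes GNU date, whose -d option parses the date format printed by openssl.

```shell
#!/bin/bash

# days_until EXPIRY_DATE [NOW_EPOCH]
# Prints the whole days from NOW_EPOCH (default: the current time) until
# EXPIRY_DATE. The result is truncated toward zero; a negative value means
# the date passed at least a full day ago.
# Assumes GNU date (date -d is not available in BSD or macOS date).
days_until() {
    local expiry_epoch now_epoch
    expiry_epoch=$(date -d "$1" +%s) || return 1
    now_epoch="${2:-$(date +%s)}"
    echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Example usage with the expiry_date value extracted by openssl:
# remaining=$(days_until "$expiry_date") && [ "$remaining" -lt 30 ] \
#     && echo "ALERT: certificate expires in $remaining days"
```

Passing a fixed epoch as the second argument makes the result reproducible, which is useful when checking the threshold logic before scheduling the script.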

Including Disk Space in Your Monitoring

Running out of disk space causes services to fail in unpredictable ways. Databases stop accepting writes, log files cannot be created, and applications crash without clear error messages. Adding a disk space check to your monitoring script catches this before it causes a production incident.

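check_disk_space() {
    local threshold="${1:-90}"
    local usage

    # -P keeps each filesystem on one line even when the device name is long
    usage=$(df -P / | tail -1 | awk '{print $5}' | tr -d '%')

    if [ "$usage" -gt "$threshold" ]; then
        echo "ALERT: Disk usage at ${usage}% (threshold: ${threshold}%)"
        # mail reads the message body from stdin, so pipe one in
        echo "Disk usage on / is at ${usage}% (threshold: ${threshold}%)" | \
            mail -s "WARNING: $(hostname) disk at ${usage}%" admin@example.com
        return 1
    else
        echo "OK: Disk usage at ${usage}%"
    fi
}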

check_disk_space 90

Set the threshold based on your typical usage patterns. If your server routinely sits at 75% disk usage, a 90% threshold gives early warning. If you typically run at 40%, you can set a lower threshold without triggering false alerts. The key is to know your baseline and choose a threshold that gives you enough time to react before the disk fills completely.
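The check above looks only at the root filesystem. If your server keeps data or logs on separate mounts, the same threshold can be applied to every line of df output. The following is a sketch under that assumption: the function name over_threshold is hypothetical, and it expects the POSIX layout produced by df -P, where the fifth column is the usage percentage and the sixth is the mount point. Mount points containing spaces are not handled.

```shell
#!/bin/bash

# over_threshold THRESHOLD
# Reads "df -P" output on stdin and prints one ALERT line per filesystem
# whose usage exceeds THRESHOLD percent. Returns 1 if any were printed.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 > limit + 0) {
                printf "ALERT: %s at %s%% (threshold: %s%%)\n", $6, use, limit
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}

# Example usage:
# df -P | over_threshold 90
```

On a real server df -P also lists pseudo-filesystems such as tmpfs, so you may want to pass specific mount points to df rather than checking everything.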

Scheduling Checks at Appropriate Intervals

Different monitoring checks suit different frequencies. HTTP and API health checks should run frequently enough to catch outages quickly, while DNS and SSL checks can run less often since those values change infrequently. The goal is to strike a balance between responsiveness and minimising unnecessary processing.

# /etc/cron.d/uptime-monitoring

# HTTP checks every 5 minutes
*/5 * * * * root /root/scripts/http_monitor.sh >> /var/log/uptime_monitor.log 2>&1

# DNS checks every 30 minutes
*/30 * * * * root /root/scripts/dns_monitor.sh >> /var/log/uptime_monitor.log 2>&1

# Disk checks every 2 hours
0 */2 * * * root /root/scripts/disk_monitor.sh >> /var/log/uptime_monitor.log 2>&1

# SSL expiry check daily at 9am
0 9 * * * root /root/scripts/ssl_expiry_check.sh >> /var/log/uptime_monitor.log 2>&1

Use a dedicated log file and rotate it to prevent it from growing indefinitely. Add this to /etc/logrotate.d/uptime-monitoring:

/var/log/uptime_monitor.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
}

This configuration keeps seven days of logs compressed on disk, which is usually enough to identify when an issue started and what preceded it. Log rotation also prevents the monitoring log from consuming valuable disk space on the very server you are monitoring.

Storing Monitoring Scripts and Configurations Safely

Your monitoring setup is only useful if it survives a server problem. Store your monitoring scripts in a location that is backed up regularly, or better yet, keep them in version control. If you ever need to rebuild or migrate your monitoring setup, having the scripts documented and accessible speeds up the recovery process significantly.

Include a README file with each monitoring script explaining what it checks, what dependencies it requires, and what the expected output looks like. This documentation helps when you return to the setup after several months or when someone else needs to maintain it.

Test your alerting mechanism periodically. Verify that alert emails arrive in your inbox and are not filtered as spam. Temporarily lower a threshold or use a test endpoint to confirm that notifications fire correctly. An untested alert is not a reliable alert.
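One way to exercise the alert path without taking a real service offline is to simulate a failing response. The sketch below does this by defining a shell function named curl that shadows the real binary for the duration of the test. The FAKE_STATUS variable is hypothetical, and check_service is repeated here from the first script only so the example is self-contained.

```shell
#!/bin/bash

# Copy of check_service from the first script in this guide.
check_service() {
    local url="$1"
    local name="$2"
    local expected_code="${3:-200}"
    local response

    response=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$url")

    if [ "$response" != "$expected_code" ]; then
        echo "ALERT: $name (expected $expected_code, got $response)"
        return 1
    fi
    echo "OK: $name ($response)"
}

# Test stub: shadows the real curl and prints the status code held in
# FAKE_STATUS instead of making a request. For testing only.
curl() {
    printf '%s' "${FAKE_STATUS:-200}"
}

# Example: simulate an outage and inspect the output.
# FAKE_STATUS=503
# check_service "https://example.com" "Main Website"
```

This confirms only that the script produces the ALERT line. To verify delivery and spam filtering, pipe that simulated line through your real mail command and check that it reaches your inbox.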

When to Add a Third-Party Monitoring Service

A self-hosted monitoring script works well for internal checks and small deployments. It has a fundamental limitation, though: it shares the fate of the server it runs on. If that server loses network connectivity, the script reports every remote service as down even when they are running fine, and the alert email may not be deliverable either. If the server itself goes down, the script does not run at all. This creates a blind spot exactly when you need visibility most.

Third-party monitoring services run checks from multiple geographic locations. They can alert you when your entire server is unreachable, not just when individual services fail. Popular options include UptimeRobot, Pingdom, and HetrixTools, which provide HTTP monitoring, DNS checks, SSL validation, and alerting through email, SMS, and webhooks.

A practical approach is to run your own monitoring for fast, internal checks and use a third-party service as an independent layer that verifies your public-facing services are actually reachable from the outside. This hybrid approach catches issues your internal monitoring cannot see, including complete server connectivity failures. For most small business websites, this layered strategy provides solid coverage without unnecessary complexity.

Keeping Your Monitoring Maintenance in Mind

Monitoring is not a set-it-and-forget-it solution. As your infrastructure changes, your monitoring should change too. Adding a new subdomain, moving an API endpoint, or switching hosting providers all require updates to your monitoring configuration.

Review your monitoring setup periodically to ensure the services being checked are still relevant and the thresholds are appropriate. A check that no longer matches your current architecture provides a false sense of security rather than genuine protection.

If you are managing multiple servers or a growing infrastructure, consider whether a dedicated monitoring platform such as Nagios, Zabbix, or Datadog would reduce the maintenance overhead of multiple custom scripts. These platforms offer centralised management, more sophisticated alerting rules, and visualisation tools that make it easier to understand system health at a glance.
