Uptime monitoring runs automated checks (usually HTTP requests, sometimes DNS or TCP probes) against a site at fixed intervals, such as every 60 seconds. If a check fails or returns an unexpected status, the monitor raises an alert so the operator can investigate before customers notice. These synthetic checks complement real user monitoring (RUM), which only produces data when real visitors are present.
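As a rough illustration of a single HTTP check, here is a minimal sketch using only the Python standard library. The names `CheckResult`, `classify`, and `run_check`, the 10-second timeout, and the choice of 200 as the expected status are assumptions made for this example, not part of any particular monitoring product.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    ok: bool
    status: Optional[int]        # None when no HTTP response was received
    elapsed_s: float             # wall-clock duration of the check
    error: Optional[str] = None  # transport-level error message, if any


def classify(status: Optional[int], expected: int = 200) -> bool:
    """A check passes only if a response arrived with the expected status."""
    return status is not None and status == expected


def run_check(url: str, timeout_s: float = 10.0, expected: int = 200) -> CheckResult:
    """Perform one HTTP GET and report whether it matched the expected status."""
    start = time.monotonic()
    status: Optional[int] = None
    error: Optional[str] = None
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with a 4xx/5xx status
    except OSError as exc:
        error = str(exc)   # DNS failure, refused connection, timeout, TLS error
    elapsed = time.monotonic() - start
    return CheckResult(classify(status, expected), status, elapsed, error)
```

A scheduler would call `run_check` at the chosen interval (for example, once every 60 seconds) and hand any result with `ok == False` to the alerting logic.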
Why it matters
Undetected downtime costs revenue, search ranking, and user trust. Crawlers that hit repeated 5xx responses tend to deprioritise the URL until it responds cleanly again. A "99.9% uptime" SLA still permits roughly 8.8 hours of outage per year (0.1% of 8,760 hours), and that budget is often consumed by a handful of long incidents rather than many short ones. Catching an outage in the first minute can be the difference between a status-page note and a postmortem.
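The downtime budget implied by an availability target is simple arithmetic. The sketch below assumes a 365-day year (8,760 hours); the function name `downtime_budget_hours` is made up for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def downtime_budget_hours(availability_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of outage allowed in the period by an availability target."""
    return (1 - availability_pct / 100) * period_hours


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {downtime_budget_hours(target):.2f} hours per year")
```

For a 99.9% target this gives about 8.76 hours per year; each additional "nine" cuts the budget by a factor of ten.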
How to set it up
- Probe from multiple geographic regions so a single ISP issue does not page everyone.
- Validate response bodies, not just HTTP status codes: a "maintenance mode" page often returns a clean 200.
- Track TTFB on the same checks; latency creep is an early warning before hard failures.
- Monitor DNS resolution and TLS certificate expiry separately; both can fail while the origin server itself is healthy.
- Require at least 2 of 3 probes to confirm a failure before paging a human, to avoid alert fatigue from transient blips.
- Publish a public status page so customers can self-serve incident updates.
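Several of the points above (body validation, latency tracking, and 2-of-3 confirmation) can be combined in a small evaluation step. The following is a sketch under stated assumptions: `ProbeResult`, the region labels, the `"Welcome"` marker string, and the 100 ms warning threshold are all hypothetical, and "2-of-3" is read here as "at least two of three regional probes failed in the same round".

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ProbeResult:
    region: str     # hypothetical region label
    status: int     # HTTP status code seen by this probe
    body: str       # response body as text
    ttfb_ms: float  # time to first byte, in milliseconds


def probe_passed(result: ProbeResult, must_contain: str) -> bool:
    """Pass only if the status is 200 AND the body contains the expected marker."""
    return result.status == 200 and must_contain in result.body


def failing_regions(results: Sequence[ProbeResult], must_contain: str) -> List[str]:
    """Regions whose probe failed the status or body check."""
    return [r.region for r in results if not probe_passed(r, must_contain)]


def should_page(results: Sequence[ProbeResult], must_contain: str,
                quorum: int = 2) -> bool:
    """Page a human only when at least `quorum` probes agree on a failure."""
    return len(failing_regions(results, must_contain)) >= quorum


def slow_regions(results: Sequence[ProbeResult], warn_ttfb_ms: float) -> List[str]:
    """Regions whose TTFB exceeds the warning threshold (a warning, not a page)."""
    return [r.region for r in results if r.ttfb_ms > warn_ttfb_ms]


# Example round: one healthy probe, one "clean 200" maintenance page, one 503.
results = [
    ProbeResult("us-east", 200, "<h1>Welcome</h1>", 120.0),
    ProbeResult("eu-west", 200, "<h1>Maintenance mode</h1>", 95.0),
    ProbeResult("ap-south", 503, "", 30.0),
]
print(should_page(results, must_contain="Welcome"))  # True: two of three failed
print(slow_regions(results, warn_ttfb_ms=100.0))     # ['us-east']
```

Keeping the latency warning separate from the paging decision means a slow but correct response shows up on a dashboard without waking anyone, while the maintenance page that returns 200 still counts as a failure.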