Status page methodology

What we measure

We track six independently monitored services:

Gateway — the front door at gw.justproxies.online. Authentication, routing, TLS termination for control-plane connections.
Residential pool — successful exit through a randomly selected residential IP at the moment of probe.
Mobile pool — same, against the rotating mobile pool.
Datacenter (rotating) — high-throughput pool.
Datacenter (static) — sticky datacenter IPs.
Dashboard & API — the customer-facing dashboard and the public REST endpoints.

Each service is treated as an independent unit of availability. The fleet-wide uptime number on /status is the arithmetic mean of the six per-service numbers — same metric, different aggregation.
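The aggregation described above can be sketched as a plain unweighted mean. The service keys and the percentages below are illustrative placeholders, not published figures.

```python
from statistics import fmean

# Hypothetical per-service uptime percentages (illustrative values only).
per_service_uptime = {
    "gateway": 99.99,
    "residential_pool": 99.90,
    "mobile_pool": 99.85,
    "datacenter_rotating": 99.97,
    "datacenter_static": 99.98,
    "dashboard_api": 99.95,
}


def fleet_uptime(per_service: dict[str, float]) -> float:
    """Unweighted arithmetic mean of the per-service uptime numbers."""
    return fmean(per_service.values())


print(round(fleet_uptime(per_service_uptime), 2))
```

Because the mean is unweighted, each service contributes one sixth of the fleet-wide number regardless of its traffic volume.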

Probe network

Probes run from a small set of geographically distributed monitoring nodes. Each node is independently provisioned and sits on a different network from the production fleet, so a production outage cannot also take the monitoring nodes down.

One node per region: North America (East), North America (West), Europe, Middle East, Asia-Pacific, South America.
Each node probes every service every 30 seconds.
A service is considered down at a given instant only if two or more probe nodes fail it inside the same 30-second window. A single probe failing is treated as the probe's problem, not the service's.
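A minimal sketch of the two-node rule follows. It assumes windows are fixed 30-second buckets aligned to the clock, which the text does not specify; the function name and input layout are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 30  # probe interval from the text
QUORUM = 2           # failing nodes required before the service counts as down


def down_windows(failures):
    """Return start times of windows in which the service counts as down.

    failures: iterable of (unix_timestamp, node_name) pairs, one per failed
    probe of a single service. Windows are aligned 30-second buckets
    (an assumption, the text does not say aligned or sliding).
    """
    nodes_by_window = defaultdict(set)
    for ts, node in failures:
        window_start = int(ts // WINDOW_SECONDS) * WINDOW_SECONDS
        nodes_by_window[window_start].add(node)
    # Count distinct nodes, so one node failing twice does not reach quorum.
    return sorted(w for w, nodes in nodes_by_window.items() if len(nodes) >= QUORUM)


failures = [(0, "eu"), (10, "eu"), (35, "eu"), (40, "apac"), (95, "sa")]
print(down_windows(failures))
```

In this example only the window starting at second 30 qualifies, because it is the only one in which two different nodes failed.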

Success criteria

Per probe:

Connection — TCP handshake completes within 5 s.
TLS — handshake completes within 5 s. Certificate is valid and chains to a public root.
Application — receive a 2xx response (or, for the pool probes, a successful exit through a real IP) within 15 s end-to-end.

Anything that misses any of those bars is recorded as a probe failure. Probe failures roll up into the daily availability number described below.
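The three bars can be read as a single pass/fail check over one probe's measurements. The sketch below covers the 2xx case only; the record layout and names are hypothetical, and a pool probe would substitute its own exit check for the status-code test.

```python
from dataclasses import dataclass
from typing import Optional

# Limits taken from the criteria above.
TCP_LIMIT_S = 5.0
TLS_LIMIT_S = 5.0
END_TO_END_LIMIT_S = 15.0


@dataclass
class ProbeSample:
    """One probe's measurements (hypothetical record layout)."""
    tcp_seconds: Optional[float]   # None if the TCP handshake never completed
    tls_seconds: Optional[float]   # None if the TLS handshake never completed
    cert_valid: bool               # certificate valid and chained to a public root
    status_code: Optional[int]     # None if no HTTP response arrived
    total_seconds: float           # end-to-end elapsed time


def probe_succeeded(s: ProbeSample) -> bool:
    """Return True only if every bar is met; any miss is a probe failure."""
    if s.tcp_seconds is None or s.tcp_seconds > TCP_LIMIT_S:
        return False
    if s.tls_seconds is None or s.tls_seconds > TLS_LIMIT_S or not s.cert_valid:
        return False
    if s.status_code is None or not 200 <= s.status_code < 300:
        return False
    return s.total_seconds <= END_TO_END_LIMIT_S
```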

Daily roll-ups

Each calendar day produces one tile per service:

Operational (green) — at least 99.95% of probes for this service that day succeeded.
Degraded (orange) — at least 98.0% but below 99.95% probe success.
Down (red) — below 98.0% probe success.
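As a sketch, the three bands map onto a small classifier. It treats exactly 98.0% as degraded, since only values below 98.0% are red, and it compares integers to avoid floating-point rounding at the boundaries. The function name is hypothetical.

```python
def daily_tile(succeeded: int, attempted: int) -> str:
    """Classify one service-day from its probe counts."""
    if attempted <= 0:
        raise ValueError("no probes attempted for this day")
    # Integer cross-multiplication: succeeded / attempted >= 99.95%
    if succeeded * 10000 >= attempted * 9995:
        return "operational"
    # succeeded / attempted >= 98.0%
    if succeeded * 100 >= attempted * 98:
        return "degraded"
    return "down"


# Assumes all six nodes probe every 30 s for the whole day: 6 * 2880 probes.
PROBES_PER_DAY = 6 * 2 * 60 * 24
print(daily_tile(PROBES_PER_DAY - 8, PROBES_PER_DAY))
print(daily_tile(PROBES_PER_DAY - 9, PROBES_PER_DAY))
```

Under that full-schedule assumption a service sees 17,280 probes per day, so a green tile tolerates at most 8 failed probes, and a tile turns red at 346 or more.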

The uptime percentage shown next to each service is the mean probe-success ratio across the trailing 30 days. The latency callout (when present) is the P50 probe round-trip for the day, after stripping the slowest 1% of samples.
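Both figures can be sketched with the standard library. The helper names are hypothetical, and rounding the trimmed 1% down to a whole number of samples is an assumption, since the text does not say how the cut is rounded.

```python
from statistics import fmean, median


def uptime_30d(daily_ratios):
    """Mean of the last 30 daily probe-success ratios, as a percentage.

    Each day carries equal weight, however many probes ran that day.
    """
    last_30 = list(daily_ratios)[-30:]
    return 100 * fmean(last_30)


def latency_callout_ms(rtts_ms):
    """Median round-trip after dropping the slowest 1% of samples."""
    ordered = sorted(rtts_ms)
    drop = len(ordered) // 100          # slowest 1%, rounded down (assumption)
    kept = ordered[: len(ordered) - drop]
    return median(kept) if kept else None


print(round(uptime_30d([1.0] * 29 + [0.97]), 2))
print(latency_callout_ms(range(1, 201)))
```

Trimming before taking the median has little effect on P50 itself; it mainly keeps a handful of timeouts from skewing the sample set.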

Incident threshold

We surface an entry in Recent incidents for any service-day that fell into the orange or red band, with:

The service affected and the calendar date.
The duration in minutes — the time from first probe-failure to the point where probe-success ratio recovered above the operational threshold.
A one-line root-cause summary written by the operator who handled the event.

Routine maintenance windows and customer-side issues (target rate limits, the customer's own egress problems, AUP-flagged accounts) are not counted as incidents. The page is an honest availability log, not a blame log.

Definitions

Probe — a single end-to-end check against one service from one monitoring node.
Probe-success ratio — successful probes divided by total probes attempted, in a defined window.
P50 — median round-trip across all successful probes in a window.
30-day uptime — mean probe-success ratio across the trailing 30 calendar days, weighted equally per day.
Resolved incident — an incident whose probe-success ratio has been above the operational threshold for at least the last full hour.
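One possible reading of the resolved-incident definition is sketched below. It assumes the "last full hour" is a trailing 3,600-second window and that meeting the 99.95% operational threshold exactly counts as recovered; both are assumptions, and the function name is hypothetical.

```python
def is_resolved(probes, now, window_s=3600):
    """probes: iterable of (unix_timestamp, succeeded) pairs for one service.

    Returns True if the probe-success ratio over the trailing window is at
    or above 99.95%. With no probes in the window the incident stays open.
    """
    recent = [ok for ts, ok in probes if now - window_s <= ts <= now]
    if not recent:
        return False
    # Integer comparison avoids rounding at the threshold.
    return sum(recent) * 10000 >= len(recent) * 9995


# Six nodes probing every 30 s produce 720 probes per hour.
clean_hour = [(t, True) for t in range(0, 3600, 5)]
print(len(clean_hour), is_resolved(clean_hour, now=3600))
```

With 720 probes in an hour, a single failure yields 719/720, roughly 99.86%, so under these assumptions an incident resolves only after an hour with no failed probes.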

Subscriptions

Email [email protected] to be added to the change-of-state alert list. We send one email per state transition (operational → degraded, degraded → operational), not one per probe.
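The one-email-per-transition rule amounts to comparing each observed state with the previous one. This is a minimal sketch with hypothetical names, not a description of the actual alerting pipeline.

```python
def transitions(states):
    """Yield (previous, current) for each change of state in a sequence.

    Repeated observations of the same state produce nothing, so the number
    of alerts tracks state changes, not probe count.
    """
    previous = None
    for current in states:
        if previous is not None and current != previous:
            yield (previous, current)
        previous = current


observed = ["operational", "operational", "degraded", "degraded", "operational"]
for before, after in transitions(observed):
    print(f"alert: {before} -> {after}")
```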