Why your uptime monitor says your WireGuard server is up (when it's actually broken)
This is the single most common failure mode I see when teams set up VPN monitoring for the first time. Their uptime dashboard shows a healthy server. Users are in Slack reporting broken tunnels. The monitor keeps saying green. This post is about why that happens, and what to do about it.
If you run WireGuard in production and your monitoring setup ends at a TCP or UDP port check, you have this bug. You just haven't been bitten by it yet.
The short version
WireGuard is a stateless, authenticated UDP protocol. The server accepts packets on a single UDP port, 51820 by default, and responds only to packets that present a valid Curve25519 handshake. If a packet doesn't decrypt correctly, the server drops it silently. No TCP reset, no ICMP unreachable, no log line (unless you've turned on debug logging).
That means a UDP port probe, the kind almost every generic uptime monitor does, is useless as a health check. The WireGuard kernel module will receive your probe, attempt to decrypt it as a handshake, fail, and drop it on the floor. From the outside, it looks like the port is open. From the inside, nothing useful happened.
Meanwhile, a real WireGuard client might be getting the exact same silent treatment. A key rotation that wasn't propagated, an expired preshared key, a routing misconfig, a DDoS rate-limiter kicking in, or twenty other things. The port is still "open". Your monitor still says green.
What a WireGuard handshake actually involves
WireGuard's handshake is a variant of the Noise protocol framework. The simplified flow for a client connecting to a server:
1. Client sends handshake_initiation (148 bytes), containing its ephemeral public key, its static public key (encrypted using the server's known static public key), and a MAC.
2. Server verifies the MAC, decrypts the static key, and looks it up in its list of configured peers. If found, it continues. If not, it silently drops the packet.
3. Server sends handshake_response (92 bytes), containing its own ephemeral public key plus confirmation material.
4. Client verifies. Now both sides have derived session keys.
5. Data packets flow, with both sides rotating keys every 120 seconds or after 2^60 messages.
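Those message sizes aren't arbitrary; they fall out of the fixed field layouts in the protocol description. A quick sanity check of the byte counts:

```python
# Field sizes (bytes) of the two handshake messages, per the WireGuard
# protocol description. Encrypted fields are plaintext + 16-byte Poly1305 tag.
initiation = {
    "type": 1, "reserved": 3, "sender_index": 4,
    "unencrypted_ephemeral": 32,
    "encrypted_static": 32 + 16,      # client static pubkey + AEAD tag
    "encrypted_timestamp": 12 + 16,   # TAI64N timestamp + AEAD tag
    "mac1": 16, "mac2": 16,
}
response = {
    "type": 1, "reserved": 3, "sender_index": 4, "receiver_index": 4,
    "unencrypted_ephemeral": 32,
    "encrypted_empty": 0 + 16,        # empty plaintext, AEAD tag only
    "mac1": 16, "mac2": 16,
}
print(sum(initiation.values()), sum(response.values()))  # → 148 92
```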
The critical observation: the server's response in step 3 only happens if the client presented a valid initiation in step 1. A random UDP probe does not. The server treats it as noise and drops it. No response goes back to the prober.
So what does a port monitor actually observe when it sends a UDP probe to a WireGuard server?
Usually, nothing. No response. Most UDP port monitors interpret "no response" as "port is open" (UDP is connectionless, so there's no positive "accepted" signal like SYN-ACK in TCP). The server could be fully broken; the monitor still sees green.
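You can see why "no response" is a worthless signal by simulating the silent-drop behavior locally: bind a UDP socket that never replies (standing in for WireGuard dropping an invalid packet) and watch a naive prober conclude the port is fine. This is a sketch, not real monitor code:

```python
import socket

# A stand-in "server" that, like WireGuard facing an invalid packet,
# receives the probe and never responds.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # OS picks a free port
port = server.getsockname()[1]

def naive_udp_probe(host: str, port: int, timeout: float = 1.0) -> str:
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.settimeout(timeout)
    probe.sendto(b"\x00" * 16, (host, port))
    try:
        probe.recvfrom(1024)
        return "got reply"
    except socket.timeout:
        # This is the case most monitors report as "open" / healthy
        return "open|no-response"

print(naive_udp_probe("127.0.0.1", port))   # → open|no-response
```

The "server" here could be completely broken and the probe result would be identical.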
A small number of monitors do ICMP-unreachable detection: if the server's firewall is configured to send ICMP port unreachable for closed UDP ports, a port monitor can distinguish open from closed. But most WireGuard deployments don't send ICMP unreachable, they just drop, because doing so leaks information to port scanners. So this signal is usually unavailable too.
Failure modes port monitors miss
Here's a non-exhaustive list of ways your WireGuard server can be totally broken while a port monitor reports green:
1. Key rotation not propagated
You rotated the server's static private key (via wg genkey) and pushed the new PublicKey value to all clients via config management. But one fleet, say the mobile clients still pinned to an old config, has the old public key. Those clients can no longer handshake: the server sees their packets as invalid and drops them. The port is still open.
2. Preshared key expired or mismatched
If you use PresharedKey (recommended for post-quantum resistance), and it's out of sync between the server's peer config and the client's, handshakes fail. Silent drop. Port still open.
3. AllowedIPs misconfigured
The server accepts handshakes but routing is broken. AllowedIPs on the server's peer config doesn't include the client's tunnel IP. Handshake succeeds; data packets from the client get dropped by the WireGuard cryptokey routing layer. From the client's perspective, the tunnel is "up" but nothing works. From outside, port monitor is green.
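As a concrete illustration (addresses and keys are hypothetical), the server-side peer entry must cover the client's tunnel address in AllowedIPs, or the cryptokey routing layer drops the client's data packets even after a successful handshake:

```ini
# Server-side peer entry (illustrative; keys truncated)
[Peer]
PublicKey = hIDyGQq...client-public-key...=
# Must cover the client's tunnel IP (10.0.0.2 here). If this says
# 10.0.0.3/32 by mistake, handshakes succeed but data is dropped.
AllowedIPs = 10.0.0.2/32
```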
4. Kernel module loaded but process state broken
systemd says wg-quick@wg0 is active. ip link show wg0 shows the interface. But a peer was removed at runtime via wg set wg0 peer <key> remove, and the config on disk fell out of sync. The server still accepts handshakes from peers that remain configured, but packets from the removed peer are silently dropped. Still green.
5. DDoS rate-limiter burning legitimate handshakes
WireGuard has built-in cookie-based DoS protection. Under sustained handshake floods, the server starts replying with cookie challenges instead of handshake responses. Clients that don't handle cookies correctly (some older userspace implementations) fail. The server is "up" in every technical sense; specific clients just can't connect. Port monitor green.
6. The interface is up but the userspace process is wedged
Rare, but seen in the wild: a wireguard-go userspace instance with one worker thread stuck on a spin lock. The interface appears active, but no packets are processed, and only a restart fixes it. The port monitor stays green, because the socket is still bound.
How to actually test a WireGuard server
The honest answer is: you perform the real handshake. There's no shortcut. Here are three ways to do it, in increasing order of convenience.
Option 1: use wg itself
If you have a client config for the server, stand up a peer, try a handshake, check the result:
```shell
# on a check host with a monitoring-only peer key
sudo wg-quick up wg-check

# wait a couple of seconds, then check
sudo wg show wg-check latest-handshakes
# expected: unix timestamp of a recent handshake
# failure:  "0" (epoch, never handshook)
```
You can script this. Handshake happens on first data packet, so a quick ping 10.0.0.1 across the tunnel forces it. If latest-handshakes updates, the server is alive. If it sits at 0 for more than 10 seconds, something's broken.
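A minimal sketch of that script: parse the tab-separated `<peer-pubkey>\t<unix-timestamp>` lines that `wg show <iface> latest-handshakes` prints and flag any peer whose last handshake is 0 or too old. The interface name and threshold are illustrative:

```python
def stale_peers(output: str, now: float, max_age: float = 180.0) -> list[str]:
    """Return peer pubkeys whose latest handshake is missing or too old.

    `output` is the text printed by `wg show <iface> latest-handshakes`:
    one base64 pubkey and a unix timestamp per line, tab-separated; 0 = never.
    """
    stale = []
    for line in output.strip().splitlines():
        pubkey, ts = line.split("\t")
        if int(ts) == 0 or now - int(ts) > max_age:
            stale.append(pubkey)
    return stale

# Wire it up with something like:
#   out = subprocess.run(["wg", "show", "wg-check", "latest-handshakes"],
#                        capture_output=True, text=True, check=True).stdout
#   alert if stale_peers(out, time.time()) is non-empty
```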
Downsides: you need a real peer provisioned just for monitoring, you need wg-quick on the check host (usually root), and you can't easily do this from lots of regions.
Option 2: roll your own handshake initiation
The WireGuard handshake is well-documented and small. You can build an initiation packet in Python:
```python
# pip install pynacl
import os, struct, time, socket
import nacl.bindings as nb

MSG_INITIATION = 1

def build_handshake_initiation(server_pubkey: bytes,
                               client_privkey: bytes,
                               client_pubkey: bytes,
                               sender_index: int) -> bytes:
    # Simplified. A real implementation needs the Noise_IK state machine,
    # a TAI64N timestamp, and MAC1 + MAC2 computed over the right byte ranges.
    # See github.com/WireGuard/wireguard-go/device/noise-protocol.go for
    # a reference you can port.
    ...

# then:
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)
sock.sendto(build_handshake_initiation(...), ("203.0.113.1", 51820))
try:
    data, _ = sock.recvfrom(4096)
    # First byte of a handshake_response is message type = 2
    if data[0] == 2 and len(data) == 92:
        print("healthy")
    else:
        print(f"unexpected response: type={data[0]}, len={len(data)}")
except socket.timeout:
    print("no response, probably broken")
```
This works but is a non-trivial amount of crypto code to get right. The MAC1/MAC2 computation is where most hand-rolled implementations get it wrong. Worth it if you have a strong reason to avoid adding a tool dependency.
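For reference, mac1 on its own is small; the trap is computing it over exactly the right bytes. Per the WireGuard paper, mac1 is a 16-byte keyed BLAKE2s over every packet byte preceding the mac1 field, keyed with the BLAKE2s-256 hash of the literal label "mac1----" concatenated with the responder's static public key. A sketch using only the standard library:

```python
import hashlib

LABEL_MAC1 = b"mac1----"

def mac1(server_static_pubkey: bytes, msg_before_mac1: bytes) -> bytes:
    """mac1 = BLAKE2s-128(key = BLAKE2s-256("mac1----" || S_pub), msg)."""
    key = hashlib.blake2s(LABEL_MAC1 + server_static_pubkey).digest()
    return hashlib.blake2s(msg_before_mac1, digest_size=16, key=key).digest()

# The server checks mac1 before doing any expensive public-key work,
# which is exactly why a probe with a bad mac1 is dropped so cheaply.
```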
Option 3: use a tool that already does this
This is the pragmatic option. Several tools exist:
- wireguard-go itself can be scripted to do the handshake and report.
- nmap with the wireguard NSE script can probe, though coverage is limited.
- Or, full disclosure, TunnelHQ does this continuously from distributed check nodes and alerts when the handshake fails. (You can try it free with 5 monitors, or use the one-shot tester without an account for URI-based protocols. WireGuard needs an account because it handles your private key.)
What about HTTPS-wrapping, Tailscale, headscale?
Some WireGuard deployments are wrapped in additional layers: Tailscale coordinates peer configuration via a hosted control plane, headscale is a self-hosted implementation of that control plane, and Innernet routes configuration through a central coordinator. These control planes have their own health signals, usually a REST API or a status endpoint.
That's great, but it doesn't substitute for end-to-end handshake checks. The control plane might report all peers connected while the WireGuard data plane between two peers is broken by a rogue firewall rule. Monitor both: the control plane for coordination health, the data plane for actual tunnel health.
The bigger lesson
This isn't really about WireGuard specifically. The pattern generalizes:
If a protocol silently drops invalid packets (which any well-designed authenticated protocol does, to resist scanning), then a port probe tells you approximately nothing about whether the protocol is working.
OpenVPN mostly does the same thing. So does IKEv2, VLESS, VMess, Shadowsocks, Trojan. So do most modern VPN and proxy protocols. Port probes only work for things that respond to every TCP connection or UDP packet, regardless of content. That's essentially HTTP, SSH, SMTP, and similar server protocols where the protocol literally announces itself on connect.
For VPN monitoring, and really for any authenticated protocol, your choices are:
- Perform the actual protocol handshake (correct, more work).
- Have the service itself emit a heartbeat to an external monitor like Healthchecks.io (works but only tells you the process is running, not that clients can connect).
- Use synthetic user monitoring: a real client connects end-to-end on schedule (closest to what users experience, most work).
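For the heartbeat option, the check-in itself is trivial; the subtlety is signaling failure explicitly rather than just going silent. With Healthchecks.io-style ping URLs (the UUID below is a placeholder), that's one extra path segment:

```python
def checkin(uuid: str, ok: bool = True,
            base: str = "https://hc-ping.com") -> str:
    # Healthchecks.io convention: GET <base>/<uuid> on success,
    # <base>/<uuid>/fail on failure. Silence past the grace period alarms.
    url = f"{base}/{uuid}" + ("" if ok else "/fail")
    # urllib.request.urlopen(url, timeout=10)  # uncomment to actually ping
    return url

print(checkin("00000000-0000-0000-0000-000000000000", ok=False))
```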
Pick any of those. Just don't rely on port probes.
If you want this handled for you
TunnelHQ performs real WireGuard handshakes against your servers every 1 to 10 minutes from check nodes in US, EU, APAC, and SA. When a handshake fails, an alert hits Slack, email, Telegram, Discord, or a webhook within a second. Free for 5 monitors, no credit card.
Start free or read the WireGuard monitoring page