DNS Timeouts (Slow or Failing Resolution)
The scenario (Why you care)
“The page just spins forever.” When DNS is slow or silently failing, users rarely say “DNS is broken.” They say that websites hang on the first load, some applications never reach their servers, or VPN logins stall before showing any login screen. In many environments, a DNS timeout is the first failure the user feels.
From the host’s perspective, a DNS timeout means it sent a query and waited for an answer that never arrived. Most stub resolvers retry the same query, or try a secondary resolver, before finally giving up. The result is a long gap in application behavior, typically 3–10 seconds of waiting before anything else can happen. In packet captures that delay is visible and measurable, provided you know where to look and can distinguish a true timeout from a normal “no such name” answer (NXDOMAIN) or from an application that is simply slow.
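The length of that gap follows from the client's retry settings. As a rough sketch, assuming a simplified model in which the client waits one full timeout per resolver in every attempt round (real stub resolvers may back off or rotate servers differently), the worst-case stall can be estimated as shown here. The function name and the sample values are illustrative and not taken from any particular operating system.

```python
def worst_case_stall(timeout_s: float, attempts: int, resolvers: int) -> float:
    """Upper bound on the wait if every configured resolver stays silent.

    Simplified model (an assumption): the client waits the full timeout for
    each resolver, once per attempt round, with no backoff.
    """
    if timeout_s < 0 or attempts < 1 or resolvers < 1:
        raise ValueError("timeout must be >= 0; attempts and resolvers >= 1")
    return timeout_s * attempts * resolvers


# Illustrative values only; check the real settings on the affected host.
print(worst_case_stall(timeout_s=2.0, attempts=2, resolvers=2))  # 8.0
```

Comparing this estimate with the gap measured in the capture is a quick sanity check: a stall that matches the configured retry budget points at unanswered queries rather than a slow application.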
This topic focuses on how to read captures when “DNS looks slow” or “name resolution sometimes never completes” and how to decide whether the issue is the client, the resolver, or the network between them.
What “good” looks like (Success pattern)
- What you see
- For each application access, you see one or more DNS queries followed very quickly by matching responses (usually within a few milliseconds to tens of milliseconds). Transaction IDs match, response codes are “NoError”, and the client moves on immediately to TCP/TLS connections. Repeated visits often show fewer DNS queries because of caching.
- Why it happens
- The client can reach at least one healthy recursive resolver, that resolver can reach the authoritative servers, and the entire path for queries and responses is free of blocking or asymmetric routing. Caching at multiple layers (OS, browser, resolver) reduces the number of queries needed after the first lookup.
- Key clue
- In the capture, the time delta between DNS query and response is small and consistent, and you see the client immediately open TCP connections to the returned IP addresses without long idle gaps.
What goes wrong (Failure pattern)
- What you see
- DNS queries go out, but responses do not come back in a timely way. You may see the same query retransmitted several times to the same resolver or to multiple resolvers configured on the client. Eventually the client gives up, reports a timeout, or falls back to cached data. In other cases, some names resolve instantly while others for the same application hang for seconds or never complete.
- Likely causes
- Typical causes include: the primary resolver being unreachable or overloaded, a firewall or ACL blocking UDP/53 or large responses, broken anycast routing to a public resolver, middleboxes mangling EDNS0 or DNSSEC-related responses, or misconfigured clients pointing at a non-existent DNS server. Sometimes only specific domains are affected because an upstream forwarder or authoritative server is slow or unreachable.
- Key clue
- The defining sign of a DNS timeout problem is a long gap between the query and any usable response (often multiple seconds), combined with repeated queries for the same name and no DNS response code such as NXDOMAIN or SERVFAIL that would allow the application to fail fast.
Signals & decision table
This decision table helps you separate “DNS timeout” from “normal negative answer” and pinpoint whether the problem is at the client, resolver, or network layer.
| Signal you see | What it suggests | What to check next | Related field/protocol |
|---|---|---|---|
| DNS query sent, no response, repeated several times | Resolver unreachable or replies blocked | Verify if any response from that resolver IP appears; check UDP/53 reachability and firewall rules | DNS header (Transaction ID), UDP/53 |
| First DNS lookup slow, later lookups fast for same name | Upstream resolver or authoritative path slow, caching hides problem after first hit | Compare RTT for cache miss vs hit; inspect which server the slow query targets | DNS flags (RD/RA), response TTL |
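| NXDOMAIN or SERVFAIL returned quickly (no long gaps) | Not a timeout; the name does not exist (NXDOMAIN) or the resolver could not complete the lookup (SERVFAIL) | Confirm RCODE and question name; for NXDOMAIN check the queried name and zone data, for SERVFAIL check the resolver's upstream path and DNSSEC validation | DNS RCODE, question section |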
| Client queries multiple resolvers in sequence before success | Primary resolver slow or down; secondary works | Check health and latency of each resolver; ensure client resolver order is intentional | DNS server IPs, query timing |
| Only large responses (many records, DNSSEC, EDNS0) seem affected | Truncation or UDP size issues causing retries or drops | Look for TC flag and TCP fallback; examine EDNS0 advertised size and firewalls on UDP fragments | DNS flags (TC), EDNS0, TCP/53 |
| DNS looks fine, but application still waits before connect | Problem likely above DNS (TLS handshake or server latency) | Correlate DNS finish time with first TCP SYN to target IP | DNS vs TCP timeline |
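The distinction the table draws between a timeout, a negative answer, and a slow answer can be condensed into a small classifier. This is a minimal sketch: the function name, the outcome labels, and the one-second threshold are assumptions chosen for illustration and should be tuned to the environment.

```python
from typing import Optional

SLOW_THRESHOLD_S = 1.0  # assumed cut-off between "fast" and "slow"; tune as needed


def classify_lookup(response_time_s: Optional[float], rcode: Optional[str]) -> str:
    """Classify one lookup from its response time and response code.

    response_time_s is None when no response was ever captured.
    """
    if response_time_s is None:
        return "timeout"              # query sent, nothing came back
    if rcode in ("NXDOMAIN", "SERVFAIL"):
        return "negative answer"      # an answer arrived, so not a timeout
    if response_time_s >= SLOW_THRESHOLD_S:
        return "slow answer"          # resolver or upstream path is slow
    return "healthy"


print(classify_lookup(None, None))        # timeout
print(classify_lookup(0.02, "NXDOMAIN"))  # negative answer
print(classify_lookup(3.5, "NOERROR"))    # slow answer
print(classify_lookup(0.015, "NOERROR"))  # healthy
```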
How it looks in Wireshark
Display filter example:
dns && dns.flags.response == 0
- Focus on queries first: with the filter above you see outbound DNS requests and can measure how long it takes before the matching response (same Transaction ID, source/destination swapped) appears.
- Timeouts show up as queries with no matching response at all, followed by identical retransmissions or queries to a different DNS server IP. The gaps between these attempts correspond directly to the user’s perceived delay.
- By adding a column for the DNS response time (Wireshark exposes it as the dns.time field on matched responses) or by following a single Transaction ID (dns.id), you can quickly distinguish consistent low-latency lookups (healthy) from occasional very slow or missing responses (suspect path or resolver).
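The matching rule described above (same Transaction ID, source and destination swapped) can also be applied offline to exported packet records. The sketch below is illustrative only: the DnsPacket class, the match_transactions function, and the sample records are hypothetical, and the addresses come from documentation ranges. Clients that choose a fresh Transaction ID for each retry will show up as several separate unanswered transactions rather than as retransmissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsPacket:
    ts: float          # capture timestamp, seconds
    src: str           # source IP
    dst: str           # destination IP
    txid: int          # DNS Transaction ID
    is_response: bool  # QR flag: False = query, True = response
    qname: str         # name in the question section


def match_transactions(packets):
    """Pair queries with responses by (client, server, Transaction ID)."""
    open_tx = {}    # key -> transaction still waiting for a response
    finished = []   # transactions that received a response
    for pkt in sorted(packets, key=lambda p: p.ts):
        if pkt.is_response:
            # A response travels server -> client, so swap the addresses.
            tx = open_tx.pop((pkt.dst, pkt.src, pkt.txid), None)
            if tx is not None:
                tx["response_time"] = pkt.ts - tx["first_ts"]
                finished.append(tx)
        else:
            key = (pkt.src, pkt.dst, pkt.txid)
            if key in open_tx:
                open_tx[key]["retransmits"] += 1   # same query sent again
            else:
                open_tx[key] = {"qname": pkt.qname, "server": pkt.dst,
                                "first_ts": pkt.ts, "retransmits": 0,
                                "response_time": None}
    # Whatever is still open never got an answer: the timeout candidates.
    return finished + list(open_tx.values())


sample = [
    DnsPacket(10.0, "192.0.2.10", "192.0.2.53", 0x1A2B, False, "app.example.com"),
    DnsPacket(10.5, "192.0.2.53", "192.0.2.10", 0x1A2B, True, "app.example.com"),
    DnsPacket(11.0, "192.0.2.10", "192.0.2.53", 0x3C4D, False, "vpn.example.com"),
    DnsPacket(16.0, "192.0.2.10", "192.0.2.53", 0x3C4D, False, "vpn.example.com"),
]
for tx in match_transactions(sample):
    print(tx["qname"], tx["response_time"], tx["retransmits"])
```

Measuring from the first query of a transaction, rather than from the last retransmission, keeps the reported time close to what the user actually waited.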
Quick read tip: When an application “hangs on lookup,” align the DNS timeline with the moment the user clicked. If you see several seconds of repeated DNS queries with no answer before any TCP connection attempts, you are dealing with a DNS timeout, not a slow web server.
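The alignment described in the tip reduces to arithmetic on four timestamps. The sketch below is illustrative and applies only when a TCP connection eventually follows the lookup: the function name and the sample timestamps are hypothetical, and in practice the values would be read from the capture and the user's report.

```python
def split_delay(click_ts, first_query_ts, last_dns_ts, first_syn_ts):
    """Split the wait between click and first TCP SYN into three phases."""
    if not click_ts <= first_query_ts <= last_dns_ts <= first_syn_ts:
        raise ValueError("timestamps must be in chronological order")
    phases = {
        "before_dns": first_query_ts - click_ts,   # app start-up, cache checks
        "in_dns": last_dns_ts - first_query_ts,    # queries, retries, answer
        "after_dns": first_syn_ts - last_dns_ts,   # gap until the connect
    }
    dominant = max(phases, key=phases.get)         # phase that cost the most time
    return phases, dominant


phases, dominant = split_delay(100.0, 100.25, 106.25, 106.5)
print(phases, dominant)
```

If the dominant phase is "in_dns", the delay belongs to name resolution; if it is "after_dns", look at the connection or the server instead.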
Fast triage checklist
- Reproduce the issue while capturing on the client side. Note the exact time when the user triggered the failing action (for example opening a URL or starting a VPN).
- Filter on DNS and identify the first query for the name in question. Measure how long it takes until a response or until the client gives up and tries another resolver or another name.
- Classify the outcome: fast positive answer, fast NXDOMAIN/SERVFAIL, or repeated queries with no answer. Only the last pattern is a true timeout.
- If timeouts are confirmed, capture closer to the resolver or along the path to see whether responses are generated but lost in transit, or if the resolver itself is not replying.
- Decide where the permanent fix belongs: client settings (wrong DNS IPs, bad resolver order), resolver capacity or configuration, or network policy (firewalls, rate limits, UDP/fragment handling) between client and resolver.
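Comparing a client-side capture with one taken near the resolver, as the checklist suggests, amounts to checking at which point each transaction disappears. A minimal sketch, assuming each capture has already been reduced to sets of (Transaction ID, query name) keys and that the Transaction ID is unchanged between the two capture points; the function name and verdict strings are hypothetical.

```python
def locate_loss(client_queries, resolver_queries,
                resolver_responses, client_responses):
    """Return a verdict per client query, keyed by (txid, qname)."""
    verdicts = {}
    for key in client_queries:
        if key not in resolver_queries:
            verdicts[key] = "query lost before reaching the resolver"
        elif key not in resolver_responses:
            verdicts[key] = "resolver received the query but did not reply"
        elif key not in client_responses:
            verdicts[key] = "response lost between resolver and client"
        else:
            verdicts[key] = "delivered"
    return verdicts


a = (0x1A2B, "app.example.com")
b = (0x3C4D, "vpn.example.com")
c = (0x5E6F, "mail.example.com")
verdicts = locate_loss(
    client_queries={a, b, c},
    resolver_queries={a, b},
    resolver_responses={a, b},
    client_responses={a},
)
for key in (a, b, c):
    print(key[1], "->", verdicts[key])
```

Each verdict maps to one of the fix locations in the last checklist item: path or policy problems for lost queries and responses, resolver problems when the query arrives but no reply is generated.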
Common pitfalls
Confusing NXDOMAIN with a timeout
A quick NXDOMAIN is a clear, useful answer, even if the user sees an error page. It means the DNS machinery worked, but the name itself is invalid or misconfigured. Treating NXDOMAIN as a “DNS timeout” leads you to chase network ghosts instead of fixing hosts files, zone records, or application configuration that references the wrong name.
Only capturing above the resolver
If you capture on the far side of a recursive resolver (for example at an internet edge), you may only see the resolver’s traffic to authoritative servers, not the original client queries. When troubleshooting timeouts, you need a view that includes the client-to-resolver leg; otherwise, you cannot tell whether the resolver never got the query, never answered, or answered but the client never received the response.
Ignoring multiple resolvers and split-horizon setups
Many clients use more than one DNS server, and many enterprises use split-horizon DNS with different answers internally and externally. If you only look at one resolver’s traffic, you can miss that the client is quietly failing over to another server or getting conflicting answers from different views. Always check which resolver IP each query is sent to and whether timeouts occur only when a specific resolver is involved.
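To check whether timeouts follow one resolver, group lookups by the server IP they were sent to. The sketch below assumes each lookup has already been reduced to a (resolver IP, response time or None) pair; the function name and the addresses are placeholders.

```python
from collections import defaultdict


def timeout_rate_by_resolver(lookups):
    """Return the fraction of unanswered queries per resolver IP.

    lookups: iterable of (resolver_ip, response_time_s or None).
    """
    totals = defaultdict(lambda: [0, 0])   # resolver -> [queries, timeouts]
    for resolver, response_time in lookups:
        totals[resolver][0] += 1
        if response_time is None:          # no response captured
            totals[resolver][1] += 1
    return {r: timeouts / queries for r, (queries, timeouts) in totals.items()}


lookups = [
    ("192.0.2.53", None),
    ("192.0.2.53", None),
    ("192.0.2.53", 0.02),
    ("198.51.100.53", 0.03),
    ("198.51.100.53", 0.04),
]
for resolver, rate in timeout_rate_by_resolver(lookups).items():
    print(f"{resolver}: {rate:.0%} of queries unanswered")
```

A rate that is high for one resolver and near zero for the others suggests a problem with that server or the path to it, not with the client.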