Your WordPress site shows a blank page with the text "503 Service Unavailable", or Chrome reports "This page isn't working: HTTP ERROR 503". The response inspector shows status code 503. The error appears for visitors and for you, in every browser, on mobile and on desktop. Sometimes it clears on a reload, sometimes it sticks for the duration of a traffic spike.
What a 503 actually means
RFC 9110 §15.6.4 defines 503 as: "the server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay." The key phrase is unable. Not crashed. Not slow. Not broken upstream. The server is alive and willing to talk to you; it just declines to handle this request right now and expects to be available again later. The same section adds that the server MAY send a Retry-After header field telling the client when to come back.
That definition matters because it tells you what a 503 is not. A 503 is not the application erroring, and it is not the proxy giving up on a slow upstream. It is a deliberate "no" from somewhere in your stack. The job of diagnosis is to find which part of the stack said no, and why.
A 503 is also distinct from the other 5xx errors, and the difference matters for diagnosis:
- 503 Service Unavailable: the server is refusing to handle the request right now. Overload, rate limit, or maintenance. This article.
- 502 Bad Gateway: the upstream returned an invalid or empty response, usually because the worker crashed mid-request.
- 504 Gateway Timeout: the upstream was reachable but took too long to answer. PHP is still alive, just slow.
- 500 Internal Server Error: the application itself errored cleanly and returned a 500 to the proxy.
In short: a 502 is a corpse, a 504 is silence, a 500 is an admission of guilt, and a 503 is a "go away".
There is one specific WordPress 503 worth flagging up front. If the response body says exactly "Briefly unavailable for scheduled maintenance. Check back in a minute.", that 503 is coming from WordPress itself during an update, not from your server being overloaded. WordPress's wp_maintenance() sends that message with status 503 and a Retry-After: 600 header while a .maintenance file exists in the site root. The fix for that case is in my article on the WordPress maintenance message. The rest of this article covers the generic server-side 503.
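To tell the WordPress maintenance 503 apart from a server-side 503 without opening a browser, you can inspect the raw response. The sketch below assumes you have curl available; is_wp_maintenance is a hypothetical helper name of mine, not something WordPress or curl provides, and yoursite.nl is the placeholder host used throughout this article.

```shell
#!/bin/sh
# Reads a raw HTTP response (headers + body, as printed by `curl -si`) on stdin
# and reports whether it looks like the WordPress maintenance-mode 503.
# is_wp_maintenance is a hypothetical helper, not part of WordPress or curl.
is_wp_maintenance() {
    # Strip carriage returns so the status line parses the same on every server.
    response=$(tr -d '\r')
    # The status code is the second field of the first line,
    # e.g. "HTTP/1.1 503 Service Unavailable" or "HTTP/2 503".
    status=$(printf '%s\n' "$response" | awk 'NR == 1 { print $2; exit }')
    if [ "$status" = "503" ] &&
       printf '%s\n' "$response" | grep -q 'Briefly unavailable for scheduled maintenance'; then
        echo "wordpress-maintenance"
    else
        echo "other"
    fi
}

# Usage against a live site:
#   curl -si https://yoursite.nl/ | is_wp_maintenance
```

If it prints wordpress-maintenance, the Retry-After: 600 header should be visible in the same curl -si output, and the rest of this article does not apply to your case.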
Common causes, ordered by likelihood
1. The server is overloaded (CPU, RAM, or saturated PHP-FPM pool)
This is the cause behind most 503s on a busy site. The PHP-FPM pool ceiling is pm.max_children. Once every worker is busy and the listen backlog is also full, FPM can no longer accept new connections, and the front-end web server gives up and serves a 503 to whichever request was unlucky enough to be next. Some hosting control panels return a 503 directly when their per-account CPU or RAM cap is hit, before the request even reaches PHP. Either way, the symptom looks the same to the visitor and the underlying cause is the same: the box does not have the resources to take the request right now. This is the failure mode I describe in detail in my article on PHP workers.
2. nginx (or a WAF) is rate-limiting the client
nginx ships with two rate-limit modules. ngx_http_limit_req_module limits requests per second per key, and ngx_http_limit_conn_module limits concurrent connections per key. Both return 503 by default when the limit is hit (limit_req_status 503 and limit_conn_status 503), which is semantically the wrong code for a rate limit; see 429 Too Many Requests in WordPress for the correct rate-limit semantics. Many managed hosts ship aggressive defaults to protect the shared server, and a WAF layer (Wordfence, Cloudflare rate limiting rules, AWS WAF) will do the same. A burst from one IP, a misbehaving uptime monitor, or your own WP-Cron hammering an endpoint can all trip these limits and produce a 503 that targets one client only. The site is fine for everybody else, which is the giveaway.
3. The host or your operator put the site into maintenance mode
A planned maintenance window, a deploy script, or a manual "503 page" that the host serves during emergency intervention will all return 503 on purpose. Some hosts also serve a 503 when they have moved your site or paused it for billing reasons. From the visitor side this looks identical to the overload case, but the cause is intentional and the fix is "wait, or talk to the host". This is also where the WordPress-specific maintenance 503 lives, which I covered above.
4. The application returns 503 on purpose
This one is rare but worth knowing about. A plugin or custom code can call status_header( 503 ) (or similar) during install, on a specific endpoint, or while a long-running job is in progress. WooCommerce plugins sometimes return a 503 during checkout if a payment gateway is unreachable. A staging-mode plugin might respond with 503 to non-allowlisted IPs. The hallmark of this cause is that the 503 is consistent for a specific URL or visitor, and the rest of the site is fine. The nginx access log shows the 503 coming back from PHP-FPM (status 503 upstream), not from nginx itself.
5. The origin behind a CDN is unreachable, and the CDN serves a 503
If you use Cloudflare or another CDN and your origin goes down hard, the CDN cannot reach the upstream and serves its own 503 (or sometimes its own 502, depending on the failure mode). Cloudflare branded errors are easy to spot: the page header will say "Cloudflare". This is less common than causes 1 to 3, but I see it after a deploy that changes the origin IP without updating the DNS or the Cloudflare origin rule, or when an origin firewall starts blocking Cloudflare's IP ranges.
Diagnose which cause applies
These checks are non-destructive. Run them in order before you change a single config value.
Check 1: read the access log. This is the single most important step. Open the access log viewer in your hosting control panel (usually under "Logs", "Access Logs", or "Metrics"). If you have SSH access, the raw file is typically at /var/log/nginx/access.log. For a 503 you want the response code and the upstream_status field side by side. A line that looks like:
"GET / HTTP/1.1" 503 - "..." "..." rt=0.000 uct=- urt=-
with no upstream response time means nginx generated the 503 itself, which is cause #2 (rate limit) or cause #3 (maintenance returned by the front layer). A line like:
"GET / HTTP/1.1" 503 - "..." "..." rt=0.342 uct=0.000 urt=0.341 us=503
with a real urt and us=503 means PHP-FPM (or an app behind it) returned the 503 itself, which is cause #1 (FPM saturated, or WordPress generated the 503) or cause #4 (the app sent it on purpose).
You will know it worked when: you can quote the exact access log line for the failing request and you can match its timestamp to a visitor report, and you have decided whether the 503 came from nginx or from the upstream.
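If the log is too long to read by eye, the split between nginx-generated and upstream-generated 503s can be counted. This is a minimal sketch that assumes your access log uses a custom log_format with a us= upstream status field like the sample lines above; a stock nginx "combined" log does not include that field, and classify_503_lines is a name I made up for illustration.

```shell
#!/bin/sh
# Reads nginx access log lines on stdin and counts 503 responses by origin.
# Assumption: the log format appends "us=<upstream status>" as in the sample
# lines in this article. classify_503_lines is a hypothetical helper name.
classify_503_lines() {
    awk '
        # Match the status field that follows the quoted request line.
        /" 503 / {
            if ($0 ~ /us=503/) from_upstream++   # PHP-FPM or the app answered 503
            else               from_nginx++      # no upstream status: nginx said no
        }
        END {
            printf "nginx-generated: %d\n", from_nginx
            printf "upstream-generated: %d\n", from_upstream
        }
    '
}

# Usage, with the typical path mentioned above:
#   classify_503_lines < /var/log/nginx/access.log
```

A large nginx-generated count points at causes #2 and #3; a large upstream-generated count points at causes #1 and #4.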
Check 2: read the error log. Most hosting panels have an error log viewer next to the access log viewer. Look there first. If you have SSH access, the raw file is typically at /var/log/nginx/error.log. For a 503 from cause #2 you will see:
limiting requests, excess: 1.234 by zone "one", client: 1.2.3.4, server: yoursite.nl, request: "GET / HTTP/1.1"
That is limit_req rejecting a request. Or:
limiting connections by zone "addr", client: 1.2.3.4, server: yoursite.nl
That is limit_conn. For cause #1 (FPM pool saturated) you will see:
WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
in the PHP-FPM error log (visible in some hosting panels under "PHP logs"; if you have SSH access, check /var/log/php8.3-fpm.log), and the nginx error log will show timeouts or connection failures around the same time. That message and what to do about it lives in my article on PHP workers.
You will know it worked when: you have a log line that names the cause: limiting requests, limiting connections, or pm.max_children. If none of those appear, the 503 is coming from a layer you do not control directly (cause #3 or #5).
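With SSH access, the three signatures can be counted in one pass instead of scrolling through the logs. The sketch below is only an illustration: count_503_signatures is a hypothetical helper, and the log paths in the usage comment are the typical ones named above, which may differ on your host.

```shell
#!/bin/sh
# Counts how often each 503 signature from this article appears in the
# given log files. count_503_signatures is a hypothetical helper name.
count_503_signatures() {
    for pattern in 'limiting requests' 'limiting connections' 'pm.max_children'; do
        # -F matches the pattern literally; -c prints the number of matching lines.
        printf '%s: %s\n' "$pattern" "$(cat -- "$@" | grep -cF -- "$pattern")"
    done
}

# Usage (paths are the typical defaults, adjust to your server):
#   count_503_signatures /var/log/nginx/error.log /var/log/php8.3-fpm.log
```

Three zeros is itself a result: it suggests the 503 comes from a layer you do not control directly.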
Check 3: check resource usage during the incident. Open the resource graphs in your hosting control panel (CPU, RAM, and PHP workers or "processes" if shown). If CPU or RAM is pinned at 100% during the moments the 503 appeared, you are in cause #1. Many managed hosts show a "PHP workers" or "active processes" graph that tells you the same thing without SSH.
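If you have SSH access: run ps -ef | grep 'php-fpm: pool' | grep -v grep | wc -l on the server. This counts every spawned worker in the pool, busy or idle. Compare the number to the pm.max_children value in your pool config (typically /etc/php/8.3/fpm/pool.d/www.conf). If the worker count equals pm.max_children during the incident, you are in cause #1. With pm = static the count always equals the ceiling, so rely on the pm.max_children warning from Check 2 instead. If the count is well below the ceiling, the pool is not the bottleneck and the 503 is not from FPM.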
You will know it worked when: you can state whether resources were maxed out or had headroom at the moment of the 503.
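The comparison between the worker count and the ceiling can be scripted so you can run it repeatedly during an incident. This is a sketch under assumptions: pool_max_children and pool_verdict are hypothetical helper names, and the config path in the usage comment is the typical Debian/Ubuntu location mentioned above.

```shell
#!/bin/sh
# Print the pm.max_children value from a PHP-FPM pool config file.
# Lines starting with ";" are comments in pool files and do not match.
pool_max_children() {
    sed -n 's/^[[:space:]]*pm\.max_children[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        tail -n 1
}

# Compare a worker count ($1) against the ceiling ($2) and print a verdict.
pool_verdict() {
    workers=$1
    ceiling=$2
    if [ "$workers" -ge "$ceiling" ]; then
        echo "saturated"
    else
        echo "headroom: $((ceiling - workers)) workers"
    fi
}

# Usage on the server, reusing the ps pipeline from this check:
#   workers=$(ps -ef | grep 'php-fpm: pool' | grep -v grep | wc -l)
#   pool_verdict "$workers" "$(pool_max_children /etc/php/8.3/fpm/pool.d/www.conf)"
```

Run it a few times while the 503s are happening; a single reading outside the incident window tells you very little.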
Check 4: bypass the CDN. If you use Cloudflare or a similar edge proxy, check whether the 503 page is branded with the CDN's logo or styling. A Cloudflare-branded 503 page is unmistakable and tells you the failure is at the edge, not your origin. You can also try pausing Cloudflare temporarily from the Cloudflare dashboard to see if the site loads directly.
If you have SSH access or a terminal: hit the origin IP directly with curl --resolve yoursite.nl:443:1.2.3.4 https://yoursite.nl/ and time the response. If the origin returns a clean 200 and only the edge URL returns 503, the failure is at the edge: cause #5. If the origin also returns 503, the failure is at your stack and you are back in causes #1 to #4.
You will know it worked when: you can state with certainty whether the failure is at the CDN edge or at your origin server.
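The edge-versus-origin comparison boils down to two status codes. The following sketch wraps it in two small functions; http_status and edge_or_origin are hypothetical names, and yoursite.nl and 1.2.3.4 are the same placeholders as in the curl command above, to be replaced with your domain and real origin IP.

```shell
#!/bin/sh
# Print only the HTTP status code for a URL; extra arguments go to curl.
# curl prints 000 when it cannot connect at all.
http_status() {
    url=$1
    shift
    curl -s -o /dev/null -w '%{http_code}' --max-time 15 "$@" "$url"
}

# Interpret a pair of status codes: $1 = via the edge, $2 = direct to origin.
edge_or_origin() {
    if [ "$2" = "503" ]; then
        echo "origin: causes 1 to 4"
    elif [ "$1" = "503" ]; then
        echo "edge: cause 5"
    else
        echo "no 503 reproduced"
    fi
}

# Usage:
#   edge=$(http_status https://yoursite.nl/)
#   origin=$(http_status https://yoursite.nl/ --resolve yoursite.nl:443:1.2.3.4)
#   edge_or_origin "$edge" "$origin"
```

Because --resolve only overrides DNS for that one request, this leaves the Cloudflare configuration untouched.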
Solutions, per cause
Cause #1 fix: the server is overloaded
The right fix is rarely "add more workers". Adding workers to a saturated pool usually moves the symptom from a 503 to a slow site, because the workers are being held by the same slow query, external API call, or runaway plugin. Read the dedicated article on PHP workers for the structural fix. The short version is:
- Find what is holding the workers (PHP-FPM slowlog plus a process snapshot during the incident).
- Cache that page or endpoint at the edge so it does not hit PHP at all.
- Move long jobs out of the request lifecycle into Action Scheduler or WP-CLI.
- Only after those, raise pm.max_children if your CPU and RAM headroom allow it.
If the overload is at the hosting account level (per-account CPU or RAM cap, which the host's panel will tell you), the same logic applies: cache more, do less per request, and only resize the plan once you have proven the workload itself is the right size.
Verification: the active worker count during peak is meaningfully below pm.max_children, the host's resource graphs are not pinned at 100%, and the same URL that 503'd before now returns a 200 in the access log.
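If you do reach the point of raising pm.max_children, a common rule of thumb (my addition here, not a formula from the PHP-FPM documentation) is to divide the RAM you can spare for PHP by the average memory of one worker. The sketch below shows that arithmetic; avg_rss_mb and estimate_max_children are hypothetical helper names and the 2048 MB budget is an invented example.

```shell
#!/bin/sh
# Average resident memory in MB, read from stdin where the first field of
# each line is an RSS value in kilobytes (as printed by `ps -o rss=`).
avg_rss_mb() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%d\n", sum / n / 1024 }'
}

# Rule-of-thumb ceiling: memory budget for PHP ($1, MB) divided by the
# average worker size ($2, MB), rounded down.
estimate_max_children() {
    if [ "$2" -le 0 ]; then
        echo "average worker memory must be positive" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}

# Usage on the server (2048 is an assumed budget, pick your own):
#   avg=$(ps -e -o rss= -o args= | grep 'php-fpm: pool' | grep -v grep | avg_rss_mb)
#   estimate_max_children 2048 "$avg"
```

Treat the result as an upper bound for RAM only. It says nothing about CPU or about the database connections those extra workers will open.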
Cause #2 fix: a rate limit fired
First, decide whether the limit is doing its job. If the source IP in the error log belongs to a real bot you do not want (an aggressive scraper, a credential-stuffing crawl), the 503 is correct and the fix is to leave it alone and possibly tighten the rule. If the source IP is your own monitor, your own WP-Cron, or a real visitor caught by an overly tight default, the rule is wrong for your traffic.
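To see which clients the limit is actually hitting, the error log can be grouped by source IP. This is a sketch that assumes the limit_req log line format quoted earlier in this article; rate_limited_clients is a hypothetical helper name.

```shell
#!/bin/sh
# Lists clients rejected by limit_req, most frequently limited first.
# Output format: "<client ip> <number of rejected requests>".
# rate_limited_clients is a hypothetical helper name.
rate_limited_clients() {
    grep -F 'limiting requests' "$1" |
        # Keep only the value of the "client: <ip>," field.
        sed -n 's/.*client: \([^,]*\),.*/\1/p' |
        sort | uniq -c | sort -rn |
        awk '{ print $2, $1 }'
}

# Usage (typical path, adjust to your server):
#   rate_limited_clients /var/log/nginx/error.log
```

If the top entry is your uptime monitor or the server's own IP, the rule is wrong for your traffic. If it is an address you do not recognise sending hundreds of requests, the rule is probably doing its job.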
If your rate limit comes from a Cloudflare WAF rule or another edge-level rate limit, adjust it directly in the Cloudflare dashboard under Security > WAF > Rate limiting rules. If the rate limit comes from your hosting server's nginx configuration, contact your host's support and ask them to raise the limit_req burst allowance for your site, or whitelist the IP that is being blocked. On managed hosting this is almost always a support ticket, because you typically do not have direct access to nginx config files.
If you have SSH access and control over your nginx configuration, raise the limit_req rule by editing the relevant zone:
limit_req_zone $binary_remote_addr zone=one:10m rate=20r/s;

server {
    location / {
        limit_req zone=one burst=40 nodelay;
        # ... rest of the location ...
    }
}
rate is the steady allowance, burst is the queue depth, and nodelay lets bursts through immediately instead of trickling them.
Verification: the same client that was getting 503s now gets 200s, the error log no longer contains limiting requests for that source IP, and the bots you actually wanted to block are still being blocked.
Cause #3 fix: the host or operator put the site into maintenance
If you put it there, take it out: remove the maintenance page, end the deploy, or unpause the site. If the host put it there, the answer is in their status page or support ticket and there is nothing for you to fix in WordPress itself. If the maintenance message is the WordPress-specific "Briefly unavailable for scheduled maintenance" string, the fix is in my article on that exact message, because there is a specific .maintenance file to remove from the site root.
Verification: the maintenance page is gone, the response code is no longer 503, and a normal page renders.
Cause #4 fix: the application is returning 503 on purpose
Find which plugin or theme is responsible. Enable WP debug logging by opening wp-config.php through your hosting panel's file manager (or via SFTP) and adding the following lines before the comment that begins /* That's all, stop editing!:
define( 'WP_DEBUG', true );
define( 'WP_DEBUG_LOG', true );
define( 'WP_DEBUG_DISPLAY', false );
Reproduce the 503, then download wp-content/debug.log through the file manager and look for warnings logged around the same time. If nothing useful surfaces, deactivate plugins one at a time from wp-admin under Plugins > Installed Plugins until the 503 stops. If you cannot reach wp-admin because of the 503, use the file manager to rename individual plugin folders under wp-content/plugins/ (e.g., rename problematic-plugin to problematic-plugin.disabled); WordPress can no longer find a plugin whose folder has been renamed and deactivates it. The plugin you disabled just before the 503 stopped is the one calling status_header( 503 ). Update it, configure it correctly, or replace it. Switching to a default theme like Twenty Twenty-Four via Appearance > Themes is the same trick for the theme layer.
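With SSH access there is a shortcut before you start disabling things: search the plugin and theme code for places that set a 503 explicitly. The sketch below is a rough heuristic, not a complete detector; find_503_callers is a hypothetical helper name, and it only catches literal status_header( 503 ) and http_response_code( 503 ) calls, not a status passed through a variable or through wp_die().

```shell
#!/bin/sh
# Recursively search PHP files under a directory for code that sets a 503
# status explicitly. Prints "path:line:matching code" for each hit.
# find_503_callers is a hypothetical helper name.
find_503_callers() {
    grep -rnE --include='*.php' \
        'status_header\([[:space:]]*503|http_response_code\([[:space:]]*503' "$1"
}

# Usage from the WordPress root:
#   find_503_callers wp-content/plugins
#   find_503_callers wp-content/themes
```

A hit is a lead, not a verdict: many plugins contain a 503 code path that only runs in a maintenance or staging mode you may never have enabled.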
After troubleshooting, set WP_DEBUG and WP_DEBUG_LOG back to false in wp-config.php and delete wp-content/debug.log, so debug output is not exposed in production.
Verification: the URL that consistently returned 503 now returns a 200, and no plugin in the stack is intentionally signaling 503 for normal traffic.
Cause #5 fix: the CDN cannot reach the origin
If the 503 is branded with the Cloudflare logo (or whichever CDN you use), the failure is at the edge. Confirm with curl --resolve that the origin is healthy. If it is, the CDN is talking to the wrong IP, an origin firewall is dropping CDN traffic, or a Cloudflare origin rule points at a host that no longer exists. Update the DNS A record or the origin rule to the current IP, and confirm the origin firewall allows Cloudflare's IP ranges. If the origin is down, fix the origin first (causes 1 to 4) and the edge will catch up automatically.
Verification: the same URL that 503'd through the edge now returns a 200 through the edge, and the origin access log shows a matching 200 with no upstream errors.
When to escalate
If the steps above do not pinpoint the cause within 30 minutes, hand the incident off to your host or developer. Have these ready, because the first thing they will ask for is exactly this list:
- The exact URL or admin action that triggers the 503.
- The time of the error, with timezone, and whether it is reproducible or only sporadic.
- Your hosting tier and stack (shared, VPS, managed, container; nginx or Apache; PHP version).
- The matching line from the nginx access log, including the upstream status field if present.
- The matching line from the nginx error log, especially anything containing limiting requests, limiting connections, or pm.max_children.
- The active worker count at the time of the 503 versus the configured pm.max_children.
- Whether the 503 is on every URL or only on specific endpoints, and whether it affects every visitor or only specific source IPs.
- Whether the site is behind Cloudflare or another edge proxy, and whether the 503 also happens when you bypass it.
- The list of plugins active on the site, especially any that were updated in the last 48 hours.
Send those in the first message. It saves a full round trip and routes the ticket straight to the right engineer.
How to prevent it from coming back
A persistent 503 is almost always a sizing or rate-limit problem dressed up as a server failure. Three things keep the error rare on a healthy site:
- Cache aggressively at the edge. A request served from a full-page cache never reaches PHP-FPM, so it cannot saturate the pool. Sites that serve 95% of their traffic from cache survive on small pools. Sites where every request is dynamic (membership sites, large WooCommerce stores with logged-in customers) need either bigger pools or shorter requests.
- Tune rate limits to your real traffic. Default limit_req rules and Cloudflare WAF defaults are designed to protect a server from anonymous abuse, not from your own monitor or your own cron. Whitelist your monitor's IPs, your office IP, and the IPs of any service that talks to your API legitimately. The trick is to have rate limits that bite bots and never bite you.
- Monitor the FPM pool, not just CPU. Hosts that show "all green" on CPU dashboards can still serve 503s because PHP workers are waiting on a database, not on CPU. Track the active worker count and the FPM error log for pm.max_children warnings. Worker saturation is the real signal, and it shows up before the 503 does.
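One way to track the active worker count is the PHP-FPM status page. The sketch below parses its plain-text output. It assumes you have enabled pm.status_path in the pool config and exposed it on a local-only URL; the /status path and the 127.0.0.1 address in the usage comment are assumptions about your setup, and fpm_pool_usage is a hypothetical helper name.

```shell
#!/bin/sh
# Reads the plain-text PHP-FPM status page on stdin and prints the three
# numbers that matter for saturation. fpm_pool_usage is a hypothetical helper.
fpm_pool_usage() {
    # Split each line on the colon and the padding spaces after it, and
    # compare the whole key so "max active processes" is not mistaken for
    # "active processes".
    awk -F': *' '
        $1 == "active processes"     { active = $2 }
        $1 == "total processes"      { total = $2 }
        $1 == "max children reached" { reached = $2 }
        END {
            printf "active=%d total=%d max_children_reached=%d\n", active, total, reached
        }
    '
}

# Usage, assuming pm.status_path = /status is reachable locally:
#   curl -s http://127.0.0.1/status | fpm_pool_usage
```

A max_children_reached value that keeps climbing between checks is the early warning: the pool has hit its ceiling even if no visitor has reported a 503 yet.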
If a single request to your site can answer in under one second cold, with no edge cache, and the workers return to the pool intact afterward, then a 503 should be impossible during normal operation. Anything else is the system telling you that the workload is too big for the current pool, the rate limit is too tight for the current traffic, or something is intentionally telling visitors to come back later.