Scope: single-site as a starting point
We focus primarily on a single-site WordPress installation on Kubernetes. This is the most common scenario and makes the core problems—state, storage, updates, and lifecycle—most visible. In a single-site setup, one WordPress site is installed per Kubernetes deployment, with its own database and file storage. All attention can go to making this one site highly available and scalable, without complications from shared resources between sites.
Multisite (multiple sites within one WordPress installation) is treated briefly as an edge case. Multisite adds extra complexity on top of the single-site challenges. We will see that multisite has a larger blast radius—one problem in the platform can take down multiple sites at once—and that management (updates, rollbacks, governance) becomes harder. For those reasons we keep multisite out of the main line and discuss it separately later.
WordPress and stateful challenges on Kubernetes
WordPress is inherently a stateful application. It has two core components that must retain data: a MySQL database and a file directory for uploads (media) and extensions (plugins/themes). Traditionally these components run on one server (LAMP stack), but in a Kubernetes environment with ephemeral pods we must explicitly provide persistence. Without measures, data is lost on every pod restart or reschedule.
Persistent storage is therefore crucial. The database must run on a persistent volume or external
service; otherwise posts and settings would disappear on a pod restart. Similarly, media uploads and
uploaded plugins must be kept outside the container (e.g. via a PersistentVolumeClaim) so they
survive restarts. In short, we need storage for both the WordPress files (wp-content) and the MySQL
data.
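The storage requirement above can be sketched as a PersistentVolumeClaim for the wp-content directory. The name, size, and storage class below are illustrative; substitute whatever your cluster provides:

```yaml
# Illustrative PVC for wp-content; "standard" is an assumed storage class name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-wp-content
spec:
  accessModes:
    - ReadWriteOnce        # fine for one replica; multiple replicas need ReadWriteMany
  storageClassName: standard
  resources:
    requests:
      storage: 20Gi
```

The WordPress container would then mount this claim at /var/www/html/wp-content so uploads and plugins survive pod restarts.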
There's also the question of updates and mutations to the WordPress code itself. In a classic setup a user can install new plugins or themes and run updates via the WP admin. In Kubernetes we usually use immutable containers—the Docker image contains the application code. If we bake WordPress into the image, a problem appears as soon as someone adds a plugin in WP admin: the plugin installer writes only to the filesystem of one container, which is neither persistent nor shared. Other pods don't get that plugin, and a restart wipes the installation. We have two approaches:
- Disallow in-app changes (immutable image): The most controlled approach is to make WordPress effectively read-only for code. Set DISALLOW_FILE_MODS=true in wp-config.php, for example, so plugins/themes cannot be uploaded via the admin. All code changes (new plugins, updates) then happen through the CI/CD pipeline: you add the plugin to the code or Helm chart and build a new container image that you deploy. Each change goes through a controlled deployment rather than ad hoc via the web interface. Important: during upgrades, account for the database—if you update the image to a new WordPress version, ensure any database migrations are compatible with all running pods.
- Allow mutations with shared storage: If you do want users to install or upload plugins via the WP admin, you must make the WordPress filesystem shared and persistent between pods. This typically means a PersistentVolume mounted by all WordPress pods (with a ReadWriteMany volume if you run multiple replicas). For example: on AWS you'd use an EFS share mounted by each pod, or in Azure an Azure Files/NetApp Files share. On-premises you can think of NFS, GlusterFS, or an existing SAN. Such a shared volume ensures that a plugin upload or media file is available to all replicas at once. This approach introduces new considerations: the storage medium must be reliable and redundant (so no single node or disk failure causes data loss), it must deliver adequate performance, and it must handle concurrent writes from multiple pods safely. Technologies like Portworx or Ceph RWO/RWX volumes can facilitate this—Portworx, for example, advertises its shared volumes for scaling WordPress horizontally without losing uploads.
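The first option (immutable image) can be enforced at deploy time. A minimal sketch, assuming the official WordPress Docker image, which appends the WORDPRESS_CONFIG_EXTRA environment variable to wp-config.php on startup:

```yaml
# Fragment of a Deployment pod template; assumes the official wordpress image.
containers:
  - name: wordpress
    image: wordpress:6.5-apache   # illustrative tag
    env:
      - name: WORDPRESS_CONFIG_EXTRA
        value: |
          /* block plugin/theme installs and file edits via wp-admin */
          define('DISALLOW_FILE_MODS', true);
```

With this in place, any plugin change must go through an image rebuild and a normal rollout, which keeps all replicas identical.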
In short, WordPress on Kubernetes requires explicitly placing state outside the ephemeral containers. The common best practice is to use an external database and persistent storage for wp-content. Many experts recommend not hosting the database in Kubernetes itself but using a managed database service to spare you outages, backups, and scalability concerns. The files (uploads/plugins) can be placed on a shared volume or offloaded to object storage with a plugin (e.g. an S3 bucket via WP Offload Media). In both cases, the core WordPress data must be persistent.
Multisite: extra complexity
Although WordPress Multisite can look attractive (multiple sites running on one installation), it adds extra complexity on top of the challenges above. Some points where multisite is harder than separate sites:
- Larger blast radius: All sites share the same codebase and infrastructure. A faulty plugin update or a core update with a bug can take down all sites in the network at once. In a single-site setup such an error would only affect one site; multisite raises the stakes. This means updates in a multisite environment must be approached more carefully and formally—think extensive testing and staged rollouts. Rollback plans must cover rolling back an update across the entire network, not just one site.
- Shared resources & performance: In multisite, all sites share server resources and the database. They use separate tables, but still a single database and often the same PHP runtime. This means one heavy site (e.g. high traffic or inefficient queries) can overload the database or cache and slow down other sites. There is no isolation: a spike on site A can cause latency on site B because they draw from the same pool of PHP workers and DB connections. Extra measures like object caching (Redis/Memcached) are almost required for larger multisites to spread the load.
- Shared state and storage: At the storage level, more is shared as well. In a multisite, uploads are separated per site under subdirectories (/uploads/sites/{id}/), but ultimately everything resides on the same volume or bucket. An error or corruption in shared storage can impact all sites. The same goes for an object cache (one Redis instance for all sites) or a shared uploads NFS: there is one source that must work well for all sites.
- Governance & management: Multisite introduces the Super Admin role above site admins, enabling centralized management. This is useful for uniformity, but it requires clear governance: who may install new plugins/themes network-wide? How do you prevent a change for site X from negatively impacting site Y? Often stricter procedures are needed (RACI matrices, change approvals) to avoid chaos. Things like a staging environment per site are also more complex—you can't easily clone a single site from a multisite to staging without copying the entire database and then isolating that one site. Migrating one site out of a multisite into its own installation is notoriously difficult and time-consuming.
In conclusion, WordPress multisite on Kubernetes is even more challenging: every stateful challenge is amplified, because each decision affects multiple sites at once. Unless you specifically need the advantages of multisite (one codebase, shared management for closely related sites, etc.), it is often safer and more flexible to treat sites as separate deployments (possibly with management tools to manage them centrally). In this post we keep multisite as an edge case; most best practices focus on single sites.
Platform choice: managed cloud vs. on-premises Kubernetes
Where you run Kubernetes significantly affects the implementation details of WordPress. We broadly distinguish managed cloud Kubernetes (e.g. AWS EKS, Google GKE, Azure AKS) and on-premises/self-managed clusters (bare metal or your own VMs with Kubernetes).
In cloud environments you benefit from integrated services:
- Storage: Every cloud provides CSI drivers for volumes (think Amazon EBS, Azure Disks, Google Persistent Disk) that automatically provision PersistentVolumes. Storage is effectively unlimited with high reliability. You can also often use managed filesystems (such as AWS EFS or Azure Files) for shared storage, or object storage (S3/Blob) for media. This makes it relatively easy to attach a PersistentVolumeClaim to cloud storage without managing your own storage cluster.
- Load balancing & networking: In the cloud, Kubernetes can automatically create an external LoadBalancer resource (e.g. AWS ELB/ALB) when you expose a Service or Ingress. This is "LB as a service"—you don't have to run your own load balancer. Network configuration (subnets, routing) is often partly out of the box or via cloud-managed CNI integrations. Cloud K8s offers one-click SSL (e.g. via AWS ACM) or integration with cloud CDN/DNS services.
- Managed databases and caches: You can choose not to run the database in-cluster but use a cloud managed DB service instead (such as Amazon RDS, Cloud SQL, Azure Database for MySQL). As mentioned earlier this reduces operational headache for backups, upgrades, and failover. The same applies to caching: a managed Redis service can be used instead of a container. With cloud IAM you can also securely manage credentials and access (e.g. a Pod IAM role that has access to a specific database or bucket).
- Upgrades & cluster management: Managed K8s services often handle control plane upgrades for you and provide easy node pool updates. This means less maintenance work and often minimal downtime for Kubernetes version upgrades. The cloud also handles node replacement during hardware issues: if a VM crashes, the autoscaler spins up a new one.
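As a sketch of the cloud storage integration, a shared-filesystem StorageClass plus claim might look like this. The AWS EFS CSI driver is assumed here, and the filesystem ID is a placeholder; check the driver's documentation for the full parameter set:

```yaml
# Assumes the AWS EFS CSI driver is installed; the fileSystemId is a placeholder.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-shared
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap          # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-uploads
spec:
  accessModes:
    - ReadWriteMany                 # many pods can mount the share concurrently
  storageClassName: efs-shared
  resources:
    requests:
      storage: 10Gi                 # EFS is elastic, but the field is required
```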
In on-premises or self-hosted Kubernetes, you must provide these aspects yourself:
- Storage: You must implement a storage solution for persistent volumes yourself. This can range from simple NFS servers to advanced SDS (Ceph, GlusterFS) or integrating a SAN. CSI driver maturity can be a factor—not all storage backends are as easy or stable as cloud variants. You'll need to set up provisioning and likely replication carefully to guarantee data availability.
- Load balancing & networking: Without the cloud you must set up your own way to expose the cluster externally. Often people use MetalLB (a load-balancer implementation for bare metal) or route traffic through an external reverse proxy. Setting up a reliable load balancer is more complex on-prem, and network configuration (subnets, VLANs, firewall rules) requires more manual work. There is no "one-click" ingress: you manage the ingress controller (NGINX/Traefik) and the underlying network infrastructure yourself.
- Availability & scalability: In the cloud, datacenters are redundant and you can add nodes relatively easily. On-premises you are limited to your own hardware. The risk of downtime is often higher on-prem, because large cloud providers have redundancy better organized. You must plan for capacity and scale: ensure enough extra compute and storage headroom, because adding a new node can take weeks instead of seconds in the cloud. Hardware failure is your responsibility; you need monitoring to detect a failed server and bring in a replacement.
- Upgrades and management: You perform Kubernetes upgrades yourself, including etcd and the control plane. This requires expertise and carries risk. Cluster upgrades should be well tested, usually via a staging cluster. Backups of etcd, cluster config, and persistent data are your task on-prem; good off-site backups are a must to survive events such as fire or outage. Nothing is "managed"—you need a stronger DevOps team to keep the platform running.
- Integrations: Functionality that comes automatically or as a service in the cloud (monitoring, logging, SSL, IAM) must be built on-prem with open-source tools or enterprise solutions. Think of your own Prometheus/Grafana stack for monitoring, Cert-Manager for SSL certs, Velero or Stash for backups, etc. This is doable, but requires effort and maintenance.
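As an example of the extra on-prem work, exposing Services of type LoadBalancer with MetalLB takes roughly this configuration. The address range is a placeholder for a free block on your own network:

```yaml
# MetalLB layer-2 mode; assumes MetalLB is installed in the metallb-system
# namespace and that the address range below is unused on your LAN.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: wordpress-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.10.240-192.168.10.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: wordpress-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - wordpress-pool
```

In the cloud this entire step is replaced by the provider creating a load balancer for you.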
Conclusion: A managed cloud Kubernetes platform removes much of the infrastructure burden, making it simpler to fit WordPress' stateful components. On-prem is certainly possible, but it requires careful design and extra tooling for storage, networking, and reliability. The choice between cloud and on-prem depends on your situation (compliance, cost, expertise). In both cases the fundamental challenges (persistent data, WP scalability, etc.) must be solved, but the means differ.
Tooling for deployment and lifecycle: Helm, GitOps, operators
Helm charts
There are ready-made Helm charts (packages) for WordPress, such as the popular Bitnami/Bitpoke
WordPress chart. With a few commands you can deploy a working WP + database in your cluster. This is
an excellent kickstart—the chart automatically creates Deployments, Services, PVCs, etc. Bitnami's
chart, for example, configures a MariaDB database and a volume for WordPress data out of the box.
However, Helm only gets you to a running installation. The fundamental issues remain: the chart will
create a PersistentVolumeClaim for the /bitnami/wordpress directory, but you must ensure that the
storage class is reliable. Upgrading WordPress via Helm means bumping the chart version or updating
an image tag; but if a user adds a plugin via the admin in the meantime, it lives on the PVC and not
in your Helm manifest. That drift between cluster state and chart is a concern. Helm itself provides
no solution for file-level changes or database migrations—that's up to your procedures. In short,
consider Helm a useful deployment template; it makes installation reproducible and parameterizable,
but you still need to handle Day-2 operations (backups, updates, recoveries) yourself.
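As an illustration, a values file for a chart like Bitnami's might pin down the points discussed above. The key names follow that chart's conventions but should be checked against the chart version you actually use:

```yaml
# Sketch of Helm values; verify key names against your chart version.
replicaCount: 2
persistence:
  enabled: true
  storageClass: efs-shared       # illustrative RWX-capable storage class
  accessModes:
    - ReadWriteMany
mariadb:
  enabled: false                 # don't run the database in-cluster...
externalDatabase:
  host: wordpress-db.example.internal   # ...point at a managed/external MySQL instead
  user: wordpress
  database: wordpress
  existingSecret: wordpress-db-credentials
```

Committing this file to Git gives you the reproducible, parameterized installation; the Day-2 concerns remain yours.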
GitOps (e.g. Argo CD, Flux)
GitOps brings the principle that your cluster configuration in Git is the source of truth. This works
very well for Kubernetes resources: your Deployment, Service, and Ingress definitions live in a Git
repo, and a tool like ArgoCD ensures the cluster always matches Git. For WordPress on K8s this is
very useful to version your Helm release and configuration, so changes (new env vars, scaling, chart
updates) go through controlled pull requests. The limitation, however, is WordPress' internal state.
Things like database contents or uploaded files can't be versioned in Git—they are runtime data.
GitOps therefore covers the infrastructure side (containers, ConfigMaps, PVC claims), but WP-specific
changes made via the UI are outside Git. If you add a plugin via CI (new image), that change is
GitOps-controlled; but if an admin changes a setting in WP Admin or uploads an image, it happens in the
database and PV, not in Git. GitOps may at most signal that a Kubernetes object has changed (e.g. a
PVC status), but it doesn't synchronize that data. In practice you use GitOps for all declarative
aspects (infrastructure as code), but you still need standard WordPress backup/restore procedures for
content. A best practice is to automate/version WordPress configuration (wp-config.php settings,
which plugins are active, etc.) as much as possible (e.g. via environment variables or config files),
so recovery is easier and manual drift is reduced.
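The GitOps flow can be sketched with an Argo CD Application that tracks a chart and values file in Git. The repository URL and paths are placeholders:

```yaml
# Illustrative Argo CD Application; repoURL and paths are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: wordpress
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/wordpress-config.git
    targetRevision: main
    path: charts/wordpress
    helm:
      valueFiles:
        - values-production.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: wordpress
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual changes to tracked objects
```

Note that selfHeal only reverts drift in Kubernetes objects; content changes in the database or on the uploads volume are invisible to it, which is exactly the limitation described above.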
Kubernetes operators
An operator is a piece of controller software that brings domain-specific knowledge into the
cluster. For WordPress there are operators (e.g. the open-source Presslabs/Bitpoke WordPress
Operator) that promise to manage WordPress sites as first-class citizens. Such an operator can
introduce a custom resource WordPressSite, after which you only need to declare "I want a WP site
with this name, this many replicas, this version", and the operator handles creating the Deployment,
PVC, Service, database, etc. In theory operators can also automate day-2 tasks such as backups,
coordinating updates, and scaling based on metrics. It sounds like a panacea, but there are caveats:
- Abstraction leaks: Many operators do not cover every aspect. For example, a WP operator can automate app deployment, but the specific storage backend or MySQL tuning is still your concern. An operator removes complexity for standard operations, but complex scenarios or incidents still require knowledge of Kubernetes and WordPress. You remain responsible for performance optimization and troubleshooting when things go wrong.
- Maturity and support: Operators vary in quality and maturity. The Presslabs WordPress operator was introduced a few years ago and offered impressive capabilities, but users noted it was still in a pre-alpha stage for their requirements at the time. Some operators are community-driven with limited support. You are therefore somewhat dependent on the maintainers for updates and bug fixes. If you rely on this for business-critical workloads, assess its viability (number of maintainers, update cadence).
- Lock-in via CRDs: When you use an operator, you introduce Custom Resource Definitions that are specific to that operator. If you build your entire workflow around a WordPress CRD, you're tied to that operator for management. Switching approaches later may require migrating from that CRD to another (or back to "plain" Helm). This isn't insurmountable, but it's something to consider. You lock part of your platform into the logic of a third-party tool.
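To make the model concrete: a site declaration for such an operator might look roughly like this. The API group and field names here are invented for illustration and match no specific operator's schema:

```yaml
# Hypothetical custom resource; not the exact schema of any real operator.
apiVersion: wordpress.example.org/v1alpha1
kind: WordPressSite
metadata:
  name: company-blog
spec:
  replicas: 2
  version: "6.5"
  domains:
    - blog.example.com
  database:
    externalHost: wordpress-db.example.internal
```

The operator's controller would reconcile this into the Deployment, Service, PVC, and database resources it implies.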
Nevertheless, operators can be valuable, especially in larger hosting platforms. In practice you often see a combination: for example a WordPress operator in tandem with a MySQL operator (e.g. from Presslabs) so both app and database are managed. Add a Cert-Manager operator for SSL, and so on. This forms an ecosystem of operators that together automate a lot of work. Presslabs (now Bitpoke) assembled a full open-source stack like this. It comes down to investing in a Kubernetes-native way of managing WordPress. If you can make that investment and need the benefits (faster scaling, self-healing, standardization), it's certainly worth exploring. For many teams, however, a simpler setup with Helm charts and good CI/CD pipelines is already enough to run WordPress reliably on Kubernetes without immediately adding the complexity of custom operators.
Recommended baseline setup for WordPress on Kubernetes
In summary, we translate the insights above into a practical standard approach for WordPress on Kubernetes:
- Single-site deployment: Start with one WordPress site per namespace or deployment. This minimizes overlapping impact. Use a Kubernetes Deployment for the WordPress PHP application so you can scale easily and roll out updates via rolling updates.
- 1–3 replicas for web pods: Run WordPress pods in replication (e.g. 2 pods) for high availability and to distribute read traffic. Set up a Kubernetes Ingress or Service (LoadBalancer) to balance traffic across pods. Note that with >1 replica, session affinity is usually not a big issue (WordPress does not use server-side sessions for logged-in users by default), but if you have plugins that use sessions/local cache, consider an external store or sticky sessions.
- External database: Use a separate MySQL/MariaDB instance outside the cluster. Ideally this is a managed cloud database service for reliability. If that's not possible, run the database on a dedicated VM or even as a single K8s StatefulSet with a persistent volume—but always ensure a backup strategy. The core point is that DB data must not be lost during cluster events; keeping it external decouples app updates and scaling from the data layer.
- Persistent storage for uploads (or object storage): For media files and other uploads, configure a PersistentVolumeClaim bound to a durable storage class. In the cloud you can choose a shared filesystem service (e.g. AWS EFS, Azure Files) so multiple pods can access the files at once. This allows horizontal scaling without duplicating uploads. Alternatively, use a plugin that stores media directly in object storage (S3, GCS) so WordPress pods don't need a shared filesystem for media, which simplifies the architecture. In both cases, wp-content/uploads remains durably stored outside the container.
- Immutable application image: Build a Docker image with WordPress and required plugins/themes preinstalled (via Composer or WP-CLI during build, for example). Enforce a policy that containers are not writable for code (set DISALLOW_FILE_MODS). Plugin updates then go through a new image build + deploy. This prevents configuration drift and makes rollback easier (you know exactly which code runs in a given image).
- Helm and GitOps for deployments: Use a Helm chart (if needed, the Bitnami WordPress chart as a base) to declaratively define all Kubernetes resources. Parameterize items like database credentials, storage class name, replica count, etc. Commit the Helm release to Git (e.g. values YAML). Then use a GitOps tool like Argo CD/Flux to keep the cluster consistent with Git. This makes changes (e.g. a WordPress version update via image tag) traceable and controlled.
- Caching and performance: Consider a Redis (or Memcached) deployment for object caching, especially if the site runs many DB queries. This can run inside Kubernetes or externally. A caching layer benefits all replicas and is almost required for multisite or heavy sites. Likewise, a CDN for static assets is recommended in production to reduce load on the pods.
- Monitoring & backups: Implement monitoring at both the Kubernetes cluster level (node/pod metrics) and at the WordPress level (uptime and performance checks). Tools like Prometheus/Grafana can be integrated (some Helm charts have hooks for Prometheus metrics). Ensure backups are set up: database backups (automated snapshots or export via tools like mysqldump or a MySQL operator's backup feature) and file backups (e.g. sync the uploads PV to cloud storage or use a backup operator like Stash). Test restore procedures to ensure you can recover a site during an incident.
- Be cautious with multisite: If you still want to run WordPress Multisite on Kubernetes, apply all of the above even more strictly. Use one shared DB with backups, RWX storage for everything under /wp-content/uploads/sites/, and intensive testing for updates. Keep in mind that scalability benefits are limited—ultimately it's still one WP installation you scale, not multiple independent apps. Multisite on K8s is most useful when sites are closely related and you value central control over maximum isolation.
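Putting the web-tier points above together, the core Deployment might look roughly like this. The image, secret, and claim names are illustrative:

```yaml
# Condensed sketch of the web tier; names and the custom image are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 2                       # HA for the stateless PHP tier
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: registry.example.com/wordpress-custom:1.4.0   # plugins baked into the image
          env:
            - name: WORDPRESS_DB_HOST
              valueFrom:
                secretKeyRef:
                  name: wordpress-db-credentials   # points at the external managed MySQL
                  key: host
          volumeMounts:
            - name: uploads
              mountPath: /var/www/html/wp-content/uploads
      volumes:
        - name: uploads
          persistentVolumeClaim:
            claimName: wordpress-uploads    # RWX claim shared by all replicas
```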
With this approach we have a robust baseline: we combine the best of Kubernetes (self-healing, scalability, declarative config) with respect for the nature of WordPress (stateful data stored externally/persistently). The result is a WordPress setup that works "normally" for end users, while behind the scenes you benefit from container orchestration. As long as you keep the limitations (complexity, cost) in mind and choose the right tools, WordPress on Kubernetes can be a successful and manageable choice for modern hosting needs.