FinOps for Kubernetes: when "it works" isn't enough

Most Kubernetes clusters that 'just work' are quietly burning money. Average CPU utilization sits at 10%. This article covers where the waste hides, why EU cloud providers change the math, and which tools give you cost visibility without a six-month FinOps program.

Your Kubernetes cluster is running. Deployments roll out, pods get scheduled, alerts are quiet. Everything works. That should be enough, right?

It is not. The CAST AI 2025 Kubernetes Cost Benchmark Report analyzed over 2,100 organizations and found that the average CPU utilization across Kubernetes clusters is 10%. Memory: 23%. Datadog's State of Cloud Costs puts it differently: 83% of container spend is associated with idle resources. And the CNCF's own FinOps microsurvey found that 49% of organizations saw their cloud costs increase after adopting Kubernetes.

The clusters "work." The bills work too. They just work against you.

TL;DR

  • Average Kubernetes CPU utilization is 10%, memory 23%. Most clusters are paying for compute they never touch.
  • The scheduler allocates based on resource requests, not actual usage. Over-generous requests lock capacity nobody uses.
  • EU cloud providers (Hetzner, OVHcloud, STACKIT) cut node costs by 50–70% compared to hyperscalers, but bring trade-offs around managed services and spot availability.
  • Start with VPA in recommendation-only mode and OpenCost for visibility. Both are free, low-risk, and give you data within a week.
  • FinOps is not a tooling exercise. It is a shift in how teams think about resource requests before deployment.

Where the money disappears

The waste is not dramatic. Nobody spins up a 64-core node and walks away. It is cumulative, spread across every deployment in the cluster, and invisible without the right instrumentation.

Three patterns dominate:

Over-provisioned resource requests. Teams copy-paste resource requests from Stack Overflow or set them "high to be safe." The gap between requested and actually used CPUs averages 40% across the industry. That 40% is capacity the scheduler reserves but nobody touches.

Idle nodes and abandoned workloads. Dev and staging clusters spin up for a project, run for months after the project ends, and nobody notices. CronJobs and batch Jobs are particularly wasteful: 60–80% of their allocated resources go unused on average.

Orphaned storage. PersistentVolumes and PVCs that outlive the workloads that created them. Cloud storage bills keep coming whether a pod mounts the volume or not.

The CNCF microsurvey identified the root causes: 70% of respondents blamed overprovisioning, 43% blamed resource sprawl, and 45% blamed a lack of awareness. That last number is the important one. People are not wasting money on purpose. They just cannot see it.

The scheduler is doing exactly what you told it to

This is the mechanism that makes Kubernetes cost waste structural, not accidental.

The Kubernetes scheduler places pods based on resource requests, not on actual utilization. If a container declares requests.cpu: "2000m" but typically uses 200m, the scheduler reserves two full CPUs on a node. Those CPUs are unavailable for other pods, even though 90% of the reservation sits idle.

Worse: pods without any resource requests at all get BestEffort QoS. The scheduler has no utilization data to work with, so bin-packing across nodes is a guess. And under memory pressure, BestEffort pods are the first to be evicted. So "I'll skip requests for now" is both a cost problem and a reliability problem.
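The mechanism is easiest to see in a manifest. A minimal sketch (workload name and image are placeholders, and the numbers are deliberately exaggerated): the scheduler reserves the full 2 CPUs declared here on some node, regardless of what the container actually uses.

```yaml
# Illustrative deployment snippet. The scheduler bin-packs on the
# declared requests, not on observed usage: 2000m is reserved per
# replica even if real usage hovers around 200m.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: example.org/api:latest   # placeholder image
          resources:
            requests:
              cpu: "2000m"       # reserved on the node, even if usage is ~200m
              memory: "1Gi"
            limits:
              memory: "1Gi"      # memory limit set; CPU limit intentionally omitted
```

With three replicas, this single deployment pins 6 CPUs of cluster capacity. Setting any requests at all also lifts the pods out of BestEffort QoS, so they are no longer first in line for eviction under memory pressure.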

Setting requests high seems safe. It guarantees your workload gets what it needs. But at the cluster level, the math is brutal. Ten deployments each over-requesting by 1.5 CPUs add up to 15 CPUs of phantom capacity. At hyperscaler on-demand rates of roughly $25–35 per vCPU per month, that is $375–525/month in compute you are paying for but not using. On a 20-node cluster, these small over-requests compound into thousands per month.

The fix is not "set lower requests." It is "set informed requests, based on actual usage data." That is where VPA and measurement tools come in (more on those below).
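What "informed" looks like in practice, as a minimal sketch: a VerticalPodAutoscaler in recommendation-only mode. This assumes the VPA components are installed in the cluster, and the target Deployment name is a placeholder.

```yaml
# VPA that observes a workload and computes recommendations,
# but never evicts or mutates pods (updateMode: "Off").
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api            # hypothetical deployment to observe
  updatePolicy:
    updateMode: "Off"    # recommend only; no automatic changes
```

After a week or two of traffic, `kubectl describe vpa api-vpa` shows lower-bound, target, and upper-bound recommendations per container, which you can compare against the requests you actually set.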

The EU pricing advantage (and its trade-offs)

If you are running Kubernetes in Europe, the choice of cloud provider is the single largest cost lever you have. The pricing gap between EU-native providers and US hyperscalers is not marginal. It is structural.

A concrete comparison for a 4 vCPU / 16 GB dedicated node:

Provider                   Monthly node cost   Control plane   Egress
Hetzner CCX23              ~€31.49             Self-managed    20 TB/mo free
OVHcloud B2-15             ~€28                Free            Free
AWS m6i.xlarge (EKS)       ~$133               $73/month       $0.09/GB
GCP e2-standard-4 (GKE)    ~$97                Free (zonal)    $0.08/GB

For a 10-node cluster, Hetzner dedicated nodes cost roughly €315/month. The equivalent on AWS EKS: around $1,400/month including the $73 control plane fee, before egress. That is not a rounding error.

But the trade-offs are real:

Hetzner does not offer a managed Kubernetes control plane. You run k3s, kubeadm, or use a third-party managed solution like Syself or Cloudfleet. No spot instances either. The savings come from raw compute pricing and included egress, not from managed service convenience.

OVHcloud does offer a managed Kubernetes service with a free control plane, free inter-service traffic, and reasonable node pricing. It is the more "batteries-included" EU option.

STACKIT (the cloud platform of Schwarz Group, the parent company of Lidl and Kaufland) offers a managed Kubernetes Engine with data centers in Germany and Austria. The sovereignty angle here is real: STACKIT is fully EU-owned and operated, with no US parent company subject to the CLOUD Act. For organizations in regulated industries (finance, healthcare, government), that distinction increasingly matters legally, not just philosophically.

One concrete trade-off of choosing EU-native providers: spot instances are largely unavailable. Hetzner and STACKIT do not offer spot pricing. On AWS, spot instances can save 60–90% on compute. If your workloads tolerate preemption, that is a lever EU providers simply do not have. The base price is lower, but the floor is also the ceiling.

My position: for teams running 3–20 nodes with predictable workloads, the EU pricing advantage outweighs the managed-service trade-offs. For teams with highly variable loads that benefit from spot and Karpenter-style dynamic instance selection, AWS or GCP still makes sense. The right answer depends on your workload shape.

Shift-left: cost awareness before deployment

The FinOps Foundation's State of FinOps 2026 report found that pre-deployment architecture costing was the second most-requested tooling capability among practitioners. The idea is straightforward: if you know what something will cost before you deploy it, you make different decisions.

In Kubernetes, shift-left means three things:

  1. Require resource requests on every workload. Use a LimitRange to inject defaults when teams forget, and a ResourceQuota per namespace to cap total consumption. An admission policy (Kyverno or OPA Gatekeeper) can block deployments that skip requests entirely. If you are already using Kyverno for namespace bootstrapping, adding a "require resource requests" policy is a natural extension. This is table-stakes governance.

  2. Base requests on actual usage, not guesses. Run VPA in recommendation-only mode for 7–14 days, read the recommendations, and adjust. The FinOps Foundation's "Calculating Container Costs" working group recommends using p95 historical usage with 20–30% headroom as the baseline for CPU requests.

  3. Make costs visible per team. Namespace-level cost allocation, even as a read-only dashboard, changes behavior. Only 14% of organizations do full chargeback (moving money between budgets). Most do showback: making costs visible without consequences. That alone is often enough to trigger right-sizing.
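The guardrails from step 1 can be sketched in two small manifests. This is an illustration, not a recommendation: the namespace name and all values are assumptions you would replace with numbers from your own usage data.

```yaml
# Illustrative per-namespace guardrails: defaults injected when a
# workload omits requests, plus a hard cap on total consumption.
apiVersion: v1
kind: LimitRange
metadata:
  name: defaults
  namespace: team-a          # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container declares no requests
        cpu: "100m"
        memory: "128Mi"
      default:               # applied when a container declares no limits
        cpu: "500m"
        memory: "512Mi"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-cap
  namespace: team-a
spec:
  hard:
    requests.cpu: "8"        # total CPU the namespace may request
    requests.memory: "16Gi"  # total memory the namespace may request
```

The LimitRange keeps forgotten requests from landing as BestEffort pods; the ResourceQuota turns "how much can this team consume" into an explicit, reviewable number.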

At KubeCon EU 2026 in Amsterdam, Anton Weiss pointed out that FinOps does not even appear as a focus area in the platformengineering.org 2025 survey. Platform teams do not see cost optimization as their job. That blind spot is exactly why clusters with no cost visibility keep growing.

Tools to start with today

You do not need a FinOps platform or a six-month maturity program. Two free, open-source tools cover the first 80% of visibility:

OpenCost is a CNCF Incubating project (promoted from Sandbox in October 2024). It provides vendor-neutral, real-time cost allocation by namespace, deployment, pod, and label. It integrates with AWS, GCP, and Azure billing APIs and exports Prometheus metrics. For DIY-minded teams on EU providers, you can feed it custom node pricing. IBM's acquisition of Kubecost in September 2024 validated this space, but OpenCost remains the vendor-neutral starting point.

Fairwinds Goldilocks wraps VPA in recommendation-only mode with a dashboard. Label a namespace with goldilocks.fairwinds.com/enabled: "true", wait a week, and Goldilocks shows you which workloads are over-requesting ("too large"), under-requesting ("too small"), or right-sized ("just right"). It covers Deployments, StatefulSets, and DaemonSets. The gap between "too large" recommendations and current requests is your waste, quantified.
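Opting a namespace into Goldilocks is a one-line label, shown here as a manifest (the namespace name is a placeholder):

```yaml
# Namespace opted into Goldilocks: it will create recommendation-only
# VPAs for the workloads running here.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a               # hypothetical namespace
  labels:
    goldilocks.fairwinds.com/enabled: "true"
```

Equivalently, on an existing namespace: `kubectl label namespace team-a goldilocks.fairwinds.com/enabled=true`.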

For infrastructure-as-code pipelines, Infracost adds cost estimates to Terraform pull requests. Engineers see the monthly cost delta of every infrastructure change before it merges. It is shift-left at its most literal.

For teams on hyperscalers wanting automated optimization, CAST AI and Karpenter (AWS-native) handle dynamic instance selection and spot management. But these are optimization tools. They work best after you have established visibility. Start with seeing the problem before you automate the fix.

When FinOps is overkill

Not every cluster needs a cost practice. If you are running a 3-node cluster on Hetzner at €95/month total, the potential savings from right-sizing might be €20/month. The time spent instrumenting, analyzing, and adjusting will not pay for itself.

FinOps makes sense when:

  • Your monthly Kubernetes spend exceeds €500–1,000 and you do not know which teams or workloads account for it
  • You are running on hyperscaler pricing where a 30% reduction is meaningful in absolute terms
  • You have multiple teams sharing a cluster with no resource governance
  • Your cluster has grown organically over 12+ months without a resource review

It does not make sense when your cluster is small, your team is small, and you have direct visibility into what runs where. At that scale, picking the right cloud provider (see the EU pricing section above) gives you more savings than any monitoring tool.

Key takeaways

  • The average Kubernetes cluster uses 10% of its provisioned CPU. If that number does not bother you, check your cloud bill.
  • Resource requests drive scheduler behavior. Uninformed requests create structural waste that scales linearly with your cluster.
  • EU cloud providers offer 50–70% lower node costs than hyperscalers. The trade-off is fewer managed services and no spot instances.
  • VPA in recommendation-only mode and OpenCost are free, low-risk starting points. You can have cost visibility within a week.
  • FinOps is not a tooling problem. It starts with making resource requests a deliberate decision, not a copy-paste from a tutorial.

Recurring server or deployment issues?

I help teams make production reliable with CI/CD, Kubernetes, and cloud—so fixes stick and deploys stop being stressful.

Explore DevOps consultancy
