Kubernetes 1.36 quietly buried one of the worst RBAC patterns in production

Fine-grained kubelet API authorization graduated to GA in Kubernetes 1.36. The release notes call it "more precise access control." What it actually does is retire nodes/proxy as one of the worst RBAC patterns in production: a single permission that monitoring tools demanded by default, that bypassed audit logging and admission control, and that public research showed could be turned into a node-level RCE with a GET request alone.

Kubernetes 1.36 shipped on April 22 with fine-grained kubelet API authorization graduating to GA. The release notes describe it as "more precise, least-privilege access control over the kubelet's HTTPS API." That undersells it. For the past five years, every Prometheus exporter, every Datadog Agent, every observability platform that needed to read metrics or pod state from a kubelet was granted the nodes/proxy permission. That single RBAC subresource also grants the ability to execute arbitrary commands inside any pod on the node. Half the cluster operators in production never noticed. The other half noticed and shrugged, because the alternative was building custom admission policy.

KEP-2862 finally retires that pattern. This post is what changed, why nodes/proxy was a privilege-escalation primitive in the first place, and what to do with the bindings already in your cluster.

TL;DR

  • nodes/proxy is the RBAC subresource every monitoring tool requested. It granted full kubelet API access, including exec into any pod on the node, and bypassed Kubernetes audit logging and admission control.
  • Public research in January 2026 showed nodes/proxy GET alone is enough for a remote-code-execution chain. The Kubernetes Security Team confirmed this is "working as intended" and declined to issue a CVE.
  • KEP-2862 splits the kubelet authorization map into eight fine-grained subresources: nodes/stats, nodes/metrics, nodes/log, nodes/pods, nodes/healthz, nodes/configz, nodes/spec, nodes/checkpoint.
  • The feature gate is locked on in 1.36. Workloads still bound to nodes/proxy keep working via fallback. The migration burden is yours.
  • This kills the demand for nodes/proxy in monitoring stacks. It does not delete the permission. Tooling that needs exec or run still goes through it.

What nodes/proxy actually grants

Every kubelet exposes an HTTPS API on port 10250. That API surfaces metrics, logs, pod state, and the runtime control endpoints (exec, run, attach, port-forward) used by kubectl exec. Until 1.36, the API server's authorization map collapsed almost everything behind that API into a single RBAC subresource: nodes/proxy.

Granting nodes/proxy to a workload means granting access to the kubelet API. That includes:

  • Reading metrics from /stats/* and /metrics/*. This is what monitoring tools wanted.
  • Reading container logs from /logs/*. This is what log shippers wanted.
  • Running an arbitrary command in any container on the node via /exec/*. This is what nobody admitted they were granting.

The Kubernetes documentation has called this out for years. Under RBAC good practices, the project warns that users with nodes/proxy "have rights to the Kubelet API, which allows for command execution on every pod on the node(s) to which they have rights. This access bypasses audit logging and admission control." Read that twice. A workload bound to nodes/proxy does not show up as an exec event in your audit log, and your ValidatingAdmissionPolicy is not consulted before the command runs.

In other words: granting nodes/proxy to a Helm chart was equivalent to giving that chart privileged-container-equivalent reach across every node it could see, with the audit pipeline turned off for the trip.

The GET that was already an RCE

For most of the past five years, the operational consensus was that nodes/proxy with only get was effectively read-only. Many security teams put create, update, and delete behind admission policy and waved through get/list/watch as low-risk. That assumption collapsed in January 2026.

Graham Helton documented that the kubelet authorizes the WebSocket handshake on the initial GET request, not on the actual exec operation that follows. Once the connection is upgraded, the server happily accepts a command= parameter and runs it inside the target container. The exploit is a single request: wss://$NODE_IP:10250/exec/$NS/$POD/$CONTAINER?command=.... Sweet Security published the same finding independently, with a chain that ended in kube-apiserver and etcd access. Nirmata tracked the issue through Kubernetes' HackerOne program. The Kubernetes Security Team's response was that this is "working as intended." No CVE was assigned.

The reception in the security community was harsher. Datadog's Stratus Red Team catalog lists k8s.privilege-escalation.nodes-proxy as a standard adversary technique under MITRE Privilege Escalation. Aqua Security's walk-through demonstrates the full chain from a nodes/proxy-bound ServiceAccount to root inside a target pod. And Stream Security audited public Helm charts and reported nodes/proxy requested by 69 widely deployed charts, including Prometheus, Grafana Promtail, the Datadog Agent, Elastic Agent, New Relic Infrastructure, the OpenTelemetry Operator, and the Trivy Operator.

The conclusion the security industry settled on was the right one: "Many security teams consider GET permissions read-only and safe to grant. This assumption is wrong."

What KEP-2862 actually does

KEP-2862 took a slow path: alpha in 1.32, beta in 1.33, GA in 1.36. With graduation, the KubeletFineGrainedAuthz feature gate is locked on in both the kubelet and the kube-apiserver.

The substantive change is the authorization map. Eight new subresources now sit between RBAC and the kubelet API:

Kubelet path            New subresource     Old catch-all
/stats/*                nodes/stats         nodes/proxy
/metrics/*              nodes/metrics       nodes/proxy
/logs/*                 nodes/log           nodes/proxy
/pods, /runningpods/    nodes/pods          nodes/proxy
/healthz                nodes/healthz       nodes/proxy
/configz                nodes/configz       nodes/proxy
/spec/*                 nodes/spec          nodes/proxy
/checkpoint/*           nodes/checkpoint    nodes/proxy

A monitoring agent that previously needed nodes/proxy for /metrics now needs nodes/metrics and nothing else. A log shipper needs nodes/log and nothing else. The new subresources cover exactly the endpoints monitoring and observability workloads were already hitting, which is what retires nodes/proxy from those stacks.

What did not get its own subresource: /exec, /run, /attach, and /portForward. These remain gated by nodes/proxy on purpose. The KEP's strategy is demand reduction, not removal. Workloads that legitimately need to exec into pods (debugging tools, break-glass agents) still ask for nodes/proxy. The point is that those workloads should be a small, audited set, not the entire monitoring fleet.
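
If you do keep a break-glass role, scope it deliberately. A hedged sketch (the role name is illustrative; the kubelet authorizes GET requests against the get verb, which is how the exec WebSocket upgrade arrives):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: break-glass-kubelet-exec   # illustrative name, not a built-in role
rules:
- apiGroups: [""]
  resources: ["nodes/proxy"]       # still gates /exec, /run, /attach, /portForward
  verbs: ["get", "create"]         # GET covers the WebSocket upgrade; create covers POST paths
```

Bind it to one dedicated ServiceAccount, keep the binding out of shared charts, and alert on any use of that identity.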

Backward compatibility is automatic. When the kubelet receives a request, it first issues a SubjectAccessReview against the new fine-grained subresource. If that is denied, it falls back to checking nodes/proxy. Existing ClusterRoles continue to work. The migration is opt-in on your timetable.
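
The fallback decision can be modeled in a few lines. This is an illustrative sketch of the two-step check, not real kubelet code:

```shell
# Model of the kubelet's authorization order: the fine-grained
# SubjectAccessReview is tried first, then legacy nodes/proxy as fallback.
authorize() {
  fine_grained=$1   # result of the new subresource check: allow/deny
  legacy=$2         # result of the nodes/proxy check: allow/deny
  if [ "$fine_grained" = "allow" ]; then
    echo "allow (fine-grained subresource)"
  elif [ "$legacy" = "allow" ]; then
    echo "allow (nodes/proxy fallback)"
  else
    echo "deny"
  fi
}

authorize deny allow    # unmigrated workload: still allowed via fallback
authorize allow deny    # migrated workload: no nodes/proxy needed
authorize deny deny     # neither binding: denied
```

In a live cluster you can observe both checks directly with kubectl auth can-i get nodes/metrics --as=system:serviceaccount:monitoring:prometheus and the same query against nodes/proxy.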

Migrating away from nodes/proxy

The 1.36 upgrade itself does not change anything in your bindings. The migration is yours to drive.

Audit first. The fastest way to enumerate workloads that currently lean on nodes/proxy:

kubectl get clusterroles -o json \
  | jq '[.items[]
      | select(.rules[]?
          | .resources[]? == "nodes/proxy")
      | .metadata.name]
    | unique'

Cross-reference each ClusterRole against the bindings that reference it. Tools like kubectl-who-can or rbac-lookup make this a one-liner: kubectl-who-can get nodes/proxy.
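
The cross-reference itself is a jq one-liner too. A minimal sketch, with inline sample JSON standing in for the output of kubectl get clusterrolebindings -o json:

```shell
# Sample standing in for: kubectl get clusterrolebindings -o json
bindings='{"items":[
  {"metadata":{"name":"prometheus-binding"},
   "roleRef":{"kind":"ClusterRole","name":"prometheus"},
   "subjects":[{"kind":"ServiceAccount","name":"prometheus","namespace":"monitoring"}]},
  {"metadata":{"name":"view-binding"},
   "roleRef":{"kind":"ClusterRole","name":"view"},
   "subjects":[{"kind":"Group","name":"readers"}]}]}'

# List bindings (and their subjects) that reference a flagged role
echo "$bindings" | jq -r --arg role "prometheus" \
  '.items[]
   | select(.roleRef.name == $role)
   | "\(.metadata.name): \(.subjects[].kind)/\(.subjects[].name)"'
```

Run once per ClusterRole your audit flagged; every subject that comes back is a workload to migrate.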

Then replace. A typical Prometheus-style ClusterRole that used to read:

rules:
- apiGroups: [""]
  resources: ["nodes/proxy"]
  verbs: ["get", "list", "watch"]

becomes:

rules:
- apiGroups: [""]
  resources: ["nodes/metrics", "nodes/stats"]
  verbs: ["get", "list", "watch"]

The exact subresources depend on which kubelet endpoints the workload actually hits. For a Datadog Agent or node-exporter equivalent, nodes/metrics and nodes/stats cover the metrics scrape; add nodes/log if you also ship logs. Vendor charts are catching up at different speeds: Datadog opened helm-charts issue #2338 in late January with a tracking PR, and similar work is in flight across the observability ecosystem. Until your vendor ships updated defaults, you can override the chart's RBAC values yourself.

To stop the drift, lock it down at the cluster boundary. ValidatingAdmissionPolicy, GA since Kubernetes 1.30, can flag any new ClusterRole or ClusterRoleBinding that grants nodes/proxy to a non-allowlisted ServiceAccount. The same pattern works through Kyverno or OPA Gatekeeper if those are already in your stack. I covered the broader admission-policy story in my Kyverno CNCF graduation post.
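
A hedged sketch of such a policy follows. The names are illustrative, and you would extend the match rules and exemptions for your own allowlist; the CEL expression rejects any ClusterRole whose rules mention nodes/proxy:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: restrict-nodes-proxy        # illustrative name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["rbac.authorization.k8s.io"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["clusterroles"]
  validations:
  - expression: >-
      !has(object.rules) ||
      object.rules.all(r,
        !has(r.resources) || !('nodes/proxy' in r.resources))
    message: "nodes/proxy grants must go through the security-review allowlist"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: restrict-nodes-proxy-binding
spec:
  policyName: restrict-nodes-proxy
  validationActions: ["Warn"]       # start in Warn; switch to Deny once the audit is clean
```

Starting in Warn mode surfaces offenders in API responses and audit annotations without breaking existing deploy pipelines.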

What this does not fix

The honest section, because this is real progress and not a complete fix.

nodes/proxy still exists as a subresource in 1.36, and it is not deprecated. The kubelet's exec, run, and attach endpoints are still authorized against it, including when the API server proxies kubectl exec, which is why system:kubelet-api-admin and similar built-in roles continue to hold it. Removing it is not on the roadmap, because removing it would break legitimate cluster-administration workflows.

The migration burden sits with you, not with the upgrade. A 1.36 cluster running unchanged ClusterRoles is exactly as exposed as a 1.35 cluster. The protection only kicks in when those bindings are rewritten. In a multi-tenant cluster the work is bigger; I wrote about the broader question of who owns what in those clusters in my multi-tenant Kubernetes governance post.

And the underlying issue, that authorizing on the GET handshake of a WebSocket upgrade gives the requester the rest of the connection, is a class of bug that the Kubernetes Security Team has explicitly decided not to address as a CVE. The fine-grained subresources mean fewer workloads need that handshake. They do not change what the handshake itself authorizes.

The companion piece to this story is user namespaces graduating to stable in the same release. Both ship in 1.36. Both reduce blast radius for the same class of incident: a compromised workload that previously inherited node-level reach. Together they are the most operationally useful security work the project has shipped in two releases. They are also both opt-in.

Key takeaways

  • nodes/proxy was a privilege-escalation primitive in every cluster you have ever audited. Public research from January 2026 showed it could be turned into a node-level RCE with a single GET request, and the Kubernetes Security Team treats that as expected behavior.
  • Kubernetes 1.36 ships eight fine-grained kubelet subresources (nodes/stats, nodes/metrics, nodes/log, nodes/pods, nodes/healthz, nodes/configz, nodes/spec, nodes/checkpoint) that cover every monitoring and observability use case.
  • Existing nodes/proxy bindings keep working because the kubelet falls back to the old SubjectAccessReview when the new one is denied. The upgrade does nothing for security on its own.
  • Audit your ClusterRoles, replace nodes/proxy with the minimum subresource set, and add an admission policy that flags new nodes/proxy grants going forward.
  • This kills the demand for nodes/proxy from monitoring agents. It does not remove the permission. Workloads that genuinely need exec access still go through it, which is the point.

