Kubernetes TLS with cert-manager: automated certificate management

cert-manager automates TLS certificate issuance and renewal on Kubernetes using Let's Encrypt. This tutorial walks through every step: installing cert-manager via Helm, creating ClusterIssuers for staging and production, configuring HTTP-01 and DNS-01 challenges, issuing certificates for Ingress and Gateway API resources, and monitoring certificate expiry with Prometheus.

Learning goal

By the end of this tutorial you will have cert-manager running on a Kubernetes cluster, a staging and production ClusterIssuer backed by Let's Encrypt, at least one TLS certificate automatically issued and renewed, and Prometheus alerts watching for expiry. You will understand when to pick HTTP-01 over DNS-01, when to use Ingress annotations versus an explicit Certificate resource, and how to debug the entire ACME lifecycle.

Prerequisites

  • A Kubernetes cluster running v1.32 or later (cert-manager v1.20 supports Kubernetes 1.32 through 1.35)
  • kubectl configured with permissions to create namespaces, CRDs, and cluster-scoped resources
  • Helm 3.x installed locally
  • A publicly reachable domain name (for HTTP-01 challenges). DNS-01 works without public access but needs API credentials for your DNS provider.
  • Familiarity with Kubernetes Ingress resources or Gateway API, the routing layer that cert-manager integrates with. This tutorial focuses on the certificate automation layer.

What cert-manager does

cert-manager is a Kubernetes controller that adds Certificate, Issuer, ClusterIssuer, CertificateRequest, Order, and Challenge as Custom Resource Definitions. It watches these resources, talks to certificate authorities (Let's Encrypt, HashiCorp Vault, internal CAs, and others), and stores signed certificates in Kubernetes Secrets.

The lifecycle for Let's Encrypt looks like this:

  1. You (or an Ingress annotation) create a Certificate resource.
  2. cert-manager creates a CertificateRequest, then an ACME Order.
  3. For each domain in the order, cert-manager creates a Challenge (HTTP-01 or DNS-01).
  4. Let's Encrypt validates the challenge and issues the certificate.
  5. cert-manager stores tls.crt and tls.key in the target Secret.
  6. At roughly two-thirds of the certificate's 90-day lifetime (around day 60), cert-manager renews automatically.

That last point is the entire value proposition. Manually renewing Let's Encrypt certificates every 90 days across dozens of services is the kind of operational work that gets forgotten until something breaks at 2 AM.
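The renewal timing in step 6 is simple arithmetic. The sketch below illustrates the two-thirds rule in Python; it is not cert-manager's own code, and renewal_time and its parameters are hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before, not_after, renew_before=None):
    """Return the moment at which renewal becomes due.

    Without an explicit renew_before, renewal is due once two-thirds
    of the certificate lifetime has passed.
    """
    lifetime = not_after - not_before
    if renew_before is None:
        return not_before + lifetime * 2 / 3
    return not_after - renew_before


issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)

print(renewal_time(issued, expires))
# 2026-03-02 00:00:00+00:00  (day 60 of 90)
```

Passing renew_before=timedelta(hours=720) gives the same day for a 90-day certificate, because 30 days before expiry and two-thirds of the lifetime coincide.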

Installing cert-manager with Helm

cert-manager v1.20 (released March 10, 2026) is the current stable release. Install it from the OCI registry at quay.io/jetstack:

helm install cert-manager oci://quay.io/jetstack/charts/cert-manager \
  --version v1.20.0 \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true

The flags that matter:

  • --version v1.20.0 pins the chart version. Without it, Helm pulls the latest, which can surprise you during a production change window.
  • --set crds.enabled=true installs cert-manager's CRDs as part of the Helm release. The CRDs are intentionally retained on uninstall since v1.15 to prevent accidental data loss.

Checkpoint. Verify that the three cert-manager pods are running:

kubectl get pods -n cert-manager

Expected output (names will differ):

NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-6d4b6d6c96-xk7gn             1/1     Running   0          42s
cert-manager-cainjector-74fb68c89b-2j4qp   1/1     Running   0          42s
cert-manager-webhook-5c8f4b6d67-rl9tn      1/1     Running   0          42s

If you plan to use Gateway API instead of or in addition to Ingress, add --set config.enableGatewayAPI=true. Gateway API CRDs (v1.4.1 or later) must already be installed in the cluster before you enable this flag.

Never embed cert-manager as a sub-chart of another Helm chart. cert-manager manages cluster-scoped resources that conflict with Helm's ownership model, and the project explicitly warns against it.

Creating ClusterIssuers for Let's Encrypt

An Issuer is namespace-scoped; a ClusterIssuer is cluster-scoped. For Let's Encrypt, use ClusterIssuer so that any namespace can request certificates without duplicating issuer configuration.

Always create two: one for staging, one for production. The staging environment has relaxed rate limits and issues certificates from an untrusted staging root (for example "(STAGING) Pretend Pear X1"). Browsers will warn, but the ACME flow is identical to production. This is where you catch misconfigurations.

# staging-clusterissuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: devops@yourcompany.nl          # ACME account contact (Let's Encrypt stopped sending expiry emails in June 2025)
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx

# prod-clusterissuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: devops@yourcompany.nl
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx

Apply both:

kubectl apply -f staging-clusterissuer.yaml
kubectl apply -f prod-clusterissuer.yaml

Checkpoint. Verify both issuers are ready:

kubectl get clusterissuer

Expected output:

NAME                  READY   AGE
letsencrypt-staging   True    15s
letsencrypt-prod      True    10s

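If READY is False, inspect the events: kubectl describe clusterissuer letsencrypt-staging. A common cause is a network policy or egress firewall blocking outbound HTTPS from the cert-manager namespace to the ACME server (acme-staging-v02.api.letsencrypt.org for staging, acme-v02.api.letsencrypt.org for production).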

The privateKeySecretRef Secret is created automatically on first registration. If it is lost, cert-manager registers a new ACME account and can still issue new certificates, but you can no longer act as the old account, for example to revoke certificates it issued.

HTTP-01 challenge: the simple path

HTTP-01 is the default solver in the ClusterIssuers above. cert-manager provisions a temporary Pod and Ingress (or HTTPRoute) that serves a token at http://<domain>/.well-known/acme-challenge/<token>. Let's Encrypt fetches this over plain HTTP on port 80 to verify domain ownership.
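Before requesting a certificate, it can help to confirm that plain HTTP on port 80 actually reaches the cluster. The following is a minimal preflight sketch using only the Python standard library; challenge_url and probe_port_80 are hypothetical helper names, and the token is a made-up placeholder, not a real ACME token.

```python
import http.client

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def challenge_url(domain, token):
    """Build the URL that the ACME server fetches for an HTTP-01 challenge."""
    return "http://" + domain + CHALLENGE_PREFIX + token


def probe_port_80(domain, token="preflight", timeout=5.0):
    """Send one plain-HTTP GET to the challenge path without following redirects.

    Returns the HTTP status code, or None if nothing answered on port 80.
    """
    conn = http.client.HTTPConnection(domain, 80, timeout=timeout)
    try:
        conn.request("GET", CHALLENGE_PREFIX + token)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


print(challenge_url("dashboard.yourcompany.nl", "preflight"))
# http://dashboard.yourcompany.nl/.well-known/acme-challenge/preflight
```

For a made-up token, a 404 from probe_port_80 is a good sign: something on port 80 answered. None means the request never got a response, which points at DNS, a firewall, or a security group.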

When HTTP-01 works well:

  • Your domain resolves to the cluster's ingress controller
  • Port 80 is reachable from the internet (Let's Encrypt always makes the first validation request over plain HTTP on port 80; it follows redirects to HTTPS, but it never starts on port 443)
  • You do not need wildcard certificates

When HTTP-01 does not work:

  • Internal or private clusters with no public ingress
  • Wildcard certificates (*.yourcompany.nl). DNS-01 is the only ACME challenge type that validates wildcard ownership.
  • Port 80 is blocked by a firewall or cloud security group

One common mistake: not specifying ingressClassName in the solver. Without it, cert-manager creates challenge Ingress resources without a class annotation, which means every ingress controller in the cluster serves the challenge traffic. On clusters with multiple controllers, that causes unexpected load balancer creation and routing conflicts.

DNS-01 challenge: wildcards and private clusters

DNS-01 proves domain ownership by creating a TXT record at _acme-challenge.<domain>. Let's Encrypt verifies it via DNS, then cert-manager cleans up the record.
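To make the record concrete: RFC 8555 defines the TXT value as the unpadded base64url encoding of the SHA-256 digest of the key authorization, and for a wildcard the record sits on the base domain. The sketch below shows both derivations; the function names and the sample key authorization are invented for illustration, and cert-manager does all of this for you.

```python
import base64
import hashlib


def challenge_record_name(domain):
    """Return the TXT record name for a domain; a leading '*.' is dropped."""
    if domain.startswith("*."):
        domain = domain[2:]
    return "_acme-challenge." + domain


def txt_record_value(key_authorization):
    """Return base64url(SHA-256(key authorization)) without '=' padding."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


print(challenge_record_name("*.yourcompany.nl"))
# _acme-challenge.yourcompany.nl

# Made-up key authorization of the form "<token>.<account key thumbprint>".
print(len(txt_record_value("example-token.example-thumbprint")))
# 43
```

The wildcard and the apex share one record name, which is why a certificate for both *.yourcompany.nl and yourcompany.nl briefly needs two TXT values on the same name.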

cert-manager ships with built-in support for Route53, Cloudflare, Google Cloud DNS, Azure DNS, DigitalOcean, Akamai, ACMEDNS, and RFC-2136. Over 40 community webhook providers cover the rest.

Route53 (AWS)

Create an IAM policy scoped to TXT record mutations only:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "route53:GetChange",
      "Resource": "arn:aws:route53:::change/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets",
        "route53:ListResourceRecordSets"
      ],
      "Resource": "arn:aws:route53:::hostedzone/*",
      "Condition": {
        "ForAllValues:StringEquals": {
          "route53:ChangeResourceRecordSetsRecordTypes": ["TXT"]
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": "route53:ListHostedZonesByName",
      "Resource": "*"
    }
  ]
}

For authentication, EKS Pod Identity or IRSA are the recommended options. Both inject credentials automatically without storing long-term access keys. With Pod Identity or IRSA configured, the ClusterIssuer is minimal:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: devops@yourcompany.nl
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - dns01:
        route53: {}

An empty route53: {} block tells cert-manager to use ambient credentials. The region is inferred from AWS_REGION, which Pod Identity and IRSA inject automatically.

By default, ambient credentials (Pod Identity, IRSA, EC2 instance profile) only work for ClusterIssuer, not for namespace-scoped Issuer; the controller flag --issuer-ambient-credentials changes that. This catches people on EKS who try namespace isolation with IRSA and find DNS-01 challenges silently failing.

Cloudflare

Create an API Token (not the legacy Global API Key) with permissions: Zone > DNS > Edit and Zone > Zone > Read.

Store it in a Secret:

apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token-secret
  namespace: cert-manager
type: Opaque
stringData:
  api-token: cfl_your_actual_api_token_here

Reference it in the ClusterIssuer solver:

solvers:
- dns01:
    cloudflare:
      apiTokenSecretRef:
        name: cloudflare-api-token-secret
        key: api-token

Mixing solvers

A single ClusterIssuer can combine DNS-01 for specific zones and HTTP-01 as a fallback:

solvers:
- selector:
    dnsZones:
    - "internal.yourcompany.nl"
  dns01:
    route53: {}
- http01:
    ingress:
      ingressClassName: nginx

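cert-manager does not simply take the first match. For each domain it picks the most specific matching solver: a dnsNames match beats a dnsZones match, the longest matching zone beats shorter ones, and remaining ties go to the solver with more matching matchLabels and then to the one listed first. A solver without a selector matches every domain and acts as the catch-all.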
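The precedence described in cert-manager's API reference (dnsNames before dnsZones, most specific zone first, earlier solver on ties) can be modelled roughly as below. This is a simplified sketch, not cert-manager's implementation: pick_solver is a hypothetical name, matchLabels is ignored, and each solver is assumed to set at most one of dnsNames or dnsZones.

```python
def pick_solver(domain, solvers):
    """Return the index of the solver chosen for a domain, or None."""
    best_index, best_rank = None, None
    for index, solver in enumerate(solvers):
        selector = solver.get("selector") or {}
        names = selector.get("dnsNames", [])
        zones = selector.get("dnsZones", [])
        matching_zones = [
            z for z in zones if domain == z or domain.endswith("." + z)
        ]
        if domain in names:
            rank = (2, 0)                      # exact name: most specific
        elif matching_zones:
            rank = (1, max(len(z) for z in matching_zones))  # longest zone
        elif not names and not zones:
            rank = (0, 0)                      # no selector: catch-all
        else:
            continue                           # selector set, no match
        # Strict '>' keeps the earlier solver when ranks are equal.
        if best_rank is None or rank > best_rank:
            best_index, best_rank = index, rank
    return best_index


solvers = [
    {"selector": {"dnsZones": ["internal.yourcompany.nl"]}, "dns01": {"route53": {}}},
    {"http01": {"ingress": {"ingressClassName": "nginx"}}},
]

print(pick_solver("api.internal.yourcompany.nl", solvers))  # 0 (DNS-01)
print(pick_solver("www.yourcompany.nl", solvers))           # 1 (HTTP-01)
```

With the two-solver configuration shown above, the result is the same whichever order the solvers are listed in, because the zone match always outranks the catch-all.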

Issuing certificates via Ingress annotations

cert-manager includes an ingress-shim component that watches Ingress resources for specific annotations. When it finds one, it automatically creates a Certificate resource on your behalf.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dashboard
  namespace: team-alpha
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - dashboard.yourcompany.nl
    secretName: dashboard-tls       # cert-manager stores the cert here
  rules:
  - host: dashboard.yourcompany.nl
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dashboard
            port:
              number: 8080

The annotation cert-manager.io/cluster-issuer: letsencrypt-prod triggers the shim. cert-manager reads the tls block, creates a Certificate for dashboard.yourcompany.nl, and stores the result in Secret dashboard-tls in namespace team-alpha.
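Conceptually, the shim maps each tls entry of an annotated Ingress to one Certificate named after the Secret. The sketch below models that mapping on plain dictionaries; it is an illustration of the idea rather than ingress-shim's real logic, and certificates_for_ingress is a hypothetical name.

```python
ISSUER_ANNOTATION = "cert-manager.io/cluster-issuer"


def certificates_for_ingress(ingress):
    """Return the Certificate manifests a shim would derive from one Ingress."""
    meta = ingress["metadata"]
    issuer = meta.get("annotations", {}).get(ISSUER_ANNOTATION)
    if issuer is None:
        return []  # no annotation, so the Ingress is ignored
    certs = []
    for tls in ingress.get("spec", {}).get("tls", []):
        if not tls.get("secretName") or not tls.get("hosts"):
            continue  # both fields are needed to build a Certificate
        certs.append({
            "apiVersion": "cert-manager.io/v1",
            "kind": "Certificate",
            "metadata": {
                "name": tls["secretName"],
                "namespace": meta.get("namespace", "default"),
            },
            "spec": {
                "secretName": tls["secretName"],
                "dnsNames": list(tls["hosts"]),
                "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
            },
        })
    return certs


dashboard_ingress = {
    "metadata": {
        "name": "dashboard",
        "namespace": "team-alpha",
        "annotations": {ISSUER_ANNOTATION: "letsencrypt-prod"},
    },
    "spec": {
        "tls": [{"hosts": ["dashboard.yourcompany.nl"], "secretName": "dashboard-tls"}],
    },
}

print(certificates_for_ingress(dashboard_ingress)[0]["metadata"])
# {'name': 'dashboard-tls', 'namespace': 'team-alpha'}
```

The model also shows why a misspelled annotation or a missing tls block produces nothing at all: both paths return an empty list without raising an error.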

Checkpoint. After applying the Ingress, verify the Certificate was created and is ready:

kubectl get certificate -n team-alpha

Expected output within 1–2 minutes:

NAME            READY   SECRET          AGE
dashboard-tls   True    dashboard-tls   90s

If no Certificate appears, check: (1) the annotation name is exact (typos are silent failures), (2) the tls block exists with secretName and hosts, and (3) cert-manager controller logs show no errors: kubectl logs -n cert-manager deploy/cert-manager --since=5m.

For the full list of supported annotations (duration, renewal window, key algorithm), see the ingress-shim documentation.

Issuing certificates via the Certificate resource

The Ingress annotation approach is convenient for straightforward cases. Use an explicit Certificate resource when you need:

  • Wildcard certificates (*.yourcompany.nl)
  • Certificates for non-HTTP services (gRPC, databases, mTLS)
  • Full control over key algorithm, duration, or renewal timing
  • Decoupled lifecycle from Ingress resources

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-yourcompany-nl
  namespace: team-alpha
spec:
  secretName: wildcard-yourcompany-nl-tls
  duration: 2160h       # 90 days (Let's Encrypt maximum)
  renewBefore: 720h     # renew 30 days before expiry
  dnsNames:
  - "*.yourcompany.nl"
  - yourcompany.nl      # wildcard does not cover the apex domain
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  privateKey:
    algorithm: ECDSA
    size: 256

Quote *.yourcompany.nl in YAML to avoid parser issues with the wildcard character. The apex domain (yourcompany.nl) needs its own entry because a wildcard certificate does not cover the bare domain.
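The reason the apex needs its own entry is that a wildcard stands in for exactly one left-most label. A small sketch of that matching rule, with covers as a hypothetical helper name:

```python
def covers(pattern, hostname):
    """True if a certificate name (possibly a wildcard) covers the hostname."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if not pattern.startswith("*."):
        return pattern == hostname
    suffix = pattern[1:]                 # e.g. ".yourcompany.nl"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]     # what the '*' has to stand for
    return label != "" and "." not in label


print(covers("*.yourcompany.nl", "app.yourcompany.nl"))      # True
print(covers("*.yourcompany.nl", "yourcompany.nl"))          # False
print(covers("*.yourcompany.nl", "a.b.yourcompany.nl"))      # False
```

The last case is worth remembering too: a wildcard does not reach into deeper subdomains, so a.b.yourcompany.nl would need *.b.yourcompany.nl or its own entry.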

The resulting Secret contains:

Key       Contents
tls.crt   Signed certificate chain (leaf + intermediates)
tls.key   Private key
ca.crt    Issuing CA certificate (empty for Let's Encrypt)

Checkpoint. Verify issuance:

kubectl describe certificate wildcard-yourcompany-nl -n team-alpha

Look for Status: True on the Ready condition and a Certificate issued successfully event.
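If you want to run this check from a script, the Ready condition can be read from the JSON form of the resource (kubectl get certificate <name> -n <namespace> -o json). A minimal sketch follows; ready_condition is a hypothetical helper, and the sample document is hand-written for illustration rather than captured from a cluster.

```python
import json


def ready_condition(certificate_json):
    """Return the Ready condition of a Certificate as a dict, or None."""
    cert = json.loads(certificate_json)
    for condition in cert.get("status", {}).get("conditions", []):
        if condition.get("type") == "Ready":
            return condition
    return None


# Hand-written sample, trimmed to the fields this sketch reads.
sample = """
{
  "kind": "Certificate",
  "metadata": {"name": "wildcard-yourcompany-nl", "namespace": "team-alpha"},
  "status": {
    "conditions": [
      {"type": "Ready", "status": "True", "reason": "Ready",
       "message": "Certificate is up to date and has not expired"}
    ]
  }
}
"""

condition = ready_condition(sample)
print(condition["status"], condition["message"])
# True Certificate is up to date and has not expired
```

A return value of None means cert-manager has not set a Ready condition yet, which usually just means the Certificate was created moments ago.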

Key rotation

cert-manager v1.20 defaults to rotationPolicy: Always, meaning each renewal generates a fresh private key. Applications that cache TLS keys in memory need to detect Secret updates and reload. The Reloader controller or a sidecar like configmap-reload can automate this.
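For an application that handles its own reload, one simple approach is to fingerprint the mounted files and react when the fingerprint changes. The sketch below assumes the Secret is mounted as files; ReloadWatcher, fingerprint, and the mount paths in the usage note are all hypothetical names for this example.

```python
import hashlib
from pathlib import Path


def fingerprint(cert_path, key_path):
    """Return a SHA-256 hex digest over the certificate and key files."""
    digest = hashlib.sha256()
    for path in (cert_path, key_path):
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


class ReloadWatcher:
    """Calls on_change when the mounted certificate or key has changed."""

    def __init__(self, cert_path, key_path, on_change):
        self.cert_path = cert_path
        self.key_path = key_path
        self.on_change = on_change
        self._last = fingerprint(cert_path, key_path)

    def check(self):
        """Compare against the last seen fingerprint; True if a reload ran."""
        current = fingerprint(self.cert_path, self.key_path)
        if current == self._last:
            return False
        self._last = current
        self.on_change()
        return True
```

In a Python server you might construct ReloadWatcher("/etc/tls/tls.crt", "/etc/tls/tls.key", reload) and call check() on a timer, where reload calls load_cert_chain on the server's ssl.SSLContext again so that new connections pick up the renewed pair. Reading file contents rather than modification times sidesteps the way mounted Secrets are swapped in via symlinks.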

Gateway API integration

cert-manager integrates with Kubernetes Gateway API as of v1.15 (beta). Instead of annotating Ingress resources, you annotate the Gateway itself:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: production-gateway
  namespace: infra
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  gatewayClassName: envoy
  listeners:
  - name: https
    hostname: app.yourcompany.nl
    port: 443
    protocol: HTTPS
    tls:
      mode: Terminate
      certificateRefs:
      - name: app-yourcompany-nl-tls
        kind: Secret

cert-manager creates a Certificate named app-yourcompany-nl-tls in namespace infra for DNS name app.yourcompany.nl. The listener must have tls.mode: Terminate (Passthrough is not supported) and a non-empty hostname.
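The listener rules above can be modelled on plain dictionaries. This is a simplified sketch of the mapping, not cert-manager's gateway-shim code; certificates_for_gateway is a hypothetical name, and listeners that share a Secret are merged into one Certificate here.

```python
ISSUER_ANNOTATION = "cert-manager.io/cluster-issuer"


def certificates_for_gateway(gateway):
    """Return the Certificate manifests derived from one annotated Gateway."""
    meta = gateway["metadata"]
    issuer = meta.get("annotations", {}).get(ISSUER_ANNOTATION)
    if issuer is None:
        return []
    namespace = meta.get("namespace", "default")
    hosts_by_secret = {}
    for listener in gateway.get("spec", {}).get("listeners", []):
        hostname = listener.get("hostname")
        tls = listener.get("tls") or {}
        # Gateway API defaults tls.mode to Terminate when it is omitted.
        if not hostname or tls.get("mode", "Terminate") != "Terminate":
            continue
        for ref in tls.get("certificateRefs", []):
            if ref.get("kind", "Secret") != "Secret":
                continue
            if ref.get("namespace", namespace) != namespace:
                continue  # cross-namespace references are skipped
            hosts = hosts_by_secret.setdefault(ref["name"], [])
            if hostname not in hosts:
                hosts.append(hostname)
    return [
        {
            "apiVersion": "cert-manager.io/v1",
            "kind": "Certificate",
            "metadata": {"name": secret, "namespace": namespace},
            "spec": {
                "secretName": secret,
                "dnsNames": hosts,
                "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
            },
        }
        for secret, hosts in hosts_by_secret.items()
    ]


production_gateway = {
    "metadata": {
        "name": "production-gateway",
        "namespace": "infra",
        "annotations": {ISSUER_ANNOTATION: "letsencrypt-prod"},
    },
    "spec": {
        "listeners": [{
            "name": "https",
            "hostname": "app.yourcompany.nl",
            "port": 443,
            "protocol": "HTTPS",
            "tls": {
                "mode": "Terminate",
                "certificateRefs": [{"name": "app-yourcompany-nl-tls", "kind": "Secret"}],
            },
        }],
    },
}

print(certificates_for_gateway(production_gateway)[0]["spec"]["dnsNames"])
# ['app.yourcompany.nl']
```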

This approach fits the Gateway API role model: platform engineers own the Gateway and its TLS configuration, while application teams own HTTPRoutes independently.

Limitation: Certificates can only be created in the same namespace as the Gateway. Cross-namespace secret references are not supported.

A note on ingress-nginx

ingress-nginx reached end of life in March 2026. cert-manager's Ingress annotations still work with it, and with actively maintained controllers such as Traefik, Contour, and HAProxy. For new deployments, consider Gateway API. For existing ingress-nginx clusters, working configurations do not need an urgent rewrite, but plan a migration path.

Monitoring certificate expiry

cert-manager exposes Prometheus metrics on port 9402. The most important one:

certmanager_certificate_expiration_timestamp_seconds reports the Unix timestamp when each certificate expires, labeled by name, namespace, issuer_name, issuer_kind, and issuer_group.
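Because the metric is an absolute timestamp, every expiry check boils down to subtracting the current time. The sketch below spells out that arithmetic with the 31-day warning threshold used in this tutorial; the function names are hypothetical, and in practice Prometheus evaluates the equivalent expression for you.

```python
import time

WARN_SECONDS = 31 * 24 * 3600  # 31 days, expressed in seconds


def seconds_left(expiry_timestamp, now=None):
    """Seconds until expiry; negative once the certificate has expired."""
    if now is None:
        now = time.time()
    return expiry_timestamp - now


def expires_soon(expiry_timestamp, now=None, threshold=WARN_SECONDS):
    """True when less than the threshold remains before expiry."""
    return seconds_left(expiry_timestamp, now) < threshold


now = 1_800_000_000                      # fixed clock for a repeatable example
print(expires_soon(now + 30 * 86400, now))   # True: 30 days left
print(expires_soon(now + 60 * 86400, now))   # False: 60 days left
```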

Enable Prometheus scraping

Add these values to your cert-manager Helm release:

prometheus:
  enabled: true
  podmonitor:
    enabled: true       # creates a PodMonitor for the Prometheus Operator

PrometheusRule alerts

These four rules cover the most critical failure modes (adapted from Philip Schmid's community ruleset):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cert-manager-alerts
  namespace: monitoring
spec:
  groups:
  - name: cert-manager
    interval: 30s
    rules:

    - alert: CertManagerAbsent
      expr: absent(up{job="cert-manager"})
      for: 1h
      labels:
        severity: critical
      annotations:
        summary: "cert-manager has disappeared from Prometheus service discovery"

    - alert: CertManagerCertExpireSoon
      expr: |
        certmanager_certificate_expiration_timestamp_seconds - time() < (31 * 24 * 3600)
        unless on(name, namespace)
        certmanager_certificate_ready_status{condition!="True"} == 1
      for: 24h
      labels:
        severity: warning
      annotations:
        summary: >-
          Certificate {{ $labels.namespace }}/{{ $labels.name }}
          expires in {{ $value | humanizeDuration }}

    - alert: CertManagerCertNotReady
      expr: |
        max by (name, namespace, condition) (
          certmanager_certificate_ready_status{condition!="True"} == 1
        )
      for: 24h
      labels:
        severity: critical
      annotations:
        summary: "Certificate {{ $labels.namespace }}/{{ $labels.name }} is not ready"

    - alert: CertManagerACMEErrors
      expr: |
        sum by (host, status, method) (
          rate(certmanager_http_acme_client_request_count{status!~"2.."}[1h])
        ) > 0
      for: 4h
      labels:
        severity: warning
      annotations:
        summary: "cert-manager is receiving non-2xx responses from the ACME endpoint"

Manual inspection with cmctl

cmctl is the official CLI for cert-manager. Two commands I use regularly:

# Full status of a certificate, including related CertificateRequest, Order, and Challenge
cmctl status certificate dashboard-tls -n team-alpha

# Force immediate renewal (bypasses the 2/3-lifetime schedule)
cmctl renew dashboard-tls -n team-alpha

Troubleshooting

When a certificate is stuck, walk the resource chain top-down:

Certificate → CertificateRequest → Order → Challenge

# Step 1: is the Certificate ready?
kubectl get certificate -n team-alpha
kubectl describe certificate dashboard-tls -n team-alpha

# Step 2: what does the CertificateRequest say?
kubectl get certificaterequest -n team-alpha
kubectl describe certificaterequest <name> -n team-alpha

# Step 3: for ACME, inspect Order and Challenge
kubectl get order,challenge -n team-alpha
kubectl describe order <name> -n team-alpha
kubectl describe challenge <name> -n team-alpha

# Step 4: cert-manager controller logs (last resort)
kubectl logs -n cert-manager deploy/cert-manager --since=15m

Common failures

Symptom                                            Likely cause
Challenge stuck in "pending"                       HTTP-01: port 80 blocked, or ingressClassName not set. DNS-01: wrong credentials or TXT record not propagating.
"too many certificates already issued"             Hit the Let's Encrypt rate limit: 5 certificates per exact domain set per 7 days. Workaround: add or remove a SAN to change the set.
No Certificate created after Ingress annotation    Missing cert-manager.io/cluster-issuer annotation (exact key, no typos) or missing tls block in Ingress spec.
DNS-01 failing on EKS with IRSA                    Using a namespace Issuer instead of ClusterIssuer. Ambient credentials only work for ClusterIssuers.

Let's Encrypt rate limits

Always validate with the staging issuer first. Production limits that trip up teams most often:

Limit                                   Value
Certificates per registered domain      50 per 7 days
Certificates per exact identifier set   5 per 7 days
New orders per account                  300 per 3 hours

Exceeding these requires waiting for the refill window. There is no manual reset.
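To get a feel for how long the wait is, the sketch below divides each window by its limit. It assumes capacity refills evenly across the window, which is a simplification for illustration; refill_interval and the LIMITS table are hypothetical names, and the numbers are the ones from the table above.

```python
from datetime import timedelta

# (limit, window) pairs taken from the table in this section.
LIMITS = {
    "certificates per registered domain": (50, timedelta(days=7)),
    "certificates per exact identifier set": (5, timedelta(days=7)),
    "new orders per account": (300, timedelta(hours=3)),
}


def refill_interval(limit, window):
    """Time for one unit of capacity to come back, assuming an even refill."""
    return window / limit


for name, (limit, window) in LIMITS.items():
    print(name, "->", refill_interval(limit, window))
# certificates per registered domain -> 3:21:36
# certificates per exact identifier set -> 1 day, 9:36:00
# new orders per account -> 0:00:36
```

Under that assumption, the exact-identifier-set limit is the painful one: after five identical requests, the next slot is roughly a day and a half away, which is the practical argument for testing against staging first.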

When to escalate

Before opening an issue or contacting support, collect:

  • cert-manager version (helm list -n cert-manager)
  • Kubernetes version (kubectl version)
  • Full output of cmctl status certificate <name> -n <namespace>
  • cert-manager controller logs for the relevant time window
  • kubectl describe output for the ClusterIssuer, Certificate, CertificateRequest, Order, and Challenge

What you learned

  • cert-manager automates the full ACME lifecycle: registration, challenge validation, certificate issuance, Secret storage, and renewal.
  • Two ClusterIssuers (staging + production) protect you from rate limits during setup and debugging.
  • HTTP-01 is the simple path for public-facing services. DNS-01 is required for wildcard certificates and private clusters.
  • Ingress annotations (cert-manager.io/cluster-issuer) are convenient for standard HTTP workloads, and the same annotation on a Gateway covers Gateway API listeners. The explicit Certificate resource gives full control and is the right tool for wildcards, non-HTTP services, and a lifecycle decoupled from Ingress.
  • certmanager_certificate_expiration_timestamp_seconds is the Prometheus metric to alert on. Set a warning at 31 days, critical when the Certificate is not ready for 24 hours.
  • Troubleshoot by walking the resource chain: Certificate, CertificateRequest, Order, Challenge. cmctl status certificate shows the full picture in one command.
