Kubernetes CI/CD with GitHub Actions: building and deploying container images

A GitHub Actions workflow that builds a container image, pushes it to a registry, and rolls out the new version to a Kubernetes cluster is a three-part problem: build, authenticate, deploy. Each part has a right way and a tempting wrong way. This tutorial walks through a production-grade pipeline: building and tagging images with docker/build-push-action, pushing to GHCR, authenticating to a cluster with short-lived OIDC credentials or a scoped ServiceAccount, and triggering a rollout. It ends with a complete workflow file you can drop into your repository.

What you will learn

By the end of this tutorial you will have a working GitHub Actions workflow that builds a Docker image, tags it with the Git commit SHA, pushes it to GitHub Container Registry, and updates a Kubernetes Deployment to use the new image. You will understand why OIDC beats stored kubeconfigs, how to scope a ServiceAccount so a compromised workflow cannot take over the cluster, and the trade-off between imperative kubectl set image and declarative manifest commits.

Prerequisites

  • A Kubernetes cluster you can reach from GitHub-hosted runners (managed GKE, EKS, AKS, DigitalOcean, or a self-hosted cluster with a public API endpoint or a self-hosted runner inside the network)
  • A GitHub repository containing your application source and a Dockerfile
  • An existing Kubernetes Deployment for the application (if you do not have one yet, a Docker Compose to Kubernetes migration walks through creating one)
  • kubectl 1.28 or later installed locally for cluster setup commands
  • Familiarity with Kubernetes RBAC so the scoped ServiceAccount in step 3 makes sense

CI builds images, CD updates clusters

A Kubernetes deploy pipeline has two halves that do fundamentally different jobs. Mixing them in one monolithic step is the most common reason these workflows become brittle.

Continuous integration (CI) takes source code and produces an immutable artifact. For a containerized application that artifact is a Docker image. The image gets a unique, content-addressable name, typically the Git commit SHA, and lands in a registry. At this stage nothing has changed in your cluster.

Continuous deployment (CD) takes an existing artifact and applies it to a cluster. The cluster's Deployment is patched to use the new image tag. Kubernetes performs a rolling update and the new version replaces the old one pod by pod.

The split matters because CI is idempotent and safe to retry, while CD changes live production. You want CI to run on every commit, but CD only on the right branch, after tests pass, ideally with a manual approval gate for production. GitHub Actions models this with separate jobs that depend on each other via needs:.
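
A minimal skeleton of that shape, with illustrative job names and placeholder steps:

```yaml
jobs:
  build:                       # CI: produce the immutable artifact
    runs-on: ubuntu-latest
    steps:
      - run: echo "build, tag, and push the image here"

  deploy:                      # CD: apply the artifact to the cluster
    runs-on: ubuntu-latest
    needs: build               # runs only after build succeeds
    if: github.ref == 'refs/heads/main'   # and only on the deploy branch
    steps:
      - run: echo "patch the Deployment here"
```

If the build job fails, the deploy job never starts, which is exactly the gate you want between the two halves.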

Set up GHCR authentication

GitHub Container Registry (GHCR) is the path of least resistance because it needs no separate account, no stored credentials, and the built-in GITHUB_TOKEN handles authentication. GitHub automatically grants admin permission to packages in the repository that publishes them.

The workflow needs two permissions: contents: read to check out the repository, and packages: write to push to GHCR:

permissions:
  contents: read
  packages: write

Logging in uses docker/login-action (v4.1.0 at the time of writing) with github.actor as the username and the auto-injected GITHUB_TOKEN as the password:

- name: Log in to GHCR
  uses: docker/login-action@v4
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}

The token is scoped to the workflow run and expires when the job ends. There is nothing to rotate, nothing to leak into CI logs, nothing to revoke if a contributor leaves.

Docker Hub and ECR differ. Docker Hub needs a personal access token stored as a repository secret (DOCKERHUB_TOKEN) and the same docker/login-action with registry: docker.io. Amazon ECR uses aws-actions/configure-aws-credentials to obtain short-lived credentials via OIDC, then aws-actions/amazon-ecr-login performs the Docker login. The build step that comes next is identical regardless of registry.
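
For comparison, a sketch of the Docker Hub variant, assuming the username lives in a repository variable and the access token in a secret named DOCKERHUB_TOKEN (both names are illustrative):

```yaml
- name: Log in to Docker Hub
  uses: docker/login-action@v4
  with:
    registry: docker.io
    username: ${{ vars.DOCKERHUB_USERNAME }}   # repository variable, not a secret
    password: ${{ secrets.DOCKERHUB_TOKEN }}   # personal access token
```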

Build and tag the Docker image

Two rules drive the build step. First, tag with the commit SHA, never with :latest. Second, let docker/metadata-action generate the tag list so the logic lives in declarative configuration instead of ad-hoc shell.

Why not :latest? Because :latest is a pointer, not a version. If a pod restarts and the node's image cache has expired, Kubernetes may pull a different image than the one that was running. Rollbacks become impossible because there is no record of which :latest was live when. The fix is to make every image tag content-addressable. The commit SHA is the obvious choice: it is unique, immutable, and traceable back to the exact source that built it.
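
The tagging rule in miniature, with a made-up SHA and image name; in the real workflow, github.sha supplies the value:

```shell
# Content-addressable tag: the image name encodes the exact commit that built it
SHA="4f9d2c0aa81b3e7d6c5b4a3928171605f4e3d2c1"
IMAGE="ghcr.io/example-org/example-app:sha-${SHA}"
echo "$IMAGE"
```

Given that tag, you can always answer "what is running in production right now?" with a single git checkout.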

Chain three actions. docker/setup-buildx-action provisions the Buildx builder. docker/metadata-action computes the tag list. docker/build-push-action (v7.1.0) runs the build and push:

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v4

- name: Generate image metadata
  id: meta
  uses: docker/metadata-action@v6
  with:
    images: ghcr.io/${{ github.repository }}
    tags: |
      type=sha,format=long          # ghcr.io/org/repo:sha-abc123def456...
      type=ref,event=branch         # ghcr.io/org/repo:main
      type=semver,pattern={{version}}   # ghcr.io/org/repo:1.4.2 on tag push

- name: Build and push
  uses: docker/build-push-action@v7
  with:
    context: .
    push: true
    tags: ${{ steps.meta.outputs.tags }}
    labels: ${{ steps.meta.outputs.labels }}
    cache-from: type=gha            # pull layer cache from GitHub Actions cache
    cache-to: type=gha,mode=max     # store every layer for next run

The cache-from and cache-to lines cut build times significantly on incremental changes. Without them, every run rebuilds every layer from scratch.

Checkpoint. After this step runs successfully, the repository's Packages tab shows the image with at least two tags: main and sha-<40-character-hash>. The long-form SHA tag is the one CD will pin to.

Authenticate kubectl to the cluster

This is where most pipelines go wrong. The tempting shortcut is to base64 a kubeconfig file, paste it into a GitHub secret, and echo $KUBECONFIG | base64 -d > ~/.kube/config in the workflow. It works. It also hands any compromised workflow permanent, unbounded access to your cluster.

Use OIDC for managed clusters. Every major cloud now supports federated identity: GitHub Actions requests a short-lived token directly from the cloud provider, which then issues temporary cluster credentials scoped to the workflow. GitHub's OIDC documentation describes the mechanism: tokens are "only valid for a single job, and then automatically expire." No long-lived secrets in GitHub.

The setup varies by provider. For EKS:

permissions:
  id-token: write     # required for OIDC token request
  contents: read

steps:
  - name: Configure AWS credentials
    uses: aws-actions/configure-aws-credentials@v6
    with:
      role-to-assume: arn:aws:iam::123456789012:role/github-actions-deployer
      aws-region: eu-west-1

  - name: Update kubeconfig
    run: aws eks update-kubeconfig --name production-cluster --region eu-west-1

The IAM role's trust policy restricts which repository and branch can assume it, so a fork or feature branch cannot steal production access. The AWS Configure Credentials README shows the full trust-policy pattern with repo:<org>/<repo>:ref:refs/heads/main as the condition.
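
A sketch of that trust policy (the account ID, org, and repo are placeholders; the sub condition is what pins the role to one repository and branch):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:example-org/example-repo:ref:refs/heads/main"
        }
      }
    }
  ]
}
```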

For GKE, swap to google-github-actions/auth@v2 with Workload Identity Federation. For AKS, azure/login@v2 with a federated credential on a Managed Identity. The principle is identical: no stored keys.

Use a scoped ServiceAccount token for self-hosted clusters. If your cluster is not on a cloud provider that supports OIDC federation, create a namespace-scoped ServiceAccount with the minimum RBAC permissions it needs, then generate a short-lived token per workflow run. The Kubernetes RBAC guide covers the pattern in detail. Minimum viable role:

# cluster-side manifest. Apply once, commit to your GitOps repo
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-deployer
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer
  namespace: production
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "patch", "update"]   # no create, no delete
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]                      # read-only for rollout status
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer
  namespace: production
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: production
roleRef:
  kind: Role
  name: deployer
  apiGroup: rbac.authorization.k8s.io

That account can patch existing Deployments in one namespace. It cannot create workloads, cannot read Secrets, cannot touch anything in other namespaces, and cannot escalate. A stolen token is a contained blast radius, not a cluster takeover.

Generate a fresh token per run (Kubernetes 1.24+):

# Run this on your admin workstation or as part of an out-of-band step
kubectl -n production create token ci-deployer --duration=1h

Store that token as a repository secret only if you cannot use OIDC. Even then, rotate it often. The moment you have a path to OIDC, take it.
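
One way to hand that token to kubectl is a minimal kubeconfig written by the workflow; every value below is a placeholder:

```yaml
# Hypothetical minimal kubeconfig; the workflow assembles this from secrets
apiVersion: v1
kind: Config
clusters:
  - name: production
    cluster:
      server: https://k8s.example.com:6443
      certificate-authority-data: <base64-encoded cluster CA certificate>
users:
  - name: ci-deployer
    user:
      token: <output of kubectl create token>
contexts:
  - name: ci
    context:
      cluster: production
      user: ci-deployer
      namespace: production
current-context: ci
```

Before trusting the pipeline, you can verify the scoping from an admin workstation with kubectl auth can-i patch deployments -n production --as=system:serviceaccount:production:ci-deployer (should answer yes), then repeat with the delete verb (should answer no).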

Install kubectl in the runner. The azure/setup-kubectl action (v5.1.0) handles this without needing Azure itself:

- name: Install kubectl
  uses: azure/setup-kubectl@v5
  with:
    version: v1.30.0          # pin to match your cluster's minor version

Imperative vs declarative deployment

You have built and pushed an image. You have kubectl authenticated. Now the Deployment needs to pick up the new image. Two patterns exist, and they differ in where the source of truth lives.

Imperative: kubectl set image. One line in the workflow patches the live Deployment:

- name: Update deployment
  run: |
    kubectl set image deployment/my-app \
      my-app=ghcr.io/${{ github.repository }}:sha-${{ github.sha }} \
      -n production

    # Wait for rollout to finish; fail the job if it does not
    kubectl rollout status deployment/my-app -n production --timeout=5m

The deploy is immediate. The cluster is the source of truth, so the live state is what is running. The trade-off: your Git repository no longer reflects reality. Someone reading your manifests sees the old image tag, and a kubectl apply from Git would silently downgrade production. Disaster recovery means rebuilding and redeploying rather than replaying manifests.

Use the imperative pattern when speed matters and you accept that the cluster is authoritative.

Declarative: commit the manifest change. The workflow edits the manifest file, commits it, and either applies from Git or lets a GitOps controller pick it up:

- name: Update manifest
  run: |
    yq eval -i \
      '.spec.template.spec.containers[0].image = "ghcr.io/${{ github.repository }}:sha-${{ github.sha }}"' \
      k8s/production/deployment.yaml

- name: Commit and push
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add k8s/production/deployment.yaml
    git commit -m "deploy: update my-app to sha-${{ github.sha }}"
    git push

Git stays the source of truth. Every deploy is a commit with a message, a diff, and a reviewable history. Rollbacks become git revert. The trade-off: the workflow needs write access to the manifest repository, which means a token with contents: write or a dedicated deploy key, plus a second loop (a GitOps controller or an apply step) to reconcile the commit into the cluster.
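
The rollback story is worth seeing concretely. This throwaway-repo sketch (file contents and image names are made up) shows that undoing a declarative deploy is just reverting the deploy commit:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name ci
git config user.email ci@example.com

# Deploy 1: manifest pins the known-good image
echo 'image: ghcr.io/example/app:sha-aaa' > deployment.yaml
git add deployment.yaml
git commit -qm "deploy: update app to sha-aaa"

# Deploy 2: the bad release
echo 'image: ghcr.io/example/app:sha-bbb' > deployment.yaml
git commit -qam "deploy: update app to sha-bbb"

# Rollback: revert the deploy commit; the reconciler does the rest
git revert --no-edit HEAD >/dev/null
cat deployment.yaml    # image: ghcr.io/example/app:sha-aaa
```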

For most teams, declarative wins because it makes audit, review, and rollback trivially easy. That loop is best handled by Argo CD rather than a direct kubectl apply from the workflow. The next-steps section comes back to this.

What this workflow is NOT

Three misconceptions regularly sink CI/CD pipelines before they reach production. Name them explicitly.

It is not "store kubeconfig in GitHub Secrets." A kubeconfig is long-lived, full-cluster credentials. Storing it in a repository secret is functionally equivalent to publishing your cluster's root password with pastebin-grade access controls. Use OIDC federation for managed clusters. Use a namespace-scoped ServiceAccount with a short-lived token for self-hosted clusters. The minute you type echo "${{ secrets.KUBECONFIG }}" | base64 -d, stop and pick a different approach.

It is not "use :latest as the Docker tag." The :latest tag is a pointer that can change without warning. Kubernetes uses image digests internally, but :latest triggers inconsistent pulls across nodes, breaks rollback, and makes incident postmortems guesswork. Tag with the commit SHA. If you want human-readable tags for production releases, add semantic version tags alongside the SHA, never instead of it. The GitHub Actions CI/CD best practices guide flags :latest as a direct anti-pattern.

It is not "you need Argo CD or Jenkins to deploy to Kubernetes." A single GitHub Actions workflow with kubectl set image or kubectl apply -f is enough for a small team running one or two environments. Argo CD and Flux pay off when you have multiple clusters, many applications, or strict audit requirements. Until then, the simplest pipeline that builds, pushes, and patches is the right tool. Graduate when the pain of manual orchestration exceeds the operational cost of another controller.

Complete workflow file

Put this in .github/workflows/deploy.yaml. Comments explain each non-obvious line:

name: Build and deploy to Kubernetes

on:
  push:
    branches: [main]       # CI runs on every push; only main triggers CD

concurrency:
  group: deploy-production
  cancel-in-progress: false  # never cancel a mid-flight deploy

jobs:
  build:
    name: Build and push image
    runs-on: ubuntu-latest

    permissions:
      contents: read           # checkout
      packages: write          # push to GHCR

    outputs:
      image-tag: sha-${{ github.sha }}  # pass SHA tag to deploy job

    steps:
      - uses: actions/checkout@v5

      - name: Log in to GHCR
        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Buildx
        uses: docker/setup-buildx-action@v4

      - name: Generate image metadata
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=sha,format=long
            type=ref,event=branch

      - name: Build and push
        uses: docker/build-push-action@v7
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    name: Deploy to production
    runs-on: ubuntu-latest
    needs: build               # blocks until build succeeds
    environment: production    # opt in to GitHub environment protections

    permissions:
      id-token: write          # request OIDC token from GitHub
      contents: read

    steps:
      - uses: actions/checkout@v5

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v6
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deployer
          aws-region: eu-west-1

      - name: Install kubectl
        uses: azure/setup-kubectl@v5
        with:
          version: v1.30.0

      - name: Update kubeconfig
        run: aws eks update-kubeconfig --name production-cluster --region eu-west-1

      - name: Roll out new image
        run: |
          IMAGE=ghcr.io/${{ github.repository }}:sha-${{ github.sha }}
          kubectl set image deployment/my-app my-app=$IMAGE -n production
          kubectl rollout status deployment/my-app -n production --timeout=5m

      - name: Verify pods are healthy
        run: |
          # Fail the job unless every pod behind the app label becomes Ready
          kubectl wait pod -l app=my-app -n production \
            --for=condition=Ready --timeout=2m

Final verification. After the workflow completes on a push to main:

  1. The Actions tab shows both jobs green.
  2. kubectl describe deployment my-app -n production shows the Image: line pointing at ghcr.io/<org>/<repo>:sha-<commit>.
  3. kubectl rollout history deployment/my-app -n production includes a new revision.
  4. Requests to the service return responses from the new version.

What you learned

You built a CI/CD pipeline that obeys three rules.

Build once, deploy anywhere. The image is an immutable artifact tagged with the commit SHA. The same artifact that passed CI is what gets deployed, not a rebuild that might differ.

Short-lived credentials only. OIDC federation for managed clusters, namespace-scoped ServiceAccount tokens for self-hosted ones. No long-lived kubeconfigs, no static cloud keys in GitHub Secrets.

Two jobs, two jobs' worth of permissions. The build job has packages: write. The deploy job has id-token: write. Neither has more than it needs. A compromised build step cannot deploy. A compromised deploy step cannot push images.

Where to go next. When you run multiple environments or clusters, the imperative kubectl set image pattern starts to strain. At that point, graduate to the declarative model: the workflow commits the new image tag to a manifest repository, and Argo CD reconciles it into each cluster. You keep the same build step; the deploy step becomes git commit. For applications that need safer rollouts than the default Deployment strategy, pair this workflow with zero-downtime rolling update configuration so the cluster handles pod replacement gracefully while GitHub Actions handles everything upstream of it.

