GitHub Actions Workflow Error: Runner Out of Disk Space During Build

Troubleshoot and resolve 'runner out of disk space' errors in GitHub Actions workflows. Optimize build artifacts, cache management, and runner strategies to prevent failures.


Introduction

“Out of disk space” errors during a CI/CD build are a common and frustrating failure mode. In GitHub Actions they halt the build and block any deployment that depends on it. Whether you’re compiling a large application, pulling numerous dependencies, or generating substantial build artifacts, exhausting the runner’s storage stops the pipeline. This guide walks through diagnosing the problem, understanding its root causes, and applying fixes so that your GitHub Actions workflows run reliably.

Symptom & Error Signature

Users typically observe their GitHub Actions workflow failing during a build, install, or package step. The workflow log will show an explicit error message indicating that the runner has run out of disk space. This can occur during various operations, such as:

  • Cloning a large Git repository (especially with deep history or many LFS objects).
  • Installing package manager dependencies (e.g., node_modules for Node.js, vendor/ for PHP Composer, ~/.m2/repository for Maven, Python virtual environments).
  • Compiling large projects or generating intermediate build artifacts.
  • Building Docker images with many layers or large base images.
  • Archiving or packaging final deliverables.

Common error messages you might encounter include:

Error: ENOSPC: no space left on device, write
No space left on device
/usr/bin/tar: write error: No space left on device
Run npm install
npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error.
...
npm ERR! A complete log of this run can be found in:
npm ERR!     /home/runner/.npm/_logs/2023-01-01T00_00_00_000Z-debug-0.log
Error: Process completed with exit code 1.

If using Docker:

error building image: error creating layer with ID "sha256:...": write /var/lib/docker/overlay2/.../file: no space left on device
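
When triaging a downloaded job log, it can help to check mechanically for the signatures listed above. The following is a minimal sketch, not part of any GitHub tooling; the function name has_disk_space_error and the file name build.log are hypothetical.

```shell
# Hypothetical helper: returns 0 if the given log file contains a
# disk-exhaustion signature, 1 otherwise.
has_disk_space_error() {
  # $1 = path to a saved workflow log
  # -E: extended regex, -i: case-insensitive, -q: quiet (exit status only)
  grep -Eiq 'ENOSPC|no space left on device' -- "$1"
}

# Example (build.log is a placeholder for a log you downloaded):
#   has_disk_space_error build.log && echo "disk exhaustion detected"
```

Matching is case-insensitive because tools capitalize the message differently ("No space left on device" from tar, "no space left on device" from Docker).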

Root Cause Analysis

The underlying reasons for a GitHub Actions runner running out of disk space are multifaceted, stemming from the ephemeral nature of CI/CD environments and the demands of modern software development.

  1. Ephemeral Nature of Runners (GitHub-Hosted): GitHub-hosted runners provide a fresh virtual machine for each job, with a fixed amount of disk space (GitHub documents 14 GB of SSD storage for standard Linux runners such as ubuntu-latest). The runner image also ships with a large set of pre-installed tools and SDKs, so the space actually free for your job is limited and can be insufficient for demanding builds.
  2. Accumulated Debris (Self-Hosted): Self-hosted runners, if not properly maintained, can accumulate large amounts of temporary files, Docker images, build caches, and artifacts from previous runs, gradually filling up the disk.
  3. Large Repositories: Repositories with extensive commit history, numerous large binary files (especially without Git LFS), or repositories containing other repositories as submodules can consume significant space during the actions/checkout step.
  4. Bulky Dependencies: Modern applications often rely on thousands of external packages. Package managers (npm, yarn, Composer, Maven, pip, cargo) can download and install hundreds of megabytes, or even gigabytes, of dependencies into node_modules, vendor, .m2, or virtual environments.
  5. Intermediate Build Artifacts: Compilers, transpilers, and build tools generate temporary files, object files, and intermediate assets during the build process. If these are not cleaned up, they can quickly consume available disk space. Examples include target/ directories in Java/Rust, or temporary directories created by build scripts.
  6. Docker Image Layers: When building Docker images, each instruction in the Dockerfile creates a new layer. If these layers contain large files or inefficiently cached steps, the resulting build context and intermediate images can become very large. Additionally, the Docker build cache itself can consume significant space on the runner.
  7. Large Output Artifacts: If your workflow generates extensive reports, logs, compiled binaries, or archives that are then staged for artifact upload, these temporary files can contribute to disk exhaustion before they are offloaded.
  8. Insufficient Runner Provisioning (Self-Hosted): The virtual machine or container hosting your self-hosted runner might simply be provisioned with inadequate disk resources for your specific build requirements.

Step-by-Step Resolution

Addressing “out of disk space” errors requires a systematic approach, combining proactive optimization with reactive cleanup strategies.

1. Diagnose Disk Usage Within the Workflow

The first step is to identify what is consuming the disk space and when during the workflow. Add diagnostic steps to your workflow to print disk usage before and after critical operations.

jobs:
  build:
    runs-on: ubuntu-latest # or your self-hosted runner label
    steps:
      - name: Check disk space before checkout
        run: df -h

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history; keep only if the build needs it (e.g., versions derived from tags). The default is 1.

      - name: Check disk space after checkout
        run: df -h

      - name: Install dependencies (e.g., Node.js)
        run: |
          npm install

      - name: Check disk space after dependency install
        run: |
          df -h
          echo "Top 10 largest directories in workspace:"
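          sudo du -h --max-depth=1 . | sort -rh | head -n 11 # First line is the workspace total
          echo "Top 10 largest directories in /var/lib/docker (if applicable):"
          sudo du -h --max-depth=1 /var/lib/docker | sort -rh | head -n 11 # For Docker builds; avoids a shell glob that the runner user cannot expand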

      - name: Build project
        run: npm run build

      - name: Check disk space after build
        run: df -h

[!IMPORTANT] The du commands use sudo because the runner user usually doesn’t have permission to traverse all directories, especially /var/lib/docker. Compare the df -h output between steps to see which step consumes the most space, then use the du output to pinpoint the culprit directory.
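
Beyond printing usage, a step can fail fast with a clear message when too little space is left, instead of dying midway through a build. This is a minimal sketch under stated assumptions: the function names free_gib and require_free_gib and the thresholds are hypothetical, and the script relies only on POSIX df output.

```shell
# Hypothetical guard: report free space and fail early when it is too low.
free_gib() {
  # $1 = path to check (defaults to the current directory)
  # df -P prints one line per filesystem; -k reports 1 KiB blocks.
  # Column 4 of the second line is the available space.
  df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

require_free_gib() {
  # $1 = minimum GiB required, $2 = path to check (defaults to current dir)
  avail="$(free_gib "${2:-.}")"
  if [ "$avail" -lt "$1" ]; then
    echo "Only ${avail} GiB free, need at least $1 GiB" >&2
    return 1
  fi
  echo "${avail} GiB free (minimum $1 GiB)"
}

# Example: require_free_gib 5 . || exit 1
```

Placed in a run: step before an expensive build, a non-zero return stops the job with an explicit message rather than an ENOSPC error deep inside a tool.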

2. Optimize Repository Size and Checkout Strategy

For large repositories, especially those with extensive history or binary files:

  • Use Git LFS: Ensure large binary files (images, videos, executables) are tracked with Git LFS (Large File Storage). This keeps the main Git repository lean.
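  • Shallow Clones: Most CI/CD builds don’t need the entire commit history. actions/checkout@v4 already defaults to fetch-depth: 1 (a shallow clone of a single commit), so make sure the workflow does not override it with fetch-depth: 0 unless the history is required. Setting the value explicitly documents the intent: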
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1 # Only fetches the last commit
          # submodules: true # Uncomment if you have submodules and need them
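
To judge how much a shallow clone saves for a particular repository, the two clone modes can be compared locally. The sketch below assumes git is installed; the function name compare_clone_depth and its arguments are hypothetical.

```shell
# Hypothetical comparison: clone the same repository fully and shallowly,
# then report the commit count and .git size of each clone.
compare_clone_depth() {
  # $1 = repository URL, $2 = empty scratch directory
  git clone --quiet "$1" "$2/full"
  git clone --quiet --depth 1 "$1" "$2/shallow"
  for d in full shallow; do
    printf '%s: %s commits, %s\n' "$d" \
      "$(git -C "$2/$d" rev-list --count HEAD)" \
      "$(du -sh "$2/$d/.git" | cut -f1)"
  done
}

# Example (replace the URL with your own repository):
#   compare_clone_depth https://github.com/OWNER/REPO.git "$(mktemp -d)"
```

For local paths, use a file:// URL, since git ignores --depth when cloning from a plain filesystem path.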

3. Implement Smarter Caching

Caching dependencies and build outputs avoids re-downloading and re-building them on every run. Note that a restored cache still occupies disk space on the runner, so cache only what you need and keep the cached paths narrow. The actions/cache action is your primary tool.

      - name: Cache Node.js modules
        uses: actions/cache@v4
        with:
          path: ~/.npm # Path to the cache directory
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }} # Cache key based on OS and lock file
          restore-keys: |
            ${{ runner.os }}-node- # Fallback key

      - name: Install dependencies
        run: npm ci # Use npm ci for clean installs if a lock file exists

      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-

      - name: Cache Docker layers (if using buildx)
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-docker-buildx-${{ github.sha }} # Use a key that changes with source code
          restore-keys: |
            ${{ runner.os }}-docker-buildx-

      # Example: Cache Composer dependencies
      - name: Cache Composer dependencies
        uses: actions/cache@v4
        with:
          path: vendor
          key: ${{ runner.os }}-php-${{ hashFiles('**/composer.lock') }}
          restore-keys: |
            ${{ runner.os }}-php-

[!IMPORTANT] A well-chosen cache key is crucial. It must change when dependencies change (e.g., package-lock.json, pom.xml), but remain stable otherwise to ensure cache hits.
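
The behavior of a lock-file-based key can be illustrated locally. The sketch below is only an analogy: hashFiles computes its digest differently, and the function name cache_key and the Linux-node prefix are hypothetical. It shows the property that matters, which is that the key is stable while the lock file is unchanged and differs as soon as its content changes.

```shell
# Hypothetical local analogue of a lock-file-based cache key:
# a prefix plus the SHA-256 digest of the lock file's content.
cache_key() {
  # $1 = key prefix (e.g. Linux-node), $2 = lock file path
  if command -v sha256sum >/dev/null 2>&1; then
    digest="$(sha256sum "$2" | cut -d' ' -f1)"
  else
    # Fallback for systems that ship shasum instead (e.g. macOS)
    digest="$(shasum -a 256 "$2" | cut -d' ' -f1)"
  fi
  printf '%s-%s\n' "$1" "$digest"
}

# Example: cache_key Linux-node package-lock.json
```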

4. Clean Up Intermediate Files & Docker Layers

Explicitly delete unnecessary files and optimize Docker builds to reduce footprint.

a. Workflow Cleanup Steps:

Add steps to remove large directories or temporary files after they are no longer needed.

      - name: Build the project
        run: npm run build

      - name: Remove node_modules to free space (if not needed for subsequent steps)
        run: rm -rf node_modules
        # This is useful if node_modules is only needed for building,
        # and then the output artifact is uploaded, without needing npm for later steps.

      - name: Remove temporary build directories (e.g., Java target)
        run: rm -rf target/
        # Adjust path based on your build system
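
When several directories are candidates for removal, it can be useful to log how much each deletion reclaimed. The following is a minimal sketch; the function name remove_and_report is hypothetical.

```shell
# Hypothetical helper: delete a directory and log its size beforehand,
# so the workflow log shows how much space each cleanup step freed.
remove_and_report() {
  # $1 = directory to delete
  if [ ! -d "$1" ]; then
    echo "skip: $1 does not exist"
    return 0
  fi
  size="$(du -sh "$1" | cut -f1)"
  rm -rf -- "$1"
  echo "removed $1 (${size})"
}

# Example: remove_and_report node_modules
```

Skipping missing directories keeps the step from failing when an earlier step was skipped or already cleaned up.
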

b. Docker Build Optimization:

  • Multi-Stage Builds: Use multi-stage Docker builds to separate build-time dependencies from runtime dependencies, resulting in smaller final images.
  • .dockerignore: Use a .dockerignore file to prevent unnecessary files from being copied into the build context.
  • Cache Mounts and In-Layer Cleanup: With BuildKit, use RUN --mount=type=cache,target=/root/.cache/go-build or RUN --mount=type=tmpfs,target=/tmp so that caches and scratch files stay out of the image layers. Delete temporary files with rm -rf in the same RUN instruction that creates them; deleting them in a later instruction does not shrink the earlier layer.
  • --squash (Experimental): The --squash flag for docker build merges the newly built layers into one, which can reduce the size of the final image. It requires experimental features to be enabled on the Docker daemon, is not recommended for production, and does not reduce disk usage during the build itself, because the original layers remain stored on the runner.

# Example Multi-stage Dockerfile
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

5. Prune Docker Images/Volumes (Self-Hosted Runners)

For self-hosted runners, Docker can consume vast amounts of disk space with old images, containers, and volumes. Regular pruning is essential.

      - name: Prune Docker system (for self-hosted runners)
        if: ${{ runner.os == 'Linux' && runner.environment == 'self-hosted' }} # Only run on Linux self-hosted runners (runner.name is an arbitrary runner name, not a reliable indicator)
        run: |
          docker system prune -a -f --volumes # Removes all unused containers, networks, images (dangling and unreferenced), and volumes

[!WARNING] docker system prune -a -f --volumes is an aggressive command. Ensure no other critical services on the same host rely on these Docker resources before executing, or restrict it to dedicated runner hosts.
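
A less aggressive alternative is to prune only when the disk is actually filling up, and to spare recently created resources. The sketch below assumes Docker is installed on the runner host; the function names, the 24h window, and the example threshold are hypothetical choices. The until filter is not combined with --volumes here, since volume pruning does not support it.

```shell
# Hypothetical threshold-based prune for self-hosted runners.
disk_used_pct() {
  # Prints the used percentage (integer, no % sign) of the filesystem
  # that holds $1 (defaults to /).
  df -P "${1:-/}" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

prune_if_above() {
  # $1 = threshold percent, $2 = path to check (default /var/lib/docker)
  used="$(disk_used_pct "${2:-/var/lib/docker}")"
  if [ "$used" -ge "$1" ]; then
    echo "usage ${used}% >= $1%, pruning"
    # The until filter keeps anything created within the last 24 hours.
    docker system prune -a -f --filter "until=24h"
  else
    echo "usage ${used}% < $1%, nothing to do"
  fi
}

# Example: prune_if_above 80
```

Keeping recent images preserves the layer cache for the builds that run most often, while still bounding long-term growth.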

For periodic, automated cleanup on self-hosted runners, consider a systemd timer:

Create /etc/systemd/system/docker-prune.service:

[Unit]
Description=Clean up Docker system
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker system prune -a -f --volumes

Create /etc/systemd/system/docker-prune.timer:

[Unit]
Description=Run Docker prune daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the timer:

sudo systemctl daemon-reload # Pick up the newly created unit files
sudo systemctl enable --now docker-prune.timer # Enable at boot and start immediately

6. Allocate More Disk Space (Self-Hosted Runners)

If all optimization efforts fail to provide sufficient space, you may simply need more disk on your self-hosted runner.

  • Increase VM Disk Size: For cloud VMs (AWS EC2, Azure VM, Google Cloud Compute), increase the allocated disk size for the instance.

  • Extend Logical Volume (LVM): If your Linux system uses LVM, you can extend the logical volume to utilize newly added disk space.

    # Assuming /dev/sdX is the new disk/partition and /dev/vg0/lv_root is your root LVM
    sudo pvcreate /dev/sdX
    sudo vgextend vg0 /dev/sdX
    sudo lvextend -l +100%FREE /dev/vg0/lv_root # Extend to use all free space in vg0
    sudo resize2fs /dev/vg0/lv_root # For ext4 filesystem
    # For XFS: sudo xfs_growfs /
    df -h # Verify new space

  • Increase Container Quota: If your runner is running within a container (e.g., Docker container), ensure the host has enough space, and if applicable, increase any quota limits imposed on the container.

[!IMPORTANT] Always back up your data before performing disk resizing operations.
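
Which grow command applies depends on the filesystem type, as the comments in the LVM example note. The following sketch encodes that choice; the function name grow_command_for is hypothetical, and it only prints the command rather than running it, so the output can be reviewed first.

```shell
# Hypothetical helper: print the filesystem grow command that matches
# the filesystem type, without executing anything.
grow_command_for() {
  # $1 = filesystem type, $2 = block device, $3 = mount point
  case "$1" in
    ext2|ext3|ext4) echo "resize2fs $2" ;;   # ext* is grown via the device
    xfs)            echo "xfs_growfs $3" ;;  # XFS is grown via the mount point
    *)              echo "unsupported filesystem: $1" >&2; return 1 ;;
  esac
}

# Example (findmnt reports the filesystem type of the root mount):
#   grow_command_for "$(findmnt -n -o FSTYPE /)" /dev/vg0/lv_root /
```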

7. Review Artifact Uploads

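If you’re uploading many or very large artifacts, ensure you’re only uploading what’s truly necessary. Bundling many files into a single archive (e.g., with tar -czf) reduces upload/download times and preserves file permissions. Keep in mind that the archive is an extra copy on the runner’s disk until the job ends, so delete the source directories after packaging if space is tight.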

      - name: Package artifacts
        run: |
          tar -czf my-app-artifacts.tar.gz ./dist ./reports
      
      - uses: actions/upload-artifact@v4
        with:
          name: app-build
          path: my-app-artifacts.tar.gz
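
If the packaged directories are not needed after archiving, removing them right away keeps only one copy of the data on disk. This is a minimal sketch; the function name package_and_clean is hypothetical, and the directory names follow the example above.

```shell
# Hypothetical helper: archive the given directories, then delete the
# originals so the data exists only once on the runner's disk.
package_and_clean() {
  # $1 = archive path, remaining arguments = directories to package
  archive="$1"; shift
  tar -czf "$archive" "$@" || return 1
  # Delete the sources only after tar has succeeded.
  rm -rf -- "$@"
  echo "created $archive ($(du -h "$archive" | cut -f1))"
}

# Example: package_and_clean my-app-artifacts.tar.gz ./dist ./reports
```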

8. Use Larger GitHub-Hosted Runners

As a last resort, or if you’re constrained by GitHub-hosted runner limits and unable to use self-hosted runners, GitHub offers larger runners with more disk space, at a higher per-minute cost. They are available on paid plans and are referenced in runs-on by the name or label assigned when the runner is set up in your organization’s settings. Check GitHub’s documentation for the currently available runner sizes.

jobs:
  build:
    runs-on: ubuntu-latest-xlarge # Hypothetical label; use the name you assigned to your larger runner
    # ... rest of your workflow