Troubleshooting Docker Container Exit Code 139: Segmentation Fault (SIGSEGV)
Resolve Docker containers failing with exit code 139 and segmentation faults. This guide covers causes such as memory issues and corrupted binaries, and provides step-by-step solutions for robust container operation.
When a Docker container unexpectedly halts and exits with code 139, it’s a strong indication of a critical error known as a Segmentation Fault (SIGSEGV). This means the application inside your container attempted to access a memory location it wasn’t authorized to, or tried to access memory in an invalid way. Unlike a clean exit, this is an abnormal termination that points to a deep-seated issue within the application, its environment, or the underlying system. Resolving this requires a systematic debugging approach to pinpoint the exact cause.
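The number 139 follows the common convention of reporting death by signal as 128 plus the signal number, and SIGSEGV is signal 11 on Linux. A minimal sketch of that arithmetic, where `decode_exit_code` is a hypothetical helper name and not a Docker command:

```shell
# Decode a container exit code into "128 + signal" form.
# decode_exit_code is an illustrative helper, not part of Docker.
decode_exit_code() {
    code="$1"
    if [ "$code" -gt 128 ] && [ "$code" -lt 193 ]; then
        sig=$((code - 128))
        # kill -l N prints the signal name without the SIG prefix
        name=$(kill -l "$sig" 2>/dev/null) || name="?"
        echo "exit $code = 128 + $sig (SIG$name)"
    else
        echo "exit $code (not a signal exit)"
    fi
}

decode_exit_code 139   # exit 139 = 128 + 11 (SIGSEGV)
decode_exit_code 137   # exit 137 = 128 + 9 (SIGKILL)
```

The same decoding helps separate a segfault (139) from an OOM kill (137) at a glance.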
Symptom & Error Signature
Users typically observe their container restarting continuously, failing to start, or suddenly disappearing from the list of running containers. The most direct symptom is seen when inspecting the container’s status or logs.
# Check the status of your containers
docker ps -a
You will likely see output similar to this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a1b2c3d4e5f6 my-app:latest "python app.py" 2 minutes ago Exited (139) 2 seconds ago my-python-app
Running docker logs may provide more context, but with a segmentation fault the application often crashes before it can write anything meaningful to standard output or standard error.
# View container logs
docker logs a1b2c3d4e5f6
Potential log output (can vary greatly, sometimes silent):
[INFO] Application starting...
Segmentation fault (core dumped)
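Or, if a C/C++ component detects heap corruption before an invalid access occurs, glibc may abort the process instead. Note that this variant terminates with SIGABRT (exit code 134) rather than 139, although it usually points to the same class of memory bug: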
*** Error in `/usr/local/bin/my-app`: free(): invalid pointer: 0x00007f8b2c000040 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x8444a)[0x7f8b2b73b44a]
/lib/x86_64-linux-gnu/libc.so.6(+0x8a3c8)[0x7f8b2b7413c8]
...
======= Memory map: ========
...
Aborted (core dumped)
On the Docker host system, checking dmesg can sometimes reveal kernel-level details about the segfault, including the process name and the faulting address.
# Check kernel messages for segfaults
dmesg | grep -i "segfault"
Example dmesg output:
[12345.678901] my-app[12345]: segfault at 7f8b2c000040 ip 00007f8b2b7413c8 sp 00007fffa7e25b10 error 4 in libc-2.35.so[7f8b2b6b7000+192000]
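Subtracting the mapping base (the first number in the trailing brackets) from the instruction pointer (`ip`) gives the offset of the faulting instruction inside the library. The sketch below does this for a line shaped like the example above; `segfault_offset` is a hypothetical helper, and the parsing assumes this exact message layout, which can differ between kernel versions.

```shell
# Extract "library+offset" from a kernel segfault line.
# Assumes the "... ip <hex> ... in <lib>[<base>+<size>]" layout shown above.
segfault_offset() {
    line="$1"
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    lib=$(printf '%s\n' "$line" | sed -n 's/.* in \([^[]*\)\[.*/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    # Offset of the faulting instruction relative to the start of the mapping
    printf '%s+0x%x\n' "$lib" "$((0x$ip - 0x$base))"
}

segfault_offset '[12345.678901] my-app[12345]: segfault at 7f8b2c000040 ip 00007f8b2b7413c8 sp 00007fffa7e25b10 error 4 in libc-2.35.so[7f8b2b6b7000+192000]'
# libc-2.35.so+0x8a3c8
```

If matching debug symbols are installed, an offset like this can be passed to a tool such as `addr2line -f -e <library> <offset>` to look up the function name.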
Root Cause Analysis
A segmentation fault (SIGSEGV) indicates a low-level memory access violation. Pinpointing the exact cause can be challenging as it often originates from complex interactions. Here are the most common underlying reasons:
- Software Bugs: This is the most frequent cause. The application itself, or a library it depends on, contains a programming error. Examples include:
- Dereferencing a null pointer.
- Accessing an array out of its bounds (buffer overflow/underflow).
- Using memory after it has been freed (use-after-free).
- Stack overflow due to excessive recursion or large local variables.
- Memory Exhaustion (OOM - Out of Memory): True OOM conditions typically end with Exited (137) (SIGKILL) because of the OOM killer, but an allocation failure immediately preceding a crash can also lead to a segfault. This happens when the application dereferences the null pointer returned by malloc (or similar) after an allocation request fails due to memory limits.
- This is especially relevant if the container is running with strict memory limits (--memory, --memory-swap).
- Corrupted Binaries or Libraries:
- The application binary or one of its dynamically linked libraries might be corrupted within the container image or on the host’s filesystem (if volumes are mounted). This could be due to a faulty build, download issues, or storage corruption.
- Incorrect Architecture or CPU Features:
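- The application binary might have been compiled for a different CPU architecture, or it might rely on specific CPU instruction sets (e.g., AVX, SSE) that are not available or enabled on the host machine. A foreign-architecture binary without emulation (e.g., ARM binary on an x86 host) typically fails to start with an exec format error, and an unsupported instruction usually raises SIGILL (exit code 132), so segfaults from this cause are most likely when the binary runs under emulation such as QEMU.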
- Missing or Incompatible Libraries:
- The dynamic linker might load an incompatible version of a required shared library, leading to undefined behavior and potential segfaults when the application calls functions from that library. A library that is missing altogether usually produces an explicit loader error at startup rather than a segfault.
- JVM/Runtime Issues:
- For applications running on Java Virtual Machines (JVMs) or other managed runtimes (e.g., Node.js, Python with native extensions), the segfault might occur within the runtime itself or in a native module/extension it uses, rather than directly in the application’s interpreted code.
- Host Hardware Issues:
- Less common but possible, faulty RAM on the Docker host can cause data corruption that manifests as segfaults in seemingly unrelated applications or containers.
Step-by-Step Resolution
Troubleshooting a code 139 error requires a systematic approach. Start with the easiest checks and progressively move to more complex debugging.
1. Review Container and Host System Logs
Begin by gathering as much information as possible from logs.
# Get the container ID from `docker ps -a`
CONTAINER_ID="a1b2c3d4e5f6"
# Check standard container logs
docker logs "${CONTAINER_ID}"
# Check for resource usage stats before the crash (if container runs briefly)
# Run `docker stats --no-stream` before or immediately after the crash if possible
# Or monitor it while trying to reproduce the issue: docker stats "${CONTAINER_ID}"
# Examine the Docker daemon logs
sudo journalctl -u docker.service -r --since "10 minutes ago"
# Check the kernel message buffer for segfaults
dmesg | grep -i "segfault\|error" | tail -n 20
[!IMPORTANT] The dmesg output is crucial. It often provides the exact process name, memory address, and sometimes the library (.so file) involved in the segfault. This can significantly narrow down your investigation.
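On a busy host the kernel log can contain segfault lines from many processes. The sketch below narrows the output to one process name; `segfaults_for` is a hypothetical helper, it does a plain substring match, and the sample lines are invented for illustration. Keep in mind that the kernel truncates the process name it reports to 15 characters.

```shell
# Keep only segfault lines for a given process name (substring match on "name[").
# Reads kernel log text on stdin, e.g.: dmesg | segfaults_for my-app
segfaults_for() {
    name="$1"
    grep -F "segfault at" | grep -F "${name}["
}

# Invented sample lines with the same shape as real kernel messages
sample='[12345.678901] my-app[12345]: segfault at 7f8b2c000040 ip 00007f8b2b7413c8 sp 00007fffa7e25b10 error 4 in libc-2.35.so[7f8b2b6b7000+192000]
[12350.000001] other-proc[222]: segfault at 0 ip 0000000000401136 sp 00007ffd1c2b3a10 error 4 in other-proc[401000+1000]'

printf '%s\n' "$sample" | segfaults_for my-app
```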
2. Verify Resource Limits
Insufficient memory can trigger a segfault even when the OOM killer is not directly involved.
# Check current memory limits for the container (if defined in run command or compose file)
# Example: docker run --rm --memory="512m" --memory-swap="1g" my-app:latest
# For docker-compose, check memory/mem_limit in your service definition.
# If limits are present, consider temporarily increasing them to rule out memory starvation.
# Example with increased memory:
# docker run --rm -it --memory="2g" --memory-swap="4g" my-app:latest /bin/bash
[!WARNING] Drastically increasing memory limits might mask an underlying memory leak in your application rather than fixing it. Use this as a diagnostic step, not as a permanent solution, and investigate further if the crash disappears.
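To see which limits a crashed container actually ran with, `docker inspect` exposes the configured memory limit in bytes under `.HostConfig.Memory` and an OOM flag under `.State.OOMKilled`. The sketch below only formats the byte value; `format_limit` is a hypothetical helper, and the Docker commands are shown as comments because they need a real container ID.

```shell
# Turn a byte count from `docker inspect` into a readable limit.
# A value of 0 (or a negative value for swap) means no limit was set.
format_limit() {
    bytes="$1"
    if [ "$bytes" -le 0 ]; then
        echo "unlimited"
    else
        echo "$((bytes / 1024 / 1024)) MiB"
    fi
}

# Against a real container (CONTAINER_ID is a placeholder):
#   mem=$(docker inspect --format '{{.HostConfig.Memory}}' "$CONTAINER_ID")
#   oom=$(docker inspect --format '{{.State.OOMKilled}}' "$CONTAINER_ID")
#   echo "limit: $(format_limit "$mem"), OOM-killed: $oom"

format_limit 536870912   # 512 MiB
format_limit 0           # unlimited
```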
3. Inspect Container Filesystem and Entrypoint
A corrupted or incompatible binary/library can trigger a segfault.
# Start a new container from the same image with an interactive shell
# Overriding the entrypoint lets you inspect the container's environment without running the problematic entrypoint
docker run --rm -it --entrypoint /bin/bash my-app:latest
# Inside the container:
# Check the application binary's dependencies
ldd /path/to/your/app/binary
# Check the architecture of the binary
file /path/to/your/app/binary
# Verify the integrity of shared libraries if you suspect corruption (e.g., re-install packages)
# Example for Debian/Ubuntu: compare installed files against the checksums recorded by dpkg
dpkg --verify libc6 # No output means the files match the package database
md5sum /lib/x86_64-linux-gnu/libc.so.6 # Or compare manually with a known good checksum
[!IMPORTANT] Ensure the architecture of the binary (file command output) matches your Docker host's architecture (docker info | grep Architecture). Mismatches, particularly when the binary runs under emulation, can cause code 139 errors.
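When a binary has many dependencies, the unresolved ones are easy to miss in the `ldd` listing. The sketch below pulls them out; `missing_libs` is a hypothetical helper, it relies on the `=> not found` marker that glibc's `ldd` prints, and the sample lines are invented.

```shell
# Print the names of shared libraries that ldd could not resolve.
# Reads ldd output on stdin, e.g.: ldd /path/to/your/app/binary | missing_libs
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Invented sample in the shape of glibc ldd output
printf '%s\n' \
    '	libfoo.so.1 => not found' \
    '	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8b2b6b7000)' \
    | missing_libs
# libfoo.so.1
```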
4. Reproduce and Debug in a Controlled Environment
If the issue is hard to pin down, try to reproduce it with debugging tools.
# Start a shell in the container instead of the normal entrypoint, then try to execute the problematic binary
docker run --rm -it --entrypoint /bin/bash my-app:latest
# Inside the container:
# If you have GDB (GNU Debugger) installed in your image:
# gdb /path/to/your/app/binary
# (gdb) run
# If GDB is not available, you might need to add it to your Dockerfile for debugging purposes:
# FROM my-base-image
# RUN apt update && apt install -y gdb strace
# COPY . /app
# WORKDIR /app
# CMD ["/bin/bash"] # Override entrypoint to manually debug
# Use strace to trace system calls if GDB is too heavy or you suspect syscall issues
# strace /path/to/your/app/binary
[!WARNING] Adding debugging tools like gdb or strace to your production image increases its size and attack surface. Only do this for debugging and ensure they are removed before deploying to production.
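Some segfaults are intermittent, so it helps to know how often a command crashes before attaching a debugger. The sketch below runs a command several times and counts exits with code 139; `count_crashes` is a hypothetical helper, and the demo command simply sends SIGSEGV to itself as a stand-in for a crashing binary.

```shell
# Run a command N times and count how many runs exit with 139 (SIGSEGV).
# Usage: count_crashes <runs> <command> [args...]
count_crashes() {
    runs="$1"
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes/$runs runs exited with 139"
}

# Stand-in for a crashing binary: a shell that sends SIGSEGV to itself
count_crashes 3 sh -c 'kill -SEGV $$'
# Inside the container you would pass the real binary instead:
#   count_crashes 20 /path/to/your/app/binary
```

A crash rate below 100% often hints at a race condition or input-dependent bug rather than a broken binary.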
5. Isolate and Test Application Outside Docker (if feasible)
If your application is simple enough, try running it directly on a compatible Linux host system to see if the issue persists. This helps rule out Docker-specific environment issues.
# On a Linux host with similar OS/libraries:
# apt install -y python3 # or other dependencies
# cd /path/to/your/app/source
# python3 app.py # or ./your_binary
6. Check for Base Image or Dependency Issues
Outdated or problematic base images/dependencies can introduce instability.
# Rebuild your Docker image, ensuring all packages are updated.
# In your Dockerfile:
# FROM some-base-image:latest
# RUN apt update && apt upgrade -y && apt clean # For Debian/Ubuntu based images
# If you use custom-built libraries, ensure they are compatible and built correctly.
[!IMPORTANT] Always try to use specific, tagged versions of base images (e.g., ubuntu:22.04) rather than latest for production to ensure reproducibility. Update base images periodically to get security patches and bug fixes.
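A quick way to spot unpinned base images is to scan the FROM lines of a Dockerfile. The sketch below is a rough lint under simple assumptions: `unpinned_bases` is a hypothetical helper, it treats a missing tag or `:latest` as unpinned, and it will also flag references to earlier build stages (for example `FROM builder`), which would need manual review.

```shell
# List base images in a Dockerfile that have no tag or use :latest.
# Reads the Dockerfile on stdin, e.g.: unpinned_bases < Dockerfile
unpinned_bases() {
    awk 'toupper($1) == "FROM" {
        img = $2
        if (img ~ /^--/) img = $3        # skip a flag such as --platform=...
        if (img == "scratch") next       # scratch has no tag
        if (img ~ /@sha256:/) next       # pinned by digest
        n = split(img, parts, "/")
        last = parts[n]                  # ignore a registry port like host:5000/
        if (last !~ /:/ || last ~ /:latest$/) print img
    }'
}

printf '%s\n' 'FROM ubuntu:22.04' 'FROM some-base-image:latest' 'FROM alpine' | unpinned_bases
# some-base-image:latest
# alpine
```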
7. Address Architecture/CPU Feature Mismatches
Ensure your image is built for the correct architecture. If you’re using a multi-architecture build environment (e.g., Docker Desktop on M1 Mac building for x86_64), ensure the target architecture matches the production host.
# On your Docker host, verify its architecture
docker info | grep Architecture
# When building, explicitly target the architecture if cross-compiling or using buildx
# Example for x86_64:
# docker buildx build --platform linux/amd64 -t my-app:latest .
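The architecture names reported by `uname -m` or `docker info` (such as x86_64 or aarch64) differ from the platform strings that `--platform` expects. The sketch below maps the common ones; `platform_for_arch` is a hypothetical helper and covers only a few architectures.

```shell
# Map a machine architecture name to a Docker platform string.
platform_for_arch() {
    case "$1" in
        x86_64|amd64)   echo "linux/amd64" ;;
        aarch64|arm64)  echo "linux/arm64" ;;
        armv7l)         echo "linux/arm/v7" ;;
        *)              echo "unknown" ;;
    esac
}

# Platform string for the machine running this script
platform_for_arch "$(uname -m)"

# To compare with what an image was built for (requires Docker):
#   docker image inspect --format '{{.Os}}/{{.Architecture}}' my-app:latest
```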
8. Consider Host Hardware or Kernel Issues
If all software-level checks fail, and multiple unrelated applications/containers exhibit similar code 139 errors, the issue might be with the host’s hardware (RAM) or kernel.
# On the Docker host:
# Run memory diagnostics (e.g., Memtest86+ from a bootable USB). This requires downtime.
# Ensure your host kernel is up-to-date
sudo apt update && sudo apt upgrade -y # For Ubuntu/Debian
sudo reboot # After kernel updates
[!WARNING] Host memory issues can be elusive and critical. If diagnosed, replace faulty RAM modules immediately. Kernel updates should always be performed in a controlled manner, ideally in a staging environment first.
