Docker Rootless Mode Security Hardening Checklist

Unless you have deliberately changed the default, your Docker daemon is running as root on the host right now, and most teams don’t realize a single container escape can hand an attacker full system access. Rootless Docker security isn’t a nice-to-have in 2024; it’s the baseline you should have set up the moment you moved past local development. This checklist covers every step from installation through verification, including the four gaps that almost every team misses on their first production rollout.

Why This Checklist Exists

The default Docker daemon runs as root. That means the Docker socket at /var/run/docker.sock is owned by root, and any process or container that can reach it effectively has root on the host. Container escape CVEs such as CVE-2019-5736 and CVE-2024-21626 (both in runc) are so damaging under this model because the runtime being escaped is itself running as root. The attack surface isn’t theoretical; it shows up regularly in real penetration tests.

Rootless mode flips the model. Docker Engine 20.10.0 (released December 2020) made rootless stable: no more --experimental flag, no more excuses. In rootless mode, the daemon runs as an unprivileged user. UID 0 inside the container maps to your unprivileged host UID via Linux user namespaces. A successful container escape lands you as that unprivileged user on the host, not as root. The blast radius shrinks dramatically.

I’ve helped five different teams migrate to rootless over the past two years. Every single one had the same gaps. That’s why I wrote this checklist — not as a theoretical exercise, but as a reflection of what actually breaks in production and what actually gets missed in code review. If you’re building CI/CD pipelines on top of Docker, you’ll also want to check out the DevOps automation patterns on kuryzhev.cloud for complementary hardening techniques.

One more thing before we get into it: rootless mode does not protect against kernel-level exploits. It reduces privilege escalation risk, but you still need seccomp profiles, AppArmor or SELinux, and regular image scanning. This checklist addresses all of those layers. On the kernel side, the practical minimum is 4.18, which is what the fuse-overlayfs fallback requires; I’d recommend 5.11+ for native overlay2 support without that fallback.
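The kernel thresholds above are easy to check mechanically before you start. The sketch below is one way to do it in plain shell; the function names kernel_at_least and classify_kernel and the three labels it prints are illustrative choices, not part of Docker’s tooling.

```shell
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds when RELEASE (e.g. "5.15.0-91-generic") is at least MAJOR.MINOR.
# Returns 2 if RELEASE cannot be parsed.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3" major minor
  major="${release%%.*}"        # text before the first dot
  minor="${release#*.}"         # drop "MAJOR."
  minor="${minor%%[!0-9]*}"     # keep only the leading digits of the minor part
  case "$major" in ''|*[!0-9]*) return 2 ;; esac
  case "$minor" in ''|*[!0-9]*) return 2 ;; esac
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# classify_kernel [RELEASE]
# Prints which storage path the thresholds in this article point to.
# Defaults to the running kernel when no argument is given.
classify_kernel() {
  local release="${1:-$(uname -r)}"
  if kernel_at_least "$release" 5 11; then
    echo "native-overlay2"
  elif kernel_at_least "$release" 4 18; then
    echo "fuse-overlayfs"
  else
    echo "unsupported"
  fi
}
```

Calling classify_kernel with no argument inspects the running kernel; passing a release string makes it usable against inventory data collected from other hosts.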

The Rootless Docker Security Checklist

Each item below has context. Don’t skim past the explanations — the “why” is what makes the difference between a setup that holds under pressure and one that quietly regresses after a system update.

Install the uidmap package. On Debian/Ubuntu: sudo apt install uidmap. On RHEL/Fedora: sudo dnf install shadow-utils. This provides newuidmap and newgidmap, the SUID helpers that allow the user namespace mappings. Without this, the setup script exits immediately.
Populate /etc/subuid and /etc/subgid. Each user running rootless Docker needs at least 65536 subordinate IDs. The entry looks like alice:100000:65536. Many Ubuntu installs add this automatically on user creation, but verify before assuming it’s there: grep "^$USER:" /etc/subuid, and the same for /etc/subgid. Anchoring the pattern avoids matching another user whose name merely contains yours.
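Install docker-ce-rootless-extras. This package ships the setup script as /usr/bin/dockerd-rootless-setuptool.sh (it only lands in $HOME/bin if you used the get.docker.com/rootless installer instead of distribution packages). Don’t try to run rootless without it; the manual path is fragile and version-sensitive.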
Run the setup script as the target user. Execute dockerd-rootless-setuptool.sh install. This creates the systemd user unit and a CLI context named rootless, then prints the environment variables you need to export. Run it as the unprivileged user, not as root.
Enable and start the systemd user service. systemctl --user enable docker && systemctl --user start docker. Then verify with systemctl --user status docker. If it fails here, check journalctl --user -u docker — the error messages are actually useful.
Enable loginctl lingering. sudo loginctl enable-linger $USER. Without this, the Docker daemon dies when the user session ends. In my experience, most first-time setups miss this. You won’t notice during development. You’ll notice at 3am when your CI runner logs out and all containers stop.
Switch the Docker context. docker context use rootless. If you skip this, your Docker CLI keeps talking to the root daemon at /var/run/docker.sock. Both daemons can coexist silently, which is exactly the kind of confusion that leads to security incidents.
Set DOCKER_HOST explicitly in CI. Export DOCKER_HOST=unix://${XDG_RUNTIME_DIR}/docker.sock in every CI job that uses Docker. The rootless socket lives under the user’s runtime directory at /run/user/<uid>/docker.sock (for example /run/user/1000/docker.sock for UID 1000), not at the system socket path. Hardcoding the old path in CI environment variables silently bypasses rootless entirely.
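Enable cgroup v2 delegation. Create /etc/systemd/system/user@.service.d/delegate.conf containing a [Service] section with Delegate=cpu cpuset io memory pids, then run sudo systemctl daemon-reload. Without this, systemd typically delegates only the memory and pids controllers to user sessions, so --cpus and other cpu, cpuset, and io limits are not enforced. The container starts. No hard error. No limit. You find out when a runaway process starves the host.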
Set net.ipv4.ip_unprivileged_port_start=0 in sysctl. Add it to /etc/sysctl.d/99-rootless-docker.conf and run sudo sysctl --system. Without this, rootless containers can’t bind ports below 1024, and your nginx container on port 80 fails with bind: permission denied. Watch out for this one: it works fine in testing if you use port 8080, then breaks in production when someone changes the port mapping.
Set the storage driver to overlay2. On kernels 5.11+, this works natively. On older kernels, install fuse-overlayfs 1.10+ as the fallback. Verify with docker info | grep "Storage Driver".
Use slirp4netns 1.1.12+ or pasta for networking. slirp4netns is the default and works well. Pasta (from the passt project) offers better performance — roughly 5% overhead versus 15-20% with slirp4netns for high-throughput workloads. If you’re running latency-sensitive microservices, this matters.
Apply no-new-privileges per container. Add --security-opt no-new-privileges:true to every container run command or Compose service definition. Rootless mode does NOT apply this automatically. SUID binaries inside the container can still escalate within the user namespace without this flag.
Drop all capabilities and re-add only what’s needed. Start with cap_drop: [ALL] in your Compose file, then add back only the specific capabilities the application requires. Audit this list quarterly — capability creep is real.
Set read_only: true on the root filesystem. This catches applications writing to unexpected locations and forces you to declare tmpfs mounts explicitly. It’s a forcing function for better application design, not just a security control.
Specify an explicit seccomp profile. Don’t rely on Docker’s built-in default. Use the Docker default seccomp profile as a starting point, then restrict it further for your specific workload. Reference it explicitly in your Compose file so it’s version-controlled and auditable.
Run Trivy 0.50+ against your images. Use trivy image --scanners vuln,misconfig --exit-code 1 myimage:latest to catch misconfigurations alongside CVEs (the older --security-checks flag was deprecated in favor of --scanners). With --exit-code 1, any finding fails your CI pipeline hard instead of just logging a warning nobody reads.
Verify no root processes after deployment. Run docker exec <container_id> id and confirm it does not return uid=0(root). Also check docker info --format '{{.SecurityOptions}}' and confirm name=rootless appears in the output.
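The subordinate ID requirement in the checklist is easy to get subtly wrong, because a plain grep for the user name also matches entries that are too small or belong to a similarly named user. A minimal sketch of a stricter check follows; the function name has_subid_range and its default of 65536 are illustrative, taken from the minimum stated above.

```shell
# has_subid_range USER FILE [MIN_COUNT]
# Succeeds when FILE (in /etc/subuid format: name:start:count) contains an
# entry for exactly USER whose count is at least MIN_COUNT (default 65536).
# Returns 2 if FILE is not readable, 1 if no suitable entry exists.
has_subid_range() {
  local user="$1" file="$2" min="${3:-65536}"
  [ -r "$file" ] || return 2
  awk -F: -v u="$user" -v min="$min" '
    $1 == u && ($3 + 0) >= (min + 0) { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$file"
}
```

A typical call would be has_subid_range "$USER" /etc/subuid followed by the same call against /etc/subgid, failing the setup early if either returns non-zero.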

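Here’s the full automated setup script. It handles the host-side items above (packages, subordinate IDs, cgroup delegation, the port sysctl, the setup tool, the user service, lingering, and the CLI context) idempotently. Run it as the target unprivileged user on Ubuntu 22.04+: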

#!/usr/bin/env bash
# rootless-docker-setup.sh
# Automates rootless Docker installation and hardening on Ubuntu 22.04+
# Tested with Docker Engine 25.0.x and uidmap 1:4.13+
# Run as the TARGET unprivileged user, NOT as root

set -euo pipefail

DOCKER_USER="${USER}"
SUBUID_START=100000
SUBUID_COUNT=65536

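echo "==> [1/7] Checking kernel version..."
# Kernel 4.18+ required; 5.11+ preferred for native overlay2
KERNEL_VER=$(uname -r | cut -d. -f1,2)
echo "    Kernel version: ${KERNEL_VER}"
# sort -V orders version strings; if 4.18 does not sort first, the kernel is older
if [ "$(printf '%s\n' 4.18 "${KERNEL_VER}" | sort -V | head -n1)" != "4.18" ]; then
  echo "    [FAIL] Kernel ${KERNEL_VER} is older than 4.18" >&2
  exit 1
fi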

echo "==> [2/7] Installing required packages (requires sudo)..."
sudo apt-get update -qq
sudo apt-get install -y uidmap dbus-user-session fuse-overlayfs slirp4netns

echo "==> [3/7] Configuring /etc/subuid and /etc/subgid..."
# Idempotent: only add if entry doesn't exist
if ! grep -q "^${DOCKER_USER}:" /etc/subuid; then
  echo "${DOCKER_USER}:${SUBUID_START}:${SUBUID_COUNT}" | sudo tee -a /etc/subuid
  echo "    Added subuid entry for ${DOCKER_USER}"
else
  echo "    subuid entry already exists, skipping"
fi

if ! grep -q "^${DOCKER_USER}:" /etc/subgid; then
  echo "${DOCKER_USER}:${SUBUID_START}:${SUBUID_COUNT}" | sudo tee -a /etc/subgid
  echo "    Added subgid entry for ${DOCKER_USER}"
fi

echo "==> [4/7] Enabling cgroup v2 delegation..."
# Without this, cpu/cpuset/io limits (e.g. --cpus) are not enforced for rootless containers
sudo mkdir -p /etc/systemd/system/user@.service.d/
sudo tee /etc/systemd/system/user@.service.d/delegate.conf > /dev/null <<EOF
[Service]
Delegate=cpu cpuset io memory pids
EOF
sudo systemctl daemon-reload

echo "==> [5/7] Enabling unprivileged port binding (for ports < 1024)..."
# Allows rootless nginx/httpd containers to bind port 80/443
echo "net.ipv4.ip_unprivileged_port_start=0" | sudo tee /etc/sysctl.d/99-rootless-docker.conf
sudo sysctl --system -q

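echo "==> [6/7] Running rootless Docker setup tool..."
# Creates the systemd user unit and the "rootless" CLI context.
# The script must be on PATH (docker-ce-rootless-extras installs it to /usr/bin).
# FORCE_ROOTLESS_INSTALL is read by the get.docker.com/rootless installer, not by
# this tool; if a rootful daemon is running, the tool aborts unless given --force.
dockerd-rootless-setuptool.sh install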

echo "==> [7/7] Enabling systemd user service and lingering..."
# loginctl enable-linger ensures daemon survives logout
systemctl --user enable docker
systemctl --user start docker
sudo loginctl enable-linger "${DOCKER_USER}"

echo ""
echo "==> Setting Docker context to rootless..."
docker context use rootless

echo ""
echo "==> Verifying rootless mode is active..."
# Should output: name=rootless among security options
if docker info --format '{{.SecurityOptions}}' | grep -q "rootless"; then
  echo "    [PASS] Rootless mode confirmed"
else
  echo "    [FAIL] Rootless mode NOT detected; check daemon logs" >&2
  exit 1
fi

echo ""
echo "==> Setup complete. Socket: ${XDG_RUNTIME_DIR}/docker.sock"
echo "    Add to your shell profile: export DOCKER_HOST=unix://${XDG_RUNTIME_DIR}/docker.sock"

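And here’s the hardened Compose service definition that implements the container-level items: no-new-privileges, dropped capabilities, a read-only root filesystem, and an explicit seccomp profile. This is the template we use as a baseline for every new service: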

# docker-compose.hardened-rootless.yml
# Example hardened service definition for rootless Docker environment
# Docker Compose v2.24+ syntax
# All security options mandatory — do NOT remove for "convenience"

services:
  webapp:
    image: nginx:1.25-alpine
    # Never use 'latest' — pin digest in production:
    # image: nginx@sha256:...

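    # Assumes a custom nginx config that listens on 8080 and serves /health;
    # the stock image listens on 80 and has no such endpoint.
    ports:
      - "8080:8080"  # Map to unprivileged port; rootless can't bind 80 without sysctl change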

    security_opt:
      - no-new-privileges:true      # Prevents privilege escalation via SUID binaries
      - seccomp:./seccomp-default.json  # Explicit seccomp profile; don't rely on Docker default

    cap_drop:
      - ALL                          # Drop ALL capabilities first
    cap_add:
      - NET_BIND_SERVICE             # Re-add only what's needed; audit this list quarterly

    read_only: true                  # Root filesystem read-only; catches misconfigured apps

    tmpfs:
      - /tmp:size=64m,mode=1777      # Writable tmp with size cap to prevent disk exhaustion
      - /var/cache/nginx:size=32m
      - /var/run:size=8m

    user: "101:101"                  # Run as nginx user (non-root inside container)

    environment:
      - NGINX_ENTRYPOINT_QUIET_LOGS=1

    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3

    deploy:
      resources:
        limits:
          cpus: "0.50"
          memory: 128M              # Requires cgroup v2 delegation to actually enforce
        reservations:
          memory: 64M

# Verify enforcement after deploy:
# docker inspect <container_id> | jq '.[0].HostConfig.SecurityOpt'
# docker exec <container_id> id   # Must NOT return uid=0(root)

Commonly Missed Items

These are the gaps that pass initial review, get merged, and then surface as incidents. I’ve seen all four in production environments with experienced teams.

The Docker context problem. Teams enable rootless mode on their runner or host, verify it works manually, and then discover six months later that their CI pipeline has been talking to the root daemon the entire time. The cause is always the same: DOCKER_HOST=unix:///var/run/docker.sock hardcoded somewhere, whether in a CI environment variable, a Makefile, or a shell alias in a shared profile. The rootless socket lives at /run/user/<uid>/docker.sock (/run/user/1000/docker.sock for UID 1000). If your tooling doesn’t know that, it keeps using the system socket without complaint. Run docker context ls and confirm the asterisk is on the rootless context, not the default one.
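Hunting for the hardcoded socket by hand is tedious, so it helps to script it. The following is a rough sketch rather than a complete audit: the three function names are made up for illustration, and the search only catches the literal unix:// forms of the system socket path.

```shell
# rootless_socket_url [RUNTIME_DIR]
# Prints the DOCKER_HOST value for the rootless daemon. Falls back to
# XDG_RUNTIME_DIR, then to /run/user/<uid> for the current user.
rootless_socket_url() {
  local runtime_dir="${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}}"
  echo "unix://${runtime_dir}/docker.sock"
}

# docker_host_is_rootful VALUE
# Succeeds when VALUE points at the system (root-owned) Docker socket.
docker_host_is_rootful() {
  case "$1" in
    unix:///var/run/docker.sock|unix:///run/docker.sock) return 0 ;;
    *) return 1 ;;
  esac
}

# find_hardcoded_rootful_socket DIR
# Lists files under DIR that mention the system socket URL.
# Exit status is non-zero when nothing is found (grep semantics).
find_hardcoded_rootful_socket() {
  grep -rlE 'unix://(/var)?/run/docker\.sock' "$1" 2>/dev/null
}
```

In a CI job, docker_host_is_rootful "${DOCKER_HOST:-}" can serve as a cheap early abort, and find_hardcoded_rootful_socket can be pointed at the repository root. Note that an empty DOCKER_HOST is not flagged here, because in that case the CLI follows the active context, which still needs the docker context ls check described above.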

Cgroup v2 delegation. This is the most dangerous miss because it usually fails quietly. You set --cpus 0.5 on a container. The container starts. At most you get a warning. But without the delegation config in place, the cpu controller is never handed to your user session, so the limit is never enforced. On a systemd host with cgroup v2, memory and pids are normally delegated to users by default; cpu, cpuset, and io are not. And if docker info reports Cgroup Driver: none, all cgroup-related flags are ignored. Verify delegation by checking cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers; you need to see cpu, memory, and io in that output. If the file doesn’t exist or the controllers are missing, the corresponding resource limits are decorative. When a broken cgroup setup does surface as a hard failure at container start, the error looks like this: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting cgroup. Nothing in that message points at cgroup delegation as the cause.
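The controller check can be scripted so that it reports exactly which controllers are missing. This is a sketch under the assumption that you pass it the cgroup.controllers file for your user session; the function name missing_controllers is illustrative.

```shell
# missing_controllers FILE REQUIRED...
# Reads a cgroup.controllers file (space-separated controller names) and
# prints the REQUIRED controllers that are absent from it.
# Returns 0 if none are missing, 1 if some are, 2 if FILE is unreadable.
missing_controllers() {
  local file="$1" have missing="" c
  shift
  [ -r "$file" ] || return 2
  # Pad with spaces so whole names are matched: "cpuset" must not satisfy "cpu".
  have=" $(tr '\n' ' ' < "$file") "
  for c in "$@"; do
    case "$have" in
      *" $c "*) ;;
      *) missing="${missing:+$missing }$c" ;;
    esac
  done
  [ -z "$missing" ] && return 0
  echo "$missing"
  return 1
}
```

For example, missing_controllers "/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers" cpu cpuset io memory pids prints nothing and succeeds on a correctly delegated host. The whole-word matching matters because cpuset is almost always present even when cpu is not.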

Unprivileged port binding. Watch out for this one in environments where services need to bind ports below 1024. The symptom is clear — bind: permission denied when your nginx container tries to start on port 80 — but the timing is terrible. It works fine in staging because someone tested with port 8080. It breaks in production at 2am during a deployment. Set net.ipv4.ip_unprivileged_port_start=0 in sysctl and make it persistent. Don’t work around it by mapping external port 80 to internal 8080 — that introduces unnecessary complexity and breaks applications that rely on the Host header including the port.
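To catch this before a 2am deployment, compare the host ports you intend to publish against the kernel’s current threshold. The sketch below assumes you supply the list of host ports yourself (it does not parse Compose files), and the function names are illustrative.

```shell
# unprivileged_port_start
# Prints the kernel's current threshold; ports at or above it can be bound
# without privileges. Falls back to the kernel default of 1024 if unreadable.
unprivileged_port_start() {
  cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024
}

# blocked_ports START PORT...
# Prints the PORTs that are below START, i.e. the ones a rootless daemon
# could not bind. Succeeds only when no port is blocked.
blocked_ports() {
  local start="$1" blocked="" p
  shift
  for p in "$@"; do
    if [ "$p" -lt "$start" ]; then
      blocked="${blocked:+$blocked }$p"
    fi
  done
  echo "$blocked"
  [ -z "$blocked" ]
}
```

A pre-deploy step could run blocked_ports "$(unprivileged_port_start)" 80 443 and abort with a clear message if anything is printed, instead of waiting for the permission error at container start.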

The linger problem. I stopped trusting rootless setups that don’t explicitly verify loginctl linger after an incident where a CI runner’s Docker daemon disappeared mid-build following a session timeout. loginctl show-user $USER | grep Linger should return Linger=yes. If it doesn’t, the daemon dies when the SSH session or PAM session ends. This is a one-line fix that teams consistently skip because it works fine during initial setup when someone is logged in.
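The linger verification is small enough to wrap in a function and call from provisioning or CI. A minimal sketch, with illustrative function names; the parsing half reads loginctl output on stdin so it can be exercised without a systemd session.

```shell
# linger_enabled
# Reads `loginctl show-user` output on stdin and succeeds only if it
# contains the exact line "Linger=yes".
linger_enabled() {
  grep -qx 'Linger=yes'
}

# check_linger [USER]
# Queries loginctl for USER (default: current user) and reports the result.
check_linger() {
  local user="${1:-$(id -un)}"
  if loginctl show-user "$user" --property=Linger 2>/dev/null | linger_enabled; then
    echo "linger enabled for $user"
  else
    echo "linger NOT enabled for $user; run: sudo loginctl enable-linger $user" >&2
    return 1
  fi
}
```

Running check_linger as the last step of host provisioning turns the missing one-line fix into a visible failure while someone is still looking at the output.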

Automation Ideas

A checklist you run manually once is not a security control. It’s a starting point. Here’s how to codify rootless Docker security into your infrastructure so it can’t be accidentally skipped.

Ansible role for idempotent setup. The setup script above is a good start, but wrap it in an Ansible role for fleet management. Key tasks: assert kernel version, install uidmap via package module, populate subuid/subgid with lineinfile (idempotent), copy the delegate.conf template, run dockerd-rootless-setuptool.sh install with creates: ~/.config/systemd/user/docker.service so it’s skipped if already done, and run loginctl enable-linger via command module. The role should be idempotent end-to-end — running it twice on the same host should produce zero changes the second time.

CI gate on rootless verification. Before any deployment job runs, add a gate step that checks the Docker security options. This is a one-liner that fails the pipeline if rootless isn’t active:

# Add as first step in any CI job that uses Docker
# Fails immediately if rootless mode is not active on the runner
docker info --format '{{.SecurityOptions}}' | grep -q "name=rootless" \
  || { echo "ERROR: Docker is not running in rootless mode. Aborting deployment."; exit 1; }

OPA/Conftest policy enforcement. Write a Conftest policy that rejects Compose files or Kubernetes manifests requesting privileged: true, hostPID: true, or hostNetwork: true. Enforce it as a pre-commit hook using Conftest so violations are caught before they reach the pipeline. Combine this with the Trivy config scan in your CI to catch misconfigurations that slip past the pre-commit hook.

Scheduled drift detection. Cron a weekly check across all your Docker hosts that verifies rootless is still active, linger is still enabled, and cgroup delegation is still configured. Infrastructure drift is real — system updates, package upgrades, and well-meaning manual changes can all quietly undo your rootless setup. A five-line shell script running from cron and alerting to Slack is more reliable than assuming nothing changed.
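As a starting point for that cron job, here is a sketch of a drift check. Several things in it are assumptions: the SLACK_WEBHOOK_URL variable name, the function names, and the choice of the cpu controller as the delegation signal. It also assumes the cron environment can reach the rootless daemon, which usually means setting DOCKER_HOST in the crontab.

```shell
# Names of failed checks, comma-separated; reset by drift_main.
FAILED=""

# run_check NAME COMMAND [ARGS...]
# Runs COMMAND quietly, prints PASS/FAIL, and records failures in FAILED.
run_check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[PASS] $name"
  else
    echo "[FAIL] $name"
    FAILED="${FAILED:+$FAILED, }$name"
  fi
}

# notify MESSAGE
# Posts to a Slack incoming webhook when SLACK_WEBHOOK_URL (assumed name)
# is set; otherwise writes the message to stderr.
notify() {
  if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
    curl -fsS -X POST -H 'Content-type: application/json' \
      --data "{\"text\": \"$1\"}" "$SLACK_WEBHOOK_URL"
  else
    echo "$1" >&2
  fi
}

# drift_main
# Runs the three checks named in the text and alerts if any of them fail.
drift_main() {
  local uid user
  uid="$(id -u)"
  user="$(id -un)"
  FAILED=""
  run_check "rootless active" sh -c \
    "docker info --format '{{.SecurityOptions}}' | grep -q name=rootless"
  run_check "linger enabled" sh -c \
    "loginctl show-user '$user' --property=Linger | grep -qx Linger=yes"
  run_check "cpu controller delegated" grep -qw cpu \
    "/sys/fs/cgroup/user.slice/user-${uid}.slice/user@${uid}.service/cgroup.controllers"
  [ -z "$FAILED" ] && return 0
  notify "Rootless drift on $(hostname): $FAILED"
  return 1
}
```

Save the functions in a file, add a final line that calls drift_main, and schedule it weekly as the unprivileged Docker user. Keeping each check as a separate run_check call means the alert names exactly what drifted rather than just reporting that something did.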

Rootless Docker security is one of those controls that’s easy to set up correctly the first time and easy to silently break over time. The checklist handles the setup. The automation handles the maintenance. Both are necessary.
