Vulnerability Analysis

CVE-2026-31431 (Copy Fail): Linux Kernel Privilege Escalation — What It Is & How to Fix It

Executive Summary

CVE-2026-31431, publicly nicknamed "Copy Fail", is a high-severity local privilege escalation (LPE) vulnerability in the Linux kernel's userspace cryptographic API subsystem (algif_aead), disclosed on April 29, 2026. Any unprivileged local user — or container process — can exploit a deterministic 4-byte page-cache write to corrupt a setuid binary in memory and obtain a root shell in seconds, without any race condition or heap spray. Patched kernels began rolling out from major vendors on May 1, 2026; systems that cannot be patched immediately must apply the algif_aead module workaround described below.


1. What Is This Vulnerability?

Technical Background

The Linux kernel exposes its in-kernel cryptographic primitives to userspace through the AF_ALG socket interface (CONFIG_CRYPTO_USER_API_AEAD). Since 2017, the algif_aead implementation added an in-place optimization (commit 72548b093ee3) to avoid double-buffering AEAD (Authenticated Encryption with Associated Data) operations: rather than allocating a fresh output scatterlist, it reused the source scatterlist with req->src = req->dst and chained the tag pages by reference using sg_chain().

The root flaw involves the authencesn(hmac(sha256),cbc(aes)) algorithm (AEAD with Extended Sequence Numbers, commonly used in IPsec). During decryption, this algorithm writes 4 bytes at offset assoclen + cryptlen as scratch space for Extended Sequence Number (seqno_lo) rearrangement. Because of the in-place reuse, the output scatterlist inadvertently extends into chained page-cache pages that were spliced in by the attacker via os.splice(). The 4-byte write therefore lands inside the file's kernel page cache — corrupting it in memory — without modifying the on-disk file and without triggering any permission check.

An attacker can weaponize this to overwrite 4 bytes inside the in-memory cached image of any readable setuid binary (e.g., /usr/bin/su, /usr/bin/sudo, /usr/bin/newgrp), redirecting execution flow to shellcode or a NOP sled before the setresuid(0,0,0) syscall, yielding a root shell.

Attack Vector

  1. Open an AF_ALG socket bound to authencesn(hmac(sha256),cbc(aes)).
  2. Use os.splice() to feed bytes from the target setuid binary's page cache into the AEAD decryption operation as the ciphertext scatterlist.
  3. Send exactly 8 bytes of associated data via sendmsg() to control the seqno_lo field.
  4. The in-place write overwrites 4 bytes at the computed offset inside the page cache.
  5. Execute the now-corrupted setuid binary — it runs as root.

The entire exploit fits in ~732 bytes of Python and completes in under 2 seconds. It is deterministic and does not depend on timing, heap layout, or kernel ASLR.

# Simplified conceptual sketch (NOT a working exploit)
import socket

# 1. Bind AF_ALG socket to vulnerable AEAD algorithm
# (Python's AF_ALG bind() takes a (type, name) tuple, not a dict)
sock = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
sock.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
# Placeholder key only: a real authenc key blob has its own layout, omitted here,
# so the kernel may reject this call
sock.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, b"\x00" * 48)
sock.setsockopt(socket.SOL_ALG, socket.ALG_SET_AEAD_AUTHSIZE, None, 16)

op_sock, _ = sock.accept()

# 2. Splice victim file's page cache into the AEAD source scatterlist
# 3. sendmsg with 8 bytes AAD triggers the 4-byte scratch write into page cache
# 4. The on-disk file is untouched; only the in-memory cached bytes are altered

Real-World Impact

  • Publicly disclosed April 29, 2026; public PoC exploit code appeared on GitHub within hours.
  • CERT-EU and the Canadian Centre for Cyber Security (CCCS) issued advisories the same day.
  • Microsoft Security Blog (May 1, 2026) confirmed the vulnerability is exploitable across major cloud environments including Azure, AWS, and GCP Linux VMs.
  • University of Toronto's security team noted it also enables container escape in Kubernetes environments where the pod runs without seccomp restrictions.
  • A 732-byte PoC achieves root on Ubuntu 24.04, Amazon Linux 2023, RHEL 10.1, and SUSE 16.

2. Who Is Affected?

Distribution | Affected Versions | Patch Status (as of 2026-05-04)
Ubuntu | 20.04, 22.04, 24.04 LTS | Patched (USN available)
Red Hat / RHEL | RHEL 8.x, 9.x, 10.1 | Patched (RHSA released)
Amazon Linux | AL2, AL2023 | Patched
SUSE / openSUSE | SUSE 15, SUSE 16 | Patched
Debian | Bullseye, Bookworm, Trixie | Patched
AlmaLinux | 8.x, 9.x | Patched (announced May 1)
Fedora | 40, 41, 42 | Patched
Arch Linux | Rolling | Patched
Ubuntu | 26.04+ (Resolute) | Not affected (ships kernel without the 2017 commit)

Every kernel that includes commit 72548b093ee3 (merged in 2017) and predates the April 2026 fix is vulnerable. The vulnerability cannot be exploited remotely: it requires local code execution or an already-breached container.

High-risk environments:

  • Multi-tenant systems (shared hosting, VPS providers)
  • Kubernetes nodes where pods run without hardened seccomp profiles
  • CI/CD runners accepting third-party job definitions
  • Developer workstations with untrusted local users

3. How to Detect It (Testing)

Manual Testing Steps

Step 1 — Check if AF_ALG is available

python3 -c "
import socket
try:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    print('AF_ALG available — system may be vulnerable')
    s.close()
except OSError as e:
    print(f'AF_ALG not available: {e}')
"

Step 2 — Check if the specific AEAD algorithm is reachable

python3 -c "
import socket
try:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
except OSError as e:
    raise SystemExit(f'AF_ALG not available: {e}')
try:
    # Python's AF_ALG bind() takes a (type, name) tuple
    s.bind(('aead', 'authencesn(hmac(sha256),cbc(aes))'))
    print('EXPOSED: authencesn(hmac(sha256),cbc(aes)) is reachable; vulnerable unless the kernel is patched')
except OSError as e:
    print(f'Algorithm not reachable (module absent or blocked): {e}')
finally:
    s.close()
"

Step 3 — Check kernel version and commit presence

uname -r
# Any kernel >= 4.10 (2017) and < patched vendor version is suspect

# Check if algif_aead module is loaded
lsmod | grep algif_aead
# If output is empty, module not loaded (but may still be loadable)

# Check if module can be loaded
modprobe --dry-run algif_aead 2>&1
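
The version comparison in Step 3 can be scripted. The following is a minimal sketch, assuming Debian/Ubuntu-style release strings such as 6.8.0-60-generic; the helper names parse_release and is_at_least are hypothetical, and the fixed version must be taken from your vendor advisory (none is hard-coded here). RHEL-style strings carry extra fields after the first ABI number that this sketch ignores.

```python
import re


def parse_release(release: str) -> tuple:
    """Extract the leading numeric fields from a `uname -r` string.

    "6.8.0-60-generic" becomes (6, 8, 0, 60). Missing fields default to 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def is_at_least(release: str, fixed: str) -> bool:
    """True if `release` is the same as or newer than `fixed`.

    Only meaningful when both strings come from the same distro kernel series.
    """
    return parse_release(release) >= parse_release(fixed)
```

A call such as is_at_least(platform.release(), "<fixed version from your advisory>") then gives a yes/no answer for the running kernel; comparing across different kernel series (for example 5.15 against 6.8) is not meaningful because vendors backport fixes.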

Step 4 — Check if module load is blocked (workaround applied)

cat /etc/modprobe.d/disable-algif.conf 2>/dev/null
# Should show: install algif_aead /bin/false
# If file absent, workaround is NOT applied
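
The workaround may live in a differently named file, so checking a single path can miss it. Below is a hedged sketch that scans the usual modprobe.d directories for install lines that neutralise a module; the function names are hypothetical and the directory list is an assumption based on common distro layouts.

```python
from pathlib import Path


def blocked_modules(conf_text: str) -> set:
    """Return module names whose `install` command is /bin/false or /bin/true."""
    blocked = set()
    for raw in conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 3 and fields[0] == "install" and fields[2] in ("/bin/false", "/bin/true"):
            # modprobe treats '-' and '_' in module names as equivalent
            blocked.add(fields[1].replace("-", "_"))
    return blocked


def scan_modprobe_dirs(dirs=("/etc/modprobe.d", "/run/modprobe.d",
                             "/usr/lib/modprobe.d", "/lib/modprobe.d")) -> set:
    """Collect blocked module names from every *.conf file in the given directories."""
    found = set()
    for directory in dirs:
        path = Path(directory)
        if not path.is_dir():
            continue
        for conf in sorted(path.glob("*.conf")):
            try:
                found |= blocked_modules(conf.read_text(errors="replace"))
            except OSError:
                continue  # unreadable file: skip rather than fail the scan
    return found
```

If "algif_aead" is in the set returned by scan_modprobe_dirs(), the module-load block is configured; this says nothing about whether the module is already loaded or compiled into the kernel.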

Automated Scanning

Using Trivy (container/VM scanning):

# Scan a container image (containers share the host kernel, so scan the host as well)
trivy image --scanners vuln --ignore-unfixed <image>

# Scan host
trivy rootfs --scanners vuln /

# Look for CVE-2026-31431 in output
trivy rootfs --scanners vuln / 2>&1 | grep CVE-2026-31431

Using Qualys/Tenable:

  • QID / Plugin IDs for CVE-2026-31431 were released in late April / early May 2026.
  • Run an authenticated Linux kernel scan; filter results for CVE-2026-31431.

Using Lynis (host hardening auditor):

lynis audit system
# Review kernel-related findings

Using grype (for container images):

grype <image>:latest | grep CVE-2026-31431

Code Review Checklist

When auditing kernel-adjacent or container configurations:

  • Confirm seccomp profiles block AF_ALG socket creation (socket(AF_ALG, ...))
  • Verify container runtimes (Docker, containerd, podman) are using default seccomp profiles
  • Check that unprivileged userns is properly restricted: sysctl kernel.unprivileged_userns_clone (Debian/Ubuntu kernels; elsewhere check user.max_user_namespaces)
  • Confirm kernel version is on or above the patched baseline for your distro
  • Review any CI/CD runner containers for absence of --privileged or --security-opt seccomp=unconfined
  • Audit Kubernetes Pod Security Admission settings (PodSecurityPolicy on clusters older than 1.25) and each SecurityContext for seccompProfile assignment

4. How to Fix It (Mitigation)

Step-by-Step Remediation

Primary Fix: Patch the Kernel

Ubuntu:

sudo apt update
sudo apt install --only-upgrade linux-image-generic linux-headers-generic
sudo reboot
# Verify patched version
uname -r
# Compare against the fixed version listed in the Ubuntu USN advisory for your release

RHEL / AlmaLinux / Amazon Linux:

sudo dnf update kernel   # on Amazon Linux 2, use: sudo yum update kernel
sudo reboot
uname -r

Debian:

sudo apt update
sudo apt full-upgrade   # installs the new kernel ABI package; upgrading linux-image-$(uname -r) alone does not
sudo reboot

SUSE / openSUSE:

sudo zypper patch
sudo reboot

Temporary Workaround: Disable algif_aead Module

For Debian/Ubuntu and module-based distros (where algif_aead is a loadable module):

# Block future loading
echo "install algif_aead /bin/false" | sudo tee /etc/modprobe.d/disable-algif.conf

# Unload if currently loaded
sudo rmmod algif_aead 2>/dev/null || echo "Module not loaded — nothing to unload"

# Verify
lsmod | grep algif_aead   # Should return empty

⚠️ Note: This workaround does not work on RHEL-family distros (RHEL, AlmaLinux, CloudLinux) where algif_aead is compiled into the kernel (CONFIG_CRYPTO_USER_API_AEAD=y). On these systems, patch the kernel immediately.
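
Whether the module workaround applies to a given host can be checked before relying on it. The sketch below assumes the standard /lib/modules/<release>/modules.builtin file, which lists compiled-in modules one path per line; the function names are hypothetical.

```python
import platform
from pathlib import Path


def is_builtin(module: str, builtin_text: str) -> bool:
    """True if the modules.builtin content lists <module>.ko."""
    wanted = module.replace("-", "_") + ".ko"
    for line in builtin_text.splitlines():
        # Entries look like "kernel/crypto/algif_aead.ko"
        name = line.strip().rsplit("/", 1)[-1].replace("-", "_")
        if name == wanted:
            return True
    return False


def algif_aead_is_builtin(release=None):
    """Return True/False for the given (or running) kernel, or None if unknown."""
    release = release or platform.release()
    path = Path("/lib/modules") / release / "modules.builtin"
    try:
        return is_builtin("algif_aead", path.read_text())
    except OSError:
        return None  # file missing or unreadable: cannot decide
```

A True result means an install line in modprobe.d has no effect on this kernel, and patching is the only fix.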

Workaround: Block via Seccomp (Containers & Kubernetes)

Block AF_ALG socket creation in container seccomp profiles:

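{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ALLOW",
      "args": [
        {
          "index": 0,
          "value": 38,
          "op": "SCMP_CMP_NE"
        }
      ]
    }
  ]
}

AF_ALG is address family 38 on Linux. The rule above allows socket() only when its first argument is not 38, so AF_ALG requests fall through to the default SCMP_ACT_ERRNO action. This is a fragment: a usable profile must also allow every other syscall the workload needs. Docker's default seccomp profile is said to block AF_ALG from Docker 24 onward, but verify this on your runtime version rather than assuming it.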
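
A profile can be checked offline before it is deployed. The following is a simplified sketch that evaluates what a Docker/OCI-style seccomp profile does with socket(AF_ALG, ...). It assumes the first matching rule wins, evaluates only conditions on argument 0, and ignores the includes/excludes and architecture filters found in real runtime profiles. socket_action and EXAMPLE_PROFILE are hypothetical names.

```python
AF_ALG = 38  # Linux address family number for AF_ALG

# Comparison operators from the seccomp profile format (masked compare omitted)
_OPS = {
    "SCMP_CMP_NE": lambda arg, value: arg != value,
    "SCMP_CMP_EQ": lambda arg, value: arg == value,
    "SCMP_CMP_LT": lambda arg, value: arg < value,
    "SCMP_CMP_LE": lambda arg, value: arg <= value,
    "SCMP_CMP_GE": lambda arg, value: arg >= value,
    "SCMP_CMP_GT": lambda arg, value: arg > value,
}


def socket_action(profile: dict, family: int = AF_ALG) -> str:
    """Return the action the profile applies to socket(family, ...)."""
    for rule in profile.get("syscalls", []):
        if "socket" not in rule.get("names", []):
            continue
        matches = True
        for cond in rule.get("args") or []:
            if cond.get("index") != 0:
                continue  # other arguments are not modelled in this sketch
            compare = _OPS.get(cond.get("op"))
            if compare is None or not compare(family, cond.get("value")):
                matches = False
                break
        if matches:
            return rule["action"]
    return profile["defaultAction"]


EXAMPLE_PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{
        "names": ["socket"],
        "action": "SCMP_ACT_ALLOW",
        "args": [{"index": 0, "value": 38, "op": "SCMP_CMP_NE"}],
    }],
}

print(socket_action(EXAMPLE_PROFILE))            # action for AF_ALG
print(socket_action(EXAMPLE_PROFILE, family=2))  # action for AF_INET
```

Loading a real profile with json.load() and passing it to socket_action gives a quick sanity check; a result other than an ERRNO or KILL action for family 38 means the profile does not block this attack vector.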

Kubernetes Pod SecurityContext:

apiVersion: v1
kind: Pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault   # Applies the runtime's default profile; confirm it blocks AF_ALG on your containerd/CRI-O version
  containers:
  - name: app
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      capabilities:
        drop: ["ALL"]

Code Fix Example

The upstream kernel fix (mainline commit a664bf3d603d) reverts the 2017 in-place optimization in algif_aead.c, restoring out-of-place AEAD operations:

/* BEFORE (vulnerable) — algif_aead.c (2017 in-place optimization) */
if (sg_nents(rsgl) == sg_nents(tsgl)) {
    /* Reuse src as dst — BUG: chains page-cache pages into writable SGL */
    aead_request_set_crypt(areq->cra_u.aead.areq, tsgl, rsgl,
                           areqctx->used, iv);
    req->src = req->dst;   /* <-- root cause */
    sg_chain(areqctx->last_rsgl->sg, 1, areq->dst);
}

/* AFTER (patched) — always allocate separate destination scatterlist */
aead_request_set_crypt(areq->cra_u.aead.areq, tsgl,
                       areqctx->rsgl[0].sg,    /* independent dst SGL */
                       areqctx->used, iv);
/* Page-cache pages are never placed in a writable scatterlist */

Configuration Hardening

# 1. Restrict unprivileged user namespaces (reduces attack surface broadly)
#    This sysctl exists on Debian/Ubuntu kernels; elsewhere use user.max_user_namespaces
sudo sysctl -w kernel.unprivileged_userns_clone=0
echo "kernel.unprivileged_userns_clone=0" | sudo tee -a /etc/sysctl.d/99-hardening.conf

# 2. Disable the userspace crypto API entirely if not needed
echo "install algif_skcipher /bin/false" | sudo tee -a /etc/modprobe.d/disable-algif.conf
echo "install algif_hash /bin/false"    | sudo tee -a /etc/modprobe.d/disable-algif.conf
echo "install algif_rng /bin/false"     | sudo tee -a /etc/modprobe.d/disable-algif.conf
echo "install algif_aead /bin/false"    | sudo tee -a /etc/modprobe.d/disable-algif.conf

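# 3. Enable kernel lockdown mode (where supported)
# Prepend lockdown=confidentiality to GRUB_CMDLINE_LINUX in /etc/default/grub
sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&lockdown=confidentiality /' /etc/default/grub
sudo update-grub   # Debian/Ubuntu; on RHEL-family use: sudo grub2-mkconfig -o /boot/grub2/grub.cfg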

5. How to Test the Fix (Validation)

Regression Test Scenarios

  • Scenario A: Kernel reports a patched version, or, where the module workaround is applied, the authencesn AF_ALG bind fails (for example with ENOENT).
  • Scenario B: The public PoC Python exploit exits with code 1 ("not vulnerable") after patching.
  • Scenario C: Applications using dm-crypt/LUKS, IPsec, TLS, SSH, OpenSSL, and GnuTLS continue to function normally (they do not use algif_aead).

Security Test Cases

Test Case 1: Verify the vulnerability no longer exists

  • Precondition: Patched kernel running or algif_aead module disabled.
  • Steps:
    1. As an unprivileged user, run the detection script (Step 2 from §3).
    2. Attempt to bind an AF_ALG socket to authencesn(hmac(sha256),cbc(aes)).
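  • Expected Result: With the algif_aead module disabled, the bind fails with OSError: [Errno 2] No such file or directory or a similar refusal. On a patched kernel with the module still available, the bind can succeed because the fix changes how requests are processed rather than removing the algorithm; in that case confirm the fix by checking the kernel version against your vendor advisory.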

Test Case 2: Verify attack vector is blocked (seccomp)

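# Run inside a container to verify AF_ALG is blocked.
# Docker applies its default seccomp profile when no seccomp option is given;
# to test a custom profile, add: --security-opt seccomp=/path/to/profile.json
# python:3-slim is used because the ubuntu:24.04 base image does not ship python3
docker run --rm python:3-slim \
  python3 -c "
import socket
try:
    s = socket.socket(38, socket.SOCK_SEQPACKET, 0)
    print('FAIL: AF_ALG socket creation succeeded; seccomp is not blocking it')
except PermissionError:
    print('PASS: AF_ALG socket creation blocked by seccomp')
except OSError as e:
    print(f'INCONCLUSIVE: socket() failed for another reason: {e}')
"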

Test Case 3: Verify no functional regression for standard crypto

# LUKS/dm-crypt should still work
sudo cryptsetup luksDump /dev/sda3    # Should display header, not error

# OpenSSL symmetric encryption (userspace implementation; openssl enc does not support AEAD modes such as GCM)
head -c 1M /dev/urandom | openssl enc -aes-256-cbc -pbkdf2 -pass pass:test -out /dev/null
# Should complete without error

# SSH connection test
ssh -o StrictHostKeyChecking=no localhost echo "SSH works"

Automated Tests

#!/usr/bin/env python3
"""
CVE-2026-31431 Patch Validation Test
Checks whether the algif_aead attack surface is reachable, without performing any exploitation.
Exit codes: 0 = algorithm unreachable, 1 = test error, 2 = algorithm reachable (confirm kernel patch level)
"""
import socket, subprocess, sys

ALG = "authencesn(hmac(sha256),cbc(aes))"

def check_kernel_version():
    result = subprocess.run(["uname", "-r"], capture_output=True, text=True)
    version = result.stdout.strip()
    print(f"[*] Kernel version: {version}")
    return version

def check_af_alg_reachable():
    try:
        s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    except (AttributeError, OSError):
        print("[+] PASS: AF_ALG socket creation failed, interface not reachable")
        return False
    try:
        # Python's AF_ALG bind() takes a (type, name) tuple, not a dict
        s.bind(("aead", ALG))
        print(f"[!] WARN: {ALG} is reachable")
        return True
    except OSError as e:
        print(f"[+] PASS: Algorithm not reachable: {e}")
        return False
    finally:
        s.close()

def check_module_blocked():
    # -n -v prints the command modprobe would run without executing it
    try:
        result = subprocess.run(["modprobe", "-n", "-v", "algif_aead"],
                                capture_output=True, text=True)
    except FileNotFoundError:
        print("[-] INFO: modprobe not found in PATH, cannot check module config")
        return False
    if "/bin/false" in result.stdout:
        print("[+] PASS: algif_aead module load is blocked by modprobe config")
        return True
    print("[-] INFO: algif_aead module is not blocked by modprobe config")
    return False

def main():
    print("=== CVE-2026-31431 (Copy Fail) Validation Test ===")
    check_kernel_version()
    reachable = check_af_alg_reachable()
    blocked = check_module_blocked()

    if not reachable:
        print("\n[RESULT] Attack surface closed: system appears SECURE against CVE-2026-31431")
        return 0
    if blocked:
        print("\n[RESULT] Module load is blocked but the algorithm is still reachable "
              "(already loaded, or built in on RHEL-family kernels)")
    print("\n[RESULT] Algorithm reachable: VULNERABLE unless the running kernel is at or "
          "above your vendor's fixed version")
    return 2

if __name__ == "__main__":
    try:
        sys.exit(main())
    except Exception as e:  # unexpected failure in the test itself
        print(f"[x] Test error: {e}")
        sys.exit(1)

6. Prevention & Hardening

Best Practices

Practice 1: Maintain a kernel patching cadence
Linux kernels receive security patches regularly. Automate kernel patching on a weekly cadence in non-production environments and establish a 72-hour SLA for critical/high CVEs in production. Use unattended-upgrades (Debian/Ubuntu) or dnf-automatic (RHEL) with the security filter.

Practice 2: Harden container seccomp profiles
The Docker default seccomp profile blocks ~44 syscalls. Ensure your container runtime uses it (it is not the default in all Kubernetes CRI configurations). Audit all workloads for --security-opt seccomp=unconfined, which completely disables seccomp.

Practice 3: Disable unused kernel modules
Adopt a deny-by-default module policy. Maintain a file such as /etc/modprobe.d/blacklist.conf with one install <module> /bin/false line for each module your workloads never need: the algif_* family (algif_aead, algif_hash, algif_skcipher, algif_rng), dccp, tipc, rds, sctp, n_hdlc. modprobe.d does not expand wildcards, and a plain blacklist line only stops alias-based autoloading, so list each module explicitly. This eliminates entire attack surface classes.

Practice 4: Audit privileged access on multi-tenant systems
CVE-2026-31431 requires local code execution. On shared hosting or developer platforms, review what processes run as unprivileged local users and whether their execution environment provides AF_ALG access.

Practice 5: Apply CIS Benchmark Level 2 kernel hardening
The CIS Linux Benchmark recommends disabling the userspace crypto API, restricting user namespaces, enabling kernel pointer restriction (kptr_restrict=2), and enforcing dmesg_restrict=1. Of these, disabling the userspace crypto API is the control that removes the Copy Fail attack path; the others are general hardening, since the exploit does not depend on kernel address leaks.

Monitoring & Detection

Detect exploitation attempts in real time:

# Monitor for AF_ALG socket creation by non-root users (auditd)
sudo auditctl -a always,exit -F arch=b64 -S socket -F a0=38 -F uid!=0 \
  -k af_alg_watch

# Review audit log
sudo ausearch -k af_alg_watch | grep -E "uid=[1-9]"
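
The grep above also matches auid, euid and similar fields, so a small parser gives cleaner results. This is a sketch under the assumption that the input is raw (not -i interpreted) audit output, where syscall arguments a0 to a3 are printed in hexadecimal; af_alg_events is a hypothetical helper name.

```python
import re

AF_ALG_HEX = format(38, "x")  # raw audit records print a0 in hex: "26"
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def af_alg_events(log_text: str) -> list:
    """Return AF_ALG socket() records made by non-root users from raw audit output."""
    events = []
    for line in log_text.splitlines():
        if not line.startswith("type=SYSCALL"):
            continue  # skip separators, timestamps and other record types
        fields = {key: value.strip('"') for key, value in _FIELD.findall(line)}
        if fields.get("a0") != AF_ALG_HEX:
            continue
        if fields.get("uid") == "0":
            continue
        events.append({
            "uid": fields.get("uid"),
            "pid": fields.get("pid"),
            "exe": fields.get("exe"),
            "comm": fields.get("comm"),
        })
    return events
```

Feeding it the output of sudo ausearch -k af_alg_watch --raw yields one dict per suspicious call, which can be forwarded to a SIEM or compared against an allowlist of executables that legitimately use AF_ALG.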

Falco rule (Kubernetes/container environments):

- rule: AF_ALG Socket Creation by Unprivileged User
  desc: Detects potential CVE-2026-31431 exploitation attempt
  # Field names follow Falco's evt.* syntax (38 = AF_ALG); validate against your Falco version
  condition: >
    evt.type=socket and
    evt.rawarg.domain=38 and
    user.uid!=0
  output: >
    Potential Copy Fail exploit attempt
    (user=%user.name uid=%user.uid container=%container.name
     image=%container.image.repository:%container.image.tag)
  priority: CRITICAL
  tags: [CVE-2026-31431, privilege_escalation, linux_kernel]

Prometheus / AlertManager — kernel version drift alert:

# Alert when any node is running an unpatched kernel
- alert: UnpatchedKernelCVE202631431
  expr: node_uname_info{release!~".*patched_version_pattern.*"} == 1
  for: 1h
  labels:
    severity: critical
  annotations:
    summary: "Node {{ $labels.nodename }} running unpatched kernel (CVE-2026-31431)"
