ChatGPhish: ChatGPT Phishing Flaw Remains Unpatched

Permiso Security researcher Andi Ahmeti publicly disclosed ChatGPhish on May 29, 2026 — a vulnerability in ChatGPT's web summarization feature that allows an attacker to inject phishing links, harvest IP addresses and browser metadata, and deliver malicious QR codes through ChatGPT's own interface, with no user warning of any kind. As of publication, OpenAI has not patched the vulnerability and has not assigned a CVE.

// 01 ChatGPhish: Technical Details

ChatGPhish exploits a fundamental trust model flaw in how ChatGPT's response renderer handles external web content. When a user asks ChatGPT to summarize a web page, ChatGPT fetches that page, parses its content, and renders the response — including any Markdown links and image URLs found on the source page — as part of its output.

The critical flaw is this: ChatGPT cannot distinguish between Markdown it generated itself and Markdown it extracted from an attacker-controlled web page. The response renderer treats both sources identically — surfacing links as live, clickable elements and auto-fetching image URLs without displaying any origin warning, provenance label, or sandboxing.
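One way to picture the missing trust boundary is a pre-render step that treats Markdown from fetched pages as inert text. The sketch below is illustrative only and is not OpenAI's implementation: the function name `neutralize_untrusted_markdown` and the regular expressions are assumptions, and the patterns cover only simple inline links and images, not the full Markdown grammar.

```python
import re

# Inline Markdown images: ![alt](url). Handled first so they are not
# mistaken for ordinary links.
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")
# Inline Markdown links: [text](url)
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)[^)]*\)")


def neutralize_untrusted_markdown(text: str) -> str:
    """Make links and images in externally sourced Markdown inert.

    Hypothetical helper: applies only to content extracted from a fetched
    page, never to Markdown the assistant generated itself.
    """
    # Replace images with a placeholder so nothing is fetched at render time.
    text = IMAGE_RE.sub(
        lambda m: f"[image removed: {m.group(1) or 'no alt text'}]", text
    )
    # Replace links with plain text; the destination is shown as inline
    # code so the renderer does not turn it back into a clickable link.
    text = LINK_RE.sub(
        lambda m: f"{m.group(1)} (source link: `{m.group(2)}`)", text
    )
    return text


if __name__ == "__main__":
    page_excerpt = (
        "Reset now: [Verify account](https://evil.example/login) "
        "![QR](https://evil.example/qr.png)"
    )
    print(neutralize_untrusted_markdown(page_excerpt))
```

The design choice here is to keep the destination visible rather than silently dropping it, so a reader can still judge the source while nothing is clickable or auto-fetched.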

An attacker exploiting ChatGPhish follows these steps:

1. Create a public web page (a GitHub README, a blog post, a documentation page) containing malicious Markdown payloads embedded in otherwise normal-looking content.
2. Wait for, or socially engineer, a victim to ask ChatGPT to summarize that page.
3. ChatGPT fetches the page, extracts the embedded Markdown, and renders it as trusted output in the victim's chat window.
4. The victim sees clickable phishing links or QR codes styled identically to ChatGPT's own interface output, with no visual distinction from content ChatGPT generated itself.

Figure: ChatGPhish attack flow (Markdown trust exploitation in ChatGPT web summarization)

Attack vectors enabled by ChatGPhish:

Phishing link injection: Clickable malicious URLs rendered inside ChatGPT's interface, visually indistinguishable from ChatGPT's own links. Attackers can style surrounding text as fake "account security" alerts in ChatGPT's visual language.
Passive IP and metadata tracking: Markdown image syntax (![alt](url)) causes ChatGPT's renderer to auto-fetch the remote image URL at render time, without any user click. The request exposes the victim's IP address, User-Agent string, Referer header, and request timing to the attacker's server.
QR code delivery: Attackers embed a QR code image (hosted on attacker infrastructure, such as an S3 bucket) in the rendered output. Victims who scan the QR code on a mobile device bypass desktop URL filters and enterprise proxy controls, as the redirect target only resolves after scanning.
Cross-device pivoting: The QR code attack inherently moves the victim from a potentially monitored corporate desktop to a personal mobile device outside enterprise DLP controls — a significant evasion advantage.
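To make the passive-tracking vector concrete, the following sketch lists every image URL in a Markdown response that a renderer would fetch from a host outside a first-party set. It is a defender-side audit under stated assumptions: `external_image_hosts` and the `first_party_hosts` parameter are hypothetical names, and the regular expression handles only simple inline image syntax.

```python
import re
from urllib.parse import urlparse

# Simple inline image syntax: ![alt](url)
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def external_image_hosts(
    markdown: str, first_party_hosts: set[str]
) -> list[tuple[str, str]]:
    """Return (host, url) for each image hosted outside first_party_hosts.

    Each returned URL is one the browser would request at render time,
    with no click, which is what exposes IP address and headers.
    """
    findings = []
    for match in IMAGE_RE.finditer(markdown):
        url = match.group(2)
        host = (urlparse(url).hostname or "").lower()
        if host and host not in first_party_hosts:
            findings.append((host, url))
    return findings


if __name__ == "__main__":
    summary = (
        "Summary text ![](https://tracker.example/p.gif?id=1) "
        "and ![logo](https://chatgpt.com/logo.png)"
    )
    for host, url in external_image_hosts(summary, {"chatgpt.com"}):
        print(f"auto-fetched from {host}: {url}")
```

An empty alt text, as in the first image above, is typical of a tracking pixel; a QR code payload would look the same to this audit, since both are ordinary remote images.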

Ahmeti demonstrated all four attack vectors with working proof-of-concept examples targeting real users. No public PoC repository has been released, limiting mass exploitation risk for now.

// 02 Exploitation Status and Threat Landscape

No CVE has been assigned and no patch has been confirmed as of May 30, 2026.

Permiso Security submitted the vulnerability to OpenAI through the Bugcrowd bug bounty platform on April 29, 2026. OpenAI's initial response marked the report "could not be reproduced." After Permiso resubmitted on May 7, 2026, with additional attack vector documentation, OpenAI classified it as a "duplicate" of a previously reported issue. OpenAI did not respond to press inquiries about the fix timeline before Permiso's public disclosure on May 29, 2026.

No evidence of active exploitation in the wild has been reported as of publication. However, the combination of public disclosure, a working proof-of-concept in researcher hands, and a user base of hundreds of millions makes this a high-attention vulnerability for threat actors. The attack requires no special tooling — any attacker who can create a web page can execute it.

CISA KEV status: ChatGPhish is not currently listed in CISA's Known Exploited Vulnerabilities catalog. No CVE exists to catalog against, and confirmed wild exploitation has not been reported. This status is subject to rapid change if exploitation is observed.

The severity of ChatGPhish cannot be expressed as a formal CVSS score because no CVE has been assigned, but the relevant characteristics are clear: the flaw is network-accessible, requires no authentication, has low attack complexity, affects a user base in the hundreds of millions, and needs nothing on the attacker's side beyond a publicly accessible web page. Informal estimates from security researchers place the risk in the High range.

// 03 Who Is Affected

Any user of ChatGPT who uses the web summarization feature — asking ChatGPT to "summarize this page" or to answer questions about a linked URL — is potentially exposed to ChatGPhish. This includes:

Individual users of ChatGPT Free, Plus, and Enterprise plans
Developers using the ChatGPT interface for research and documentation review
Security analysts who use ChatGPT to triage threat intelligence or review external reports
Organizations where employees use ChatGPT to process or summarize external web content

The attack is particularly relevant in enterprise environments where employees routinely ask ChatGPT to summarize vendor documentation, security advisories, or third-party reports — page types that an attacker with knowledge of the target organization could craft to deliver the payload.

// 04 What You Should Do Right Now

Until OpenAI patches the underlying trust model:

Treat all links rendered inside ChatGPT summaries as potentially attacker-controlled. Do not click links in AI-generated summaries of external pages without independently verifying the destination URL.
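Independent verification can be partly automated. The sketch below flags links whose real destination is not on an expected list, or whose visible text names a different domain than the one the link points to. All names (`suspicious_links`, `expected_hosts`) are hypothetical, the domain pattern is deliberately crude, and a clean result does not prove a link is safe.

```python
import re
from urllib.parse import urlparse

# Inline links, excluding images (no leading "!").
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)[^)]*\)")
# Crude pattern for a domain-like token inside the visible link text.
DOMAIN_RE = re.compile(r"\b((?:[a-z0-9-]+\.)+[a-z]{2,})\b", re.IGNORECASE)


def suspicious_links(markdown: str, expected_hosts: set[str]):
    """Return (link_text, real_host, reasons) for links worth a second look."""
    results = []
    for text, url in LINK_RE.findall(markdown):
        host = (urlparse(url).hostname or "").lower()
        reasons = []
        if host not in expected_hosts:
            reasons.append("host not on expected list")
        shown = DOMAIN_RE.search(text)
        if shown and shown.group(1).lower() != host:
            reasons.append("link text names a different domain")
        if reasons:
            results.append((text, host, reasons))
    return results


if __name__ == "__main__":
    summary = (
        "[openai.com/security](https://login.evil.example/reset) and "
        "[docs](https://openai.com/docs)"
    )
    for text, host, reasons in suspicious_links(summary, {"openai.com"}):
        print(f"{text!r} -> {host}: {', '.join(reasons)}")
```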

Do not ask ChatGPT to summarize pages with user-generated content, such as GitHub repositories, forum threads, wiki pages, or any page where an attacker could embed Markdown before your request.

Block external image auto-fetching at the browser level or via enterprise proxy rules targeting chatgpt.com's external image requests, to prevent passive IP harvesting without user interaction.
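A proxy rule of this kind could be expressed as a small decision function. The sketch below assumes the proxy can inspect request headers (which in practice requires TLS inspection), that the browser sends the `Sec-Fetch-Dest` and `Referer` headers, and that images are fetched directly by the browser as described above. The host sets and the function name `should_block` are hypothetical and would need tuning for a real environment.

```python
from urllib.parse import urlparse

# Assumption: hosts that serve the chat UI.
CHAT_ORIGINS = {"chatgpt.com"}
# Hypothetical allowlist of hosts permitted to serve images into the chat UI.
ALLOWED_IMAGE_HOSTS = {"chatgpt.com"}


def should_block(request_url: str, headers: dict[str, str]) -> bool:
    """Block image requests issued from the chat UI to non-allowlisted hosts."""
    h = {k.lower(): v for k, v in headers.items()}
    # Only consider requests the browser itself labels as image loads.
    if h.get("sec-fetch-dest", "").lower() != "image":
        return False
    # Only consider requests that originate from a chat UI page.
    referer_host = (urlparse(h.get("referer", "")).hostname or "").lower()
    if referer_host not in CHAT_ORIGINS:
        return False
    dest_host = (urlparse(request_url).hostname or "").lower()
    return dest_host not in ALLOWED_IMAGE_HOSTS


if __name__ == "__main__":
    hdrs = {"Sec-Fetch-Dest": "image", "Referer": "https://chatgpt.com/"}
    print(should_block("https://tracker.example/p.gif", hdrs))
    print(should_block("https://chatgpt.com/logo.png", hdrs))
```

Because a page's referrer policy can suppress the `Referer` header, a rule like this may miss some requests; it reduces exposure rather than eliminating it.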

Do not scan QR codes from AI-generated summaries on mobile devices. QR code content in AI responses is not validated and bypasses desktop URL filtering.

If you are an enterprise administrator, consider restricting ChatGPT web summarization functionality via ChatGPT Enterprise controls until OpenAI confirms a patch. Treat any AI summary of external content as untrusted output from the perspective of link safety.

// 05 Background: Understanding the Risk

ChatGPhish belongs to a class of attacks called prompt injection — where an attacker's content is processed by an AI model and treated as trusted instructions or trusted output, rather than as external, potentially hostile input.

Prompt injection (the term coined by researcher Simon Willison) is to AI systems what SQL injection (where attacker-controlled input is executed as trusted database commands) was to web applications in the early 2000s. SQL injection was dismissed as a fringe concern for years before it became one of the most exploited vulnerability classes in history. Security researchers are watching prompt injection with similar concern.

The specific variant exploited by ChatGPhish — indirect prompt injection via web content — has been documented in AI systems since 2023. Microsoft's Bing AI (now Copilot), Google Bard, and Anthropic's Claude have all faced variations of this class of attack, where attacker-controlled content fetched from the web influences model output in unintended ways.

What distinguishes ChatGPhish is the Markdown rendering trust boundary: most AI assistants that summarize web content strip active links or label them as "from source." ChatGPT's renderer, as described by Permiso, currently renders them as first-class live links — a design choice that turns every web summarization request into a potential phishing vector.

The OpenAI Bugcrowd response — "could not be reproduced," then "duplicate" — is a pattern researchers encounter when reporting systemic architectural issues rather than discrete exploitable bugs. When a vulnerability is rooted in a design decision (trust all Markdown from external sources), reproducing it requires understanding the architecture, not just executing a PoC. The "duplicate" classification may indicate that the issue has been previously reported, but the absence of a patch or public CVE suggests it has not been prioritized for remediation.

// 06 Conclusion

ChatGPhish is a real, unpatched attack surface affecting every ChatGPT user who summarizes external web content. Until OpenAI confirms a fix, treat all links and QR codes in AI-generated summaries of external pages as potentially attacker-controlled — and avoid using ChatGPT to process content from sources you do not control. The attack requires no special tools and no prior access to the victim's accounts.

Team Ciphers Security

The Ciphers Security editorial team: practitioners covering daily threat intel, CVE deep-dives, and hands-on cybersecurity research.
