BlogHigh credibilitySimon Willison’s Weblog · Simon Willison · April 14, 2023

Prompt injection: what’s the worst that can happen?

Our summary

An accessible explanation of why prompt injection is hard to fix: once an LLM agent processes untrusted content, that content can hijack its instructions. Walks through concrete exfiltration and abuse scenarios for tool-using assistants.

Why it matters

The clearest practitioner-level framing of the risk that makes autonomous agents dangerous to deploy on untrusted input.

Cited by these methods

🔬 Prompt-injection & jailbreak testing

Related findings (2)

Indirect prompt injection via retrieved contentCritical
Instructions hidden in documents, web pages or tool outputs can override the system prompt when ingested by the model.
Data exfiltration through prompt injection in agentsCritical
An injected instruction can make a tool-using agent send private data to an attacker-controlled destination.

Prompt injection Safety Agents

Published June 26, 2026

Cite this

Qlarify Labs. (2026). Prompt injection: what’s the worst that can happen?. Retrieved from https://labs.qlarify.fi/references/prompt-injection-worst-that-can-happen