Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Our summary
Demonstrates indirect prompt injection against real LLM-integrated applications: adversarial instructions hidden in web pages, emails, or other retrieved content hijack the model when it later processes them — no access to the prompt required. Catalogs concrete attacks (data theft, manipulation) on tool- and retrieval-connected systems.
Why it matters
The academic foundation for the highest-severity agent risk — once a model ingests untrusted content, that content can issue commands. Complements Willison's practitioner framing.
Cited by these methods
Related findings (2)
- Indirect prompt injection via retrieved contentCritical
Instructions hidden in documents, web pages or tool outputs can override the system prompt when ingested by the model.
- Data exfiltration through prompt injection in agentsCritical
An injected instruction can make a tool-using agent send private data to an attacker-controlled destination.
Published June 26, 2026
Cite this
Qlarify Labs. (2026). Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. Retrieved from https://labs.qlarify.fi/references/indirect-prompt-injection-greshake-2023