AI agents & tool use

Agents

Agents let a model plan and call tools to act in the world. The autonomy that makes them useful also makes them dangerous: untrusted input can hijack instructions (prompt injection), tool arguments can be hallucinated, and errors compound across steps. The linked findings show concrete failure modes to test for before deploying an agent on untrusted data.

Findings (5)

Data exfiltration through prompt injection in agentsSafetyCritical
Format-constraint violations under strict schemasTool useMedium
Hallucinated tool/function argumentsTool useHigh
Indirect prompt injection via retrieved contentPrompt injectionCritical
Reasoning model regresses on tool use versus its base modelTool useMedium

Methods

🔬 Chaos engineering for AI systems 🔬 Integration testing (MCP handshakes & tool contracts)🔬 Prompt-injection & jailbreak testing

References

Gorilla: Large Language Model Connected with Massive APIs — arXiv
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection — arXiv
Prompt injection: what’s the worst that can happen? — Simon Willison’s Weblog

Cite this

Qlarify Labs. (2026). AI agents & tool use. Retrieved from https://labs.qlarify.fi/topics/ai-agents-and-tool-use