Other · Emerging

Chaos engineering for AI systems

Deliberately inject failures — tool timeouts, malformed tool responses, truncated context, adversarial inputs — to test whether the system degrades gracefully and recovers.

Published June 26, 2026

How it works

Production is hostile: tools time out, APIs return garbage, context gets truncated, retrieval comes back empty. Chaos engineering injects these faults on purpose and watches how the system copes — does the agent retry sensibly, fail safe, surface a clear error, or loop, stall, and hallucinate its way around the missing data? It targets the parts a happy-path test never reaches: recovery, self-correction, loop avoidance, latency under stress, and the overall user experience when things go wrong.
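The fault injection described above can be sketched as a thin wrapper around a tool call. This is a minimal illustration, not a reference harness: `with_chaos`, `search`, and `classify` are hypothetical names, the three fault types are just a sample of the ones listed, and the tool is assumed to be a plain function that returns a JSON string.

```python
import json
import random
from typing import Callable

# Fault types this sketch can inject (an assumed, non-exhaustive set).
FAULTS = ("timeout", "malformed", "empty")


def with_chaos(tool: Callable[[str], str], fault_rate: float,
               seed: int = 0) -> Callable[[str], str]:
    """Wrap `tool` so that a fraction of calls fail in a controlled way."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed

    def chaotic_tool(query: str) -> str:
        if rng.random() >= fault_rate:
            return tool(query)  # healthy path: pass through untouched
        fault = rng.choice(FAULTS)
        if fault == "timeout":
            raise TimeoutError(f"injected timeout for {query!r}")
        if fault == "malformed":
            return '{"results": [tru'  # truncated, unparseable JSON
        return json.dumps({"results": []})  # well-formed but empty

    return chaotic_tool


def search(query: str) -> str:
    """Stand-in for a real tool; always returns well-formed JSON."""
    return json.dumps({"results": [f"doc about {query}"]})


def classify(call: Callable[[], str]) -> str:
    """Label the outcome of one tool call so a run can be tallied."""
    try:
        payload = json.loads(call())
    except TimeoutError:
        return "timeout"
    except json.JSONDecodeError:
        return "malformed"
    return "ok" if payload["results"] else "empty"


if __name__ == "__main__":
    chaotic_search = with_chaos(search, fault_rate=0.5, seed=42)
    outcomes = [classify(lambda: chaotic_search("refund policy"))
                for _ in range(20)]
    print(outcomes)
```

In a real experiment the wrapped tool would be handed to the agent in place of the original, and the interesting observation is what the agent does next, not the tally itself. Seeding the random generator is a design choice that makes a surprising run reproducible.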

When to use it

Resilience testing of agentic and tool-using systems; validating retry, fallback, and timeout behaviour; before relying on a system in an environment you don't control.
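As one way to picture validating retry, fallback, and timeout behaviour, the sketch below checks two properties under injected timeouts: the caller recovers when the tool comes back, and it gives up with a clear error when the tool stays down. All names here (`call_with_retry`, `answer`, `make_flaky_tool`, `ToolUnavailable`) are hypothetical, and the retry policy is an assumption chosen for brevity; a real one would normally add backoff between attempts.

```python
from typing import Callable, Dict, Optional, Tuple


class ToolUnavailable(Exception):
    """Raised when the tool still fails after every allowed attempt."""


def call_with_retry(tool: Callable[[str], str], query: str,
                    max_attempts: int = 3) -> str:
    """Call `tool`, retrying on timeout, with a hard cap on attempts."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return tool(query)
        except TimeoutError as exc:
            last_error = exc  # remember the cause, then try again
    raise ToolUnavailable(
        f"gave up after {max_attempts} attempts") from last_error


def answer(tool: Callable[[str], str], query: str) -> Dict[str, Optional[str]]:
    """Fail safe: report a clear error instead of inventing an answer."""
    try:
        return {"status": "ok", "answer": call_with_retry(tool, query)}
    except ToolUnavailable as exc:
        return {"status": "error", "answer": None, "detail": str(exc)}


def make_flaky_tool(failures: int) -> Tuple[Callable[[str], str], Dict[str, int]]:
    """Build a tool that times out `failures` times, then succeeds.

    The returned state dict counts calls so a test can check for loops.
    """
    state = {"calls": 0}

    def tool(query: str) -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("injected timeout")
        return f"result for {query}"

    return tool, state


if __name__ == "__main__":
    # Transient fault: two timeouts, then recovery on the third call.
    tool, state = make_flaky_tool(failures=2)
    print(answer(tool, "order status"), state)

    # Persistent fault: the caller must stop and surface the error.
    tool, state = make_flaky_tool(failures=10)
    print(answer(tool, "order status"), state)
```

Counting calls is what makes loop avoidance checkable: the persistent-fault case should stop at exactly the attempt cap rather than continue indefinitely.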

Limitations

You can only inject the failure modes you anticipate, and running experiments against anything but an isolated harness risks real disruption. A passing run demonstrates resilience to the faults you tested, not to every fault the system may meet.

Method yield

Findings: 1
Versions spanned: 3
Yield score: 2
Severity breakdown: 1 Low

Severity-weighted across the published findings below.

Findings it surfaces (1)

Documented failures this method catches — the evidence it works.

Cite this

Qlarify Labs. (2026). Chaos engineering for AI systems. Retrieved from https://labs.qlarify.fi/methods/chaos-engineering