Testing & Findings
Findings
Documented limitations, weaknesses and failures of AI systems — evidence-first and linked to the method that found each one. Public entries are reviewed before publishing.
2 findings
- CriticalPrompt injectionReviewer-confirmedRepro: Sometimes
Indirect prompt injection via retrieved content
Instructions hidden in documents, web pages or tool outputs can override the system prompt when ingested by the model.
🔬 Prompt-injection & jailbreak testingPrompt injectionSafetyRAG - MediumReasoningReviewer-confirmedRepro: Often
Lost in the middle: degraded recall for mid-context information
Retrieval accuracy is highest for facts at the start and end of a long context and drops for facts in the middle.
🔬 Needle-in-a-haystack (long-context retrieval)🔬 Boundary & edge-case testingRAGContext window