Findings
Documented limitations, weaknesses and failures of AI systems — evidence-first and linked to the method that found each one. Public entries are reviewed before publishing.
3 findings
- High · Bias · Vendor-acknowledged · Repro: Once
A production update made the model sycophantic and was rolled back
An April 2025 GPT-4o update, tuned on user feedback, made the model markedly more sycophantic, including validating harmful or delusional claims, and was rolled back within days.
🔬 A/B testing in production · 🔬 Canary releases & staged rollout · 🔬 Drift & decay monitoring · Bias · Safety · Production
- High · Bias · Reviewer-confirmed · Repro: Sometimes
Name-based demographic bias in outputs
Swapping only a name that signals gender or ethnicity changes evaluative outputs such as screening decisions or sentiment scores.
🔬 Counterfactual bias probing · 🔬 Bias auditing · Bias · Safety
- Medium · Bias · Reviewer-confirmed · Repro: Often
Sycophancy: agreeing with a user's incorrect assertions
Models tend to revise correct answers to match a user who pushes back or states a wrong belief.
🔬 Counterfactual bias probing · 🔬 Adversarial prompting · Bias · Evals
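
The name-swap finding above relies on counterfactual probing: hold the prompt fixed, vary only the name, and compare the scores. A minimal sketch of that idea in Python follows. It is an illustration under stated assumptions, not the procedure used to confirm the finding. The function `name_swap_gaps`, the scorer `toy_score`, the prompt template, the group labels and the placeholder names are all hypothetical. A real probe would replace `toy_score` with a call to the system under test and use many names per group.

```python
from statistics import mean
from typing import Callable, Dict, List


def name_swap_gaps(
    score: Callable[[str], float],
    template: str,
    names_by_group: Dict[str, List[str]],
) -> Dict[str, float]:
    """Return each group's mean score minus the mean of the group means.

    Only the name changes between prompts, so a non-zero gap is
    attributable to the name alone.
    """
    group_means = {
        group: mean(score(template.format(name=name)) for name in names)
        for group, names in names_by_group.items()
    }
    overall = mean(group_means.values())
    return {group: m - overall for group, m in group_means.items()}


def toy_score(text: str) -> float:
    # Hypothetical stand-in for a model call. It is deliberately biased
    # so that the probe has something to detect.
    return 0.5 if "Alex" in text else 0.8


gaps = name_swap_gaps(
    toy_score,
    "Rate this candidate from 0 to 1. Name: {name}. Experience: 5 years in data engineering.",
    {"group_a": ["Alex"], "group_b": ["Sam"]},  # placeholder names and groups
)
print(gaps)
```

With a real model the scores are noisy, so it may make sense to repeat each prompt several times and read a gap as evidence of bias only when it clearly exceeds run-to-run variation.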