Anthropic

Claude Sonnet

Anthropic's balanced Claude tier — strong capability at lower latency and cost than Opus. Linked findings reflect documented frontier failure-mode classes; per-version attribution is illustrative.

Attribution note. These are documented failure-mode classes observed across frontier models and grounded in each finding's cited source; their attribution to this specific version is illustrative. Qlarify Labs has not independently reproduced each finding on Claude Sonnet; per-version confidence requires reproduction (VERIFICATION §2–4). Open any finding to see its source.

Report card

Auto-derived from 7 linked findings (illustrative version attributions — see note above) — worst severity per category.

Hallucination: High (1×)
Reasoning: Medium (4×)
Refusal: Medium (1×)
Bias: Medium (1×)
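
The "worst severity per category" roll-up behind the report card can be sketched as follows. This is a minimal illustration, not Qlarify Labs' actual implementation: the function name `report_card`, the `(category, severity)` tuple shape, and the `Low < Medium < High` ordering are all assumptions, and the sample input is made up to match the per-category counts shown here, not taken from the linked findings.

```python
from collections import defaultdict

# Assumed severity scale; the page itself only shows "Medium" and "High".
SEVERITY_RANK = {"Low": 0, "Medium": 1, "High": 2}


def report_card(findings):
    """Group findings by category and keep the worst severity and the count.

    `findings` is an iterable of (category, severity) pairs (hypothetical shape).
    Returns {category: (worst_severity, number_of_findings)}.
    """
    grouped = defaultdict(list)
    for category, severity in findings:
        grouped[category].append(severity)
    return {
        category: (max(severities, key=SEVERITY_RANK.__getitem__), len(severities))
        for category, severities in grouped.items()
    }


# Made-up input: category counts mirror the report card, but the individual
# severities are placeholders, not the real linked findings.
sample = [
    ("Hallucination", "High"),
    ("Reasoning", "Medium"),
    ("Reasoning", "Low"),
    ("Reasoning", "Medium"),
    ("Reasoning", "Low"),
    ("Refusal", "Medium"),
    ("Bias", "Medium"),
]

for category, (worst, count) in report_card(sample).items():
    print(f"{category}: {worst} ({count}x)")
```

One consequence of this kind of roll-up is that a category's rating reflects only its single worst finding, so four Medium findings and one Medium finding receive the same grade; the count next to each rating is what distinguishes them.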

Strengths

Fast, capable general reasoning and coding; good instruction-following and comparatively calibrated refusals for its tier.

Known weaknesses

Shares the frontier-wide arithmetic, counting and tokenization limits; susceptible to sycophancy under user pressure and to prompt injection in agentic settings.

Findings (7)

Methods that surface these

Related references

Versions tracked

claude-sonnet-4-6

Cite this

Qlarify Labs. (2026). Anthropic Claude Sonnet — known weaknesses. Retrieved from https://labs.qlarify.fi/models/claude-sonnet