DeepSeek
DeepSeek-R1
DeepSeek's open-weight reasoning model, built on V3 via reinforcement learning and released January 2025. Its own technical report is unusually candid about limitations — captured below as vendor-acknowledged findings.
Attribution note. These are documented failure-mode classes observed across frontier models and grounded in each finding's cited source; their attribution to this specific version is illustrative. Qlarify Labs has not independently reproduced each finding on DeepSeek-R1, so per-version confidence requires reproduction (VERIFICATION §2–4). Open any finding to see its source.
Report card
Auto-derived from 3 linked findings (illustrative version attributions — see note above) — worst severity per category.
- Reasoning: Medium (1×)
- Tool use: Medium (1×)
- Other: Low (1×)
Strengths
Strong chain-of-thought reasoning on math and code; open weights and an open training recipe; competitive with closed reasoning models at lower cost.
Known weaknesses
Per DeepSeek's own R1 paper: optimized for English/Chinese (language mixing on other languages), sensitive to prompts (few-shot degrades it), and weaker than V3 on function calling, multi-turn and JSON output (partly restored in R1-0528). See the linked vendor-acknowledged findings.
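One way to act on the prompt-sensitivity point is to send the model a direct problem statement plus an output format, with no worked examples. The following is a minimal sketch, not DeepSeek's tooling: the function names and the role/content message layout (common to chat-style APIs) are assumptions made for illustration.

```python
def build_zero_shot_prompt(problem: str, output_format: str) -> str:
    """Describe the problem directly and state the output format, with no examples."""
    return f"{problem.strip()}\n\nOutput format: {output_format.strip()}"


def strip_few_shot(messages: list[dict]) -> list[dict]:
    """Drop demonstration turns, keeping system messages and the final user turn.

    Assumes each message is a dict with "role" and "content" keys (hypothetical
    layout). Earlier user/assistant pairs are treated as few-shot examples.
    """
    system = [m for m in messages if m["role"] == "system"]
    last_user = next((m for m in reversed(messages) if m["role"] == "user"), None)
    return system + ([last_user] if last_user else [])
```

Note that `strip_few_shot` cannot tell demonstrations apart from genuine conversation history, so it only suits prompts where the earlier turns are known to be examples.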
Findings (3)
- Reasoning model degrades under few-shot prompting (Medium, Reasoning)
DeepSeek-R1's own paper reports that few-shot prompting 'consistently degrades its performance' and recommends zero-shot, inverting the usual assumption that examples help.
- Reasoning model mixes languages on non-English/Chinese queries (Low, Other)
DeepSeek-R1 is optimized for English and Chinese and can mix languages mid-output on queries in other languages; its own paper flags this.
- Reasoning model regresses on tool use versus its base model (Medium, Tool use)
DeepSeek-R1 falls short of the base DeepSeek-V3 on function calling, multi-turn, complex role-play and JSON output: a reasoning-tuned model trading away tool-use reliability, later restored in R1-0528.
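The JSON-output and language-mixing findings lend themselves to cheap output checks. The sketch below is a heuristic under stated assumptions, not a method from the cited sources: it assumes replies may carry a `<think>...</think>` reasoning block before the answer, and it approximates "language" by Unicode script, which cannot distinguish languages that share a script. Both function names are hypothetical.

```python
import json
import re
import unicodedata

# Assumption: reasoning text, if present, is wrapped in <think>...</think>.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_json_reply(reply: str):
    """Return the parsed JSON value, or None if the reply is not valid JSON."""
    body = THINK_BLOCK.sub("", reply).strip()
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


def scripts_in(text: str) -> set[str]:
    """Coarse script labels for the letters in text, from Unicode character names."""
    found = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # First word of the name, e.g. "LATIN", "CJK", "CYRILLIC".
                found.add(name.split()[0])
    return found
```

A reply to a Russian query whose `scripts_in` result contains both `CYRILLIC` and `CJK` would be a candidate for manual review; the check flags possible mixing and does not prove it.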
Cite this
Qlarify Labs. (2026). DeepSeek DeepSeek-R1 — known weaknesses. Retrieved from https://labs.qlarify.fi/models/deepseek-r1