Medium · Other · Vendor-acknowledged · Published

Vendor cautions its reasoning model's chain-of-thought may be unfaithful

OpenAI's o1 System Card states that the model's chain-of-thought 'may not be fully legible and faithful… even now': the developer itself warns that the displayed reasoning may not reflect what actually produced the answer.

Published June 26, 2026

Reproducibility: Sometimes
Severity: Medium
Confidence: Vendor-acknowledged

Details

In the o1 System Card, OpenAI writes that, while it is excited about chain-of-thought monitoring, 'we are wary that they may not be fully legible and faithful in the future or even now.' Here a vendor explicitly cautions that its model's displayed reasoning may not reflect the actual computation. That is the exact concern chain-of-thought faithfulness probing targets, conceded by the model's own makers.

Found with

🔬 Chain-of-thought faithfulness probing

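Perturbing the reasoning trace and checking whether the final answer changes with it is one way to measure the (un)faithfulness the vendor concedes. If the answer stays the same no matter how the trace is altered, the trace is unlikely to be what produced it.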
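
The perturbation idea can be sketched in a few lines. This is a minimal illustration, not the probing tool itself: it assumes a model whose reasoning steps can be edited and replayed, represented here by a hypothetical callable answer_fn(question, steps). It uses truncation as the only perturbation, and the two toy "models" are stand-ins invented for the example.

```python
from typing import Callable, List

# Hypothetical interface (an assumption for this sketch):
# (question, reasoning_steps) -> final answer string.
AnswerFn = Callable[[str, List[str]], str]


def truncation_sensitivity(answer_fn: AnswerFn, question: str, steps: List[str]) -> float:
    """Fraction of truncated traces whose answer differs from the full-trace answer.

    Near 1.0: the answer tracks the trace. Near 0.0: the answer is the same
    however much of the trace is cut, which hints the trace may be post hoc.
    """
    if not steps:
        return 0.0
    baseline = answer_fn(question, steps)
    changed = sum(
        1
        for keep in range(len(steps))  # keep 0, 1, ..., len(steps) - 1 steps
        if answer_fn(question, steps[:keep]) != baseline
    )
    return changed / len(steps)


def trace_following_model(question: str, steps: List[str]) -> str:
    # Toy stand-in: the answer is read off the last reasoning step.
    return steps[-1].split()[-1] if steps else "unknown"


def trace_ignoring_model(question: str, steps: List[str]) -> str:
    # Toy stand-in: the answer is fixed and never looks at the trace.
    return "44"


if __name__ == "__main__":
    question = "What is (17 + 5) * 2?"
    steps = ["17 + 5 = 22", "22 * 2 = 44"]
    print(truncation_sensitivity(trace_following_model, question, steps))  # 1.0
    print(truncation_sensitivity(trace_ignoring_model, question, steps))   # 0.0
```

A single score like this is only a rough signal. A low value is consistent with unfaithful reasoning but also with a question easy enough to answer without any reasoning, so in practice it would be read alongside other perturbations, such as inserting a deliberate mistake into a step.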

Evidence

https://openai.com/index/openai-o1-system-card/

OpenAI, 'o1 System Card' (2024), chain-of-thought safety discussion.

Affected versions

OpenAI · o1

References

OpenAI o1 System Card

Reasoning failure · Evals

Source: https://openai.com/index/openai-o1-system-card/

Cite this

Qlarify Labs. (2026). Vendor cautions its reasoning model's chain-of-thought may be unfaithful. Retrieved from https://labs.qlarify.fi/findings/o1-chain-of-thought-unfaithful-vendor