Paper · High credibility · arXiv · Cheng et al. · March 1, 2024

Dated Data: Tracing Knowledge Cutoffs in Large Language Models

Our summary

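Probes what models actually know to estimate their *effective* knowledge cutoff, and finds that it frequently diverges from the cutoff the developer reports. The authors trace the gap to two causes: deduplication pipelines that miss near-duplicate documents, and newer web-crawl dumps that still contain substantial amounts of older data.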
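To make the idea of an effective cutoff concrete, here is a minimal sketch of one way such a probe could be scored. It assumes perplexity on time-stamped versions of the same documents is the probe signal, and that the month with the lowest average perplexity is taken as the estimate. The function name `estimate_effective_cutoff`, the input layout, and all numbers are hypothetical illustrations, not the paper's code or results.

```python
from statistics import mean


def estimate_effective_cutoff(perplexity_by_month):
    """Return the month whose document versions the model finds least surprising.

    perplexity_by_month: dict mapping "YYYY-MM" to a list of per-document
    perplexities (hypothetical input; scoring documents with a real model
    is outside the scope of this sketch).
    """
    # Average per month, skipping months with no measurements.
    averages = {
        month: mean(values)
        for month, values in perplexity_by_month.items()
        if values
    }
    if not averages:
        raise ValueError("need at least one month with measurements")
    # Lowest average perplexity marks the snapshot the model appears to
    # know best; iterating in sorted order breaks ties toward the earlier month.
    return min(sorted(averages), key=averages.get)


# Illustrative values only (assumption): not taken from the paper.
measurements = {
    "2022-10": [14.2, 13.8],
    "2023-01": [12.1, 12.5],
    "2023-04": [13.0, 13.4],
    "2023-07": [15.9, 16.3],
}
reported_cutoff = "2023-07"  # hypothetical developer-reported cutoff
effective = estimate_effective_cutoff(measurements)
print(f"effective: {effective}, reported: {reported_cutoff}")
```

In this toy input the estimate lands months before the reported cutoff, which is the kind of divergence the summary describes.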

Why it matters

Confirms that a model's sense of when its knowledge ends (and of the current date) is unreliable, so recency- and date-sensitive answers can't be trusted at face value.

Related findings (1)

Confusion about knowledge cutoff and current date · Low
Models misstate their own knowledge cutoff or the current date, and answer about post-cutoff events with stale or invented information.

Hallucination Evals Benchmarks

Published June 26, 2026

Cite this

Qlarify Labs. (2026). Dated Data: Tracing Knowledge Cutoffs in Large Language Models. Retrieved from https://labs.qlarify.fi/references/dated-data-knowledge-cutoffs-2024