Models miscompute date differences, weekdays, and durations across boundaries such as month ends and leap years.
Published June 26, 2026
Reproducibility
Often
Severity
Low
Confidence
Reviewer-confirmed
Details
Questions such as 'how many days between two dates' or 'what weekday was X' frequently produce off-by-N errors, especially when the span crosses a month or year boundary or includes a leap day. Answers are reliable only when the calculation is delegated to a tool.
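As an illustration of what delegating to a tool could look like, here is a minimal sketch that assumes Python's standard `datetime` module is available as the tool. The helper names `days_between` and `weekday_name` are hypothetical and do not come from the original report.

```python
from datetime import date

# Monday-first list, matching date.weekday() where Monday == 0.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def days_between(start: date, end: date) -> int:
    """Signed number of days from start to end (hypothetical helper)."""
    return (end - start).days


def weekday_name(d: date) -> str:
    """English weekday name for a calendar date (hypothetical helper)."""
    return WEEKDAYS[d.weekday()]


if __name__ == "__main__":
    # Crossing the end of February in a leap year and in a common year.
    print(days_between(date(2024, 2, 28), date(2024, 3, 1)))  # 2
    print(days_between(date(2023, 2, 28), date(2023, 3, 1)))  # 1
    # A full year that contains a leap day.
    print(days_between(date(2023, 12, 31), date(2024, 12, 31)))  # 366
    print(weekday_name(date(2024, 2, 29)))  # Thursday
```

The calendar rules live in the library, not in the model's own arithmetic, so the month-length and leap-year cases described above are handled by the tool.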