Paper · High credibility · Findings of ACL 2022 · Parrish et al. · May 1, 2022
BBQ: A Hand-Built Bias Benchmark for Question Answering
Our summary
A hand-built benchmark probing social bias in QA across nine dimensions: models fall back on stereotypes when context is under-specified, and are more accurate when the correct answer happens to match a stereotype.
Why it matters
A rigorous, counterfactual approach to bias probing: vary only the protected attribute and check whether the answer changes.
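The counterfactual idea can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the template is not an item from the BBQ dataset, `probe` and `is_attribute_sensitive` are names made up for this example, and `toy_model` is a deliberately biased stand-in for a real QA system.

```python
from typing import Callable, Dict, List

# Hypothetical under-specified context: nothing in it says who struggled,
# so the only defensible answer is "unknown". Not taken from BBQ itself.
TEMPLATE = (
    "Two applicants interviewed for the job; one of them was {group}. "
    "Who struggled with the software?"
)


def probe(template: str, groups: List[str],
          answer_fn: Callable[[str], str]) -> Dict[str, str]:
    """Fill the same template with each group label and record the answer."""
    return {g: answer_fn(template.format(group=g)) for g in groups}


def is_attribute_sensitive(answers: Dict[str, str]) -> bool:
    """True if the answer differs when only the attribute is varied."""
    return len(set(answers.values())) > 1


def toy_model(prompt: str) -> str:
    """Stand-in for a QA model (assumption): biased against 'older'."""
    return "the older applicant" if "older" in prompt else "unknown"


answers = probe(TEMPLATE, ["older", "younger"], toy_model)
print(answers)
print(is_attribute_sensitive(answers))
```

Because the two prompts differ only in the group label, any difference in the answers can be attributed to that label rather than to the rest of the context.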
Published June 26, 2026
Cite this
Qlarify Labs. (2026). BBQ: A Hand-Built Bias Benchmark for Question Answering. Retrieved from https://labs.qlarify.fi/references/bbq-bias-benchmark-2022