Paper · High credibility · ICML 2024 (arXiv:2403.06634) · Nicholas Carlini, Daniel Paleka, et al. · March 9, 2024
Stealing Part of a Production Language Model
Our summary
The first model-extraction attack to recover precise internals from deployed LLMs: using only ordinary black-box API access, it recovers the hidden dimension and the final embedding projection layer (up to symmetries) of production models. The authors extracted both from OpenAI's Ada and Babbage for under $20, with OpenAI's approval and subsequent mitigation.
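The hidden-dimension part of the attack rests on a linear-algebra observation: every logit vector is the projection matrix applied to a hidden state, so logits collected over many prompts span a subspace whose dimension equals the hidden size. The sketch below illustrates that idea on a simulated model rather than a real API. It is not the authors' code; the function names, the random stand-in model, and the sizes are assumptions chosen for illustration, and it presumes full logit vectors are already available (recovering them from a restricted production API is a separate step that is not shown here).

```python
import numpy as np


def simulate_logits(num_queries: int, vocab_size: int, hidden_dim: int,
                    seed: int = 0) -> np.ndarray:
    """Hypothetical stand-in for an API: returns one logit vector per query.

    Each row is W @ h for a random hidden state h, where W is a random
    (vocab_size x hidden_dim) projection matrix. A real model is not random;
    this only reproduces the low-rank structure the attack exploits.
    """
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((vocab_size, hidden_dim))
    hidden_states = rng.standard_normal((num_queries, hidden_dim))
    return hidden_states @ projection.T


def estimate_hidden_dim(logits: np.ndarray) -> int:
    """Estimate the rank of a (num_queries x vocab_size) logit matrix.

    The estimate is the position of the largest multiplicative gap between
    consecutive singular values. It is only meaningful when num_queries
    exceeds the true hidden dimension.
    """
    singular_values = np.linalg.svd(logits, compute_uv=False)
    # Floor near-zero values so the logarithm stays finite.
    floor = singular_values[0] * np.finfo(logits.dtype).eps
    singular_values = np.maximum(singular_values, floor)
    log_gaps = np.log(singular_values[:-1]) - np.log(singular_values[1:])
    return int(np.argmax(log_gaps)) + 1


if __name__ == "__main__":
    logits = simulate_logits(num_queries=128, vocab_size=512, hidden_dim=64)
    print("estimated hidden dimension:", estimate_hidden_dim(logits))
```

The number of queries must exceed the hidden dimension, otherwise the logit matrix has full rank and no gap appears in the singular values.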
Why it matters
Establishes that a deployed model's API is itself a leak surface for proprietary internals — the exact confidentiality risk model-extraction probing is meant to quantify.
Published June 26, 2026
Cite this
Qlarify Labs. (2026). Stealing Part of a Production Language Model. Retrieved from https://labs.qlarify.fi/references/stealing-part-production-lm-2024