Glitch-token & unicode fuzzing
Feed anomalous tokens, rare Unicode, homoglyphs, and malformed encodings to a model to trigger out-of-distribution behavior.
Published June 26, 2026
How it works
Certain under-trained tokens and unusual Unicode sequences cause models to emit nonsense, ignore instructions, or bypass filters. Fuzzing the input encoding therefore surfaces two things: reliability glitches, and a genuine safety-bypass surface in which disallowed content is obfuscated with homoglyphs.
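As a rough illustration of the kind of input perturbation described above, the sketch below derives three fuzz variants from a plain string: a homoglyph swap, zero-width-space insertion, and a deliberately corrupted UTF-8 byte. The function names and the tiny `HOMOGLYPHS` table are assumptions made for this example, not part of any published tooling; a real fuzzer would draw on a much larger confusables table.

```python
from __future__ import annotations

import random

# Illustrative, deliberately tiny look-alike table (Latin -> Cyrillic).
# Hypothetical: a real fuzzer would use a far larger confusables list.
HOMOGLYPHS = {"a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e", "p": "\u0440"}
ZERO_WIDTH_SPACE = "\u200b"


def homoglyph_variant(text: str) -> str:
    """Swap every mapped Latin letter for its Cyrillic look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def zero_width_variant(text: str) -> str:
    """Insert a zero-width space between every pair of adjacent characters."""
    return ZERO_WIDTH_SPACE.join(text)


def malformed_variant(text: str, rng: random.Random) -> str:
    """Overwrite one byte of the UTF-8 encoding, then decode leniently."""
    raw = bytearray(text.encode("utf-8"))
    if not raw:
        return text
    # 0xFF never occurs in valid UTF-8, so the decoder must substitute U+FFFD.
    raw[rng.randrange(len(raw))] = 0xFF
    return raw.decode("utf-8", errors="replace")


def fuzz_variants(text: str, seed: int = 0) -> list[str]:
    """Return one variant per perturbation, reproducible for a given seed."""
    rng = random.Random(seed)
    return [
        homoglyph_variant(text),
        zero_width_variant(text),
        malformed_variant(text, rng),
    ]
```

Each variant looks the same or nearly the same to a human reader but differs at the code-point level, which is what lets it probe tokenizer and filter behavior. Fixing the seed keeps any finding reproducible.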
When to use it
Robustness hardening; safety-filter evaluation; input-sanitization design.
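For the input-sanitization use case, one possible starting point is sketched below, using only Python's standard `unicodedata` module. It applies NFKC normalization, drops format characters (Unicode category `Cf`, which includes zero-width spaces), and flags text that mixes scripts. The `sanitize` and `scripts_used` helpers are hypothetical names for this example, and deriving a script from the first word of a character's Unicode name is a heuristic assumption, not a standard script lookup.

```python
from __future__ import annotations

import unicodedata


def strip_format_chars(text: str) -> str:
    """Drop characters in Unicode category Cf (zero-width space, BOM, etc.)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def scripts_used(text: str) -> set[str]:
    """Approximate the scripts of alphabetic characters from their Unicode names."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # Heuristic: the first word of the name, e.g. "LATIN" or "CYRILLIC".
                scripts.add(name.split()[0])
    return scripts


def sanitize(text: str) -> tuple[str, bool]:
    """Return (cleaned_text, suspicious), where suspicious means mixed scripts."""
    normalized = unicodedata.normalize("NFKC", text)
    cleaned = strip_format_chars(normalized)
    suspicious = len(scripts_used(cleaned)) > 1
    return cleaned, suspicious
```

NFKC folds compatibility forms such as fullwidth letters but does not map Cyrillic look-alikes to Latin, which is why the mixed-script flag is kept as a separate signal. Removing every `Cf` character is also a blunt choice: it strips zero-width joiners that some scripts and emoji sequences rely on, so a production filter would likely need an allow-list.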
Limitations
Findings are often specific to one model or tokenizer version and can disappear when the tokenizer changes.
Method yield
- Findings: 3
- Versions spanned: 5
- Yield score: 8
Severity-weighted across the published findings below.
Findings it surfaces (3)
Documented failures this method catches, which serve as the evidence that it works.
- Anomalous behavior on glitch tokens (severity: Low; category: Other)
  Certain under-trained tokens cause models to emit nonsense, evade instructions, or behave erratically.
  How it found it: Sweep rare token IDs; flag anomalous completions.
- Safety bypass via unicode/homoglyph obfuscation (severity: High; category: Jailbreak)
  Disallowed content encoded with look-alike Unicode or spacing can slip past safety filters.
- Repetition and degeneration loops (severity: Low; category: Other)
  Under certain prompts or long generations, models fall into repeating phrases or degenerate text.
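The "sweep rare token IDs; flag anomalous completions" procedure could be sketched roughly as follows. This is a minimal harness under stated assumptions: `decode` and `complete` are hypothetical callables standing in for a tokenizer's ID-to-text lookup and a model call, and the two anomaly checks (a failed echo of the token, and a repeated word n-gram) are illustrative heuristics chosen for this example, not the exact criteria behind the findings above.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Iterable


def has_repetition_loop(text: str, n: int = 3, min_repeats: int = 4) -> bool:
    """True if any word n-gram occurs at least `min_repeats` times."""
    words = text.split()
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return any(count >= min_repeats for count in grams.values())


def sweep_token_ids(
    token_ids: Iterable[int],
    decode: Callable[[int], str],    # hypothetical: token ID -> token text
    complete: Callable[[str], str],  # hypothetical: prompt -> model completion
) -> list[tuple[int, list[str]]]:
    """Ask the model to echo each token and collect the IDs that misbehave."""
    flagged = []
    for token_id in token_ids:
        token_text = decode(token_id)
        completion = complete(f'Repeat this string exactly: "{token_text}"')
        reasons = []
        # A well-behaved model should be able to echo the token back.
        if token_text.strip() and token_text.strip() not in completion:
            reasons.append("echo_failed")
        # Degenerate output: the same phrase repeated over and over.
        if has_repetition_loop(completion):
            reasons.append("repetition_loop")
        if reasons:
            flagged.append((token_id, reasons))
    return flagged
```

Passing the tokenizer and model in as callables keeps the harness independent of any particular vendor API, so the same sweep can be rerun against each model version to check whether a finding still reproduces.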
Cite this
Qlarify Labs. (2026). Glitch-token & unicode fuzzing. Retrieved from https://labs.qlarify.fi/methods/glitch-token-fuzzing