High · Jailbreak · Reviewer-confirmed · Published

Safety bypass via unicode/homoglyph obfuscation

Disallowed content encoded with look-alike Unicode characters or unusual spacing can slip past safety filters.

Published June 26, 2026

Reproducibility: Sometimes
Severity: High
Confidence: Reviewer-confirmed

Details

Replacing characters with visually similar Unicode characters (homoglyphs) or inserting zero-width characters can cause input filters, and the model itself, to mishandle disallowed requests. This is both a robustness weakness and a safety-bypass surface.
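To make the filter-side weakness concrete, the following is a minimal defensive sketch, not the filter used by any affected vendor. It assumes a simple substring blocklist and shows one way to canonicalize input before matching: NFKC normalization, removal of format characters (which covers zero-width characters), and a small confusables table. The names `CONFUSABLES`, `normalize_for_filter`, and `matches_blocklist` are hypothetical, the table is deliberately tiny, and the blocklist term is a harmless placeholder. Note that NFKC alone does not fold cross-script homoglyphs such as Cyrillic "о" into Latin "o", which is why a separate mapping is assumed here.

```python
import unicodedata

# Hypothetical, deliberately incomplete confusables table (assumption):
# a real deployment would need a far larger mapping.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
}


def normalize_for_filter(text: str) -> str:
    """Canonicalize text before keyword matching (illustrative only)."""
    # NFKC folds compatibility forms, e.g. fullwidth letters, to base forms.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf), which includes zero-width
    # space, zero-width joiner/non-joiner, and the byte order mark.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Map known look-alike characters to their Latin counterparts.
    text = "".join(CONFUSABLES.get(ch, ch) for ch in text)
    return text.casefold()


def matches_blocklist(text: str, blocklist: list[str]) -> bool:
    """Return True if any blocklist term appears in the normalized text."""
    normalized = normalize_for_filter(text)
    return any(term in normalized for term in blocklist)


# Placeholder term standing in for a disallowed keyword.
blocklist = ["forbidden"]
# Cyrillic "о" and "е" plus a zero-width space inside the word.
obfuscated = "f\u043erb\u200bidd\u0435n"

naive_hit = any(term in obfuscated for term in blocklist)   # False
normalized_hit = matches_blocklist(obfuscated, blocklist)   # True
```

A naive substring check misses the obfuscated string, while the normalized check matches it. This sketch addresses only the input-filter side and does not handle spacing inserted between letters; it says nothing about how the model itself interprets such input.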

Found with

Evidence

A homoglyph-substituted request bypassed a keyword filter and was partially answered. The working payload is withheld.
Illustrative example; see the linked reference for the documented evidence.

1 evidence item withheld. Live exploit payloads are not published; only the technique and impact are described (disclosure policy).

Affected versions

Anthropic · claude-opus-4-8
OpenAI · gpt-4o
Google · gemini-2.0-flash
Meta · llama-3.3-70b

References

Source: https://arxiv.org/abs/2307.15043

Cite this

Qlarify Labs. (2026). Safety bypass via unicode/homoglyph obfuscation. Retrieved from https://labs.qlarify.fi/findings/homoglyph-safety-bypass