Remove Accents & Diacritics
Strip accent marks from letters — café → cafe, naïve → naive. Unicode-aware via NFD normalization.
About Remove Accents & Diacritics
Accent removal decomposes each accented character into its base letter plus combining marks (Unicode NFD form), then strips the combining marks. For Latin-script text the result is usually close to plain ASCII, which makes it useful for slugs, search keys, and fuzzy matching. Letters with no decomposition, such as ß or æ, are left as they are.
When to use it
- Producing URL-friendly slugs from accented titles
- Normalizing names for accent-insensitive search or matching
- Cleaning text as a first step before storing it in an ASCII-only database column (letters such as ß or æ still need separate handling)
- Stripping combining marks left over from a zalgo-style decoration
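As a rough illustration of the first two use cases, here is a minimal Python sketch. The helper names strip_accents, slugify, and search_key are hypothetical and not part of this tool, and the slug rules (lowercase, hyphens between words) are an assumed convention rather than a standard.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop every combining mark (categories Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.category(ch).startswith("M"))


def slugify(title: str) -> str:
    """Hypothetical slug helper: strip accents, lowercase, hyphenate."""
    plain = strip_accents(title).lower()
    # Collapse every run of characters outside a-z and 0-9 into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", plain).strip("-")


def search_key(name: str) -> str:
    """Hypothetical key for accent- and case-insensitive comparison."""
    return strip_accents(name).casefold()


print(slugify("Crème Brûlée à la Carte"))        # creme-brulee-a-la-carte
print(search_key("José") == search_key("JOSE"))  # True
```

In this sketch any character that is still non-ASCII after stripping (ß, for example) is treated as a separator by slugify, so such letters may need replacing first.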
How it works
The text is normalized to NFD (decomposed form), which separates accented letters into base + combining marks. A regex then removes all combining marks (\p{M}), leaving just the base letters. Characters without a decomposable form (ß, æ, ø) pass through unchanged.
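The two steps above can be sketched in Python as follows. This is an illustration, not necessarily how this tool is implemented: Python's built-in re module does not support \p{M}, so the sketch filters on the Unicode general category instead, which selects the same set of characters. The function name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    # Step 1: split accented letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop the marks. Categories Mn, Mc and Me together make up \p{M}.
    return "".join(ch for ch in decomposed if not unicodedata.category(ch).startswith("M"))


print(strip_accents("Café naïve résumé jalapeño"))  # Cafe naive resume jalapeno
print(strip_accents("ß æ ø"))                       # ß æ ø (no decomposition, unchanged)
```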
Examples
Café naïve résumé jalapeño → Cafe naive resume jalapeno
Frequently asked questions
- Are special letters like ß and æ converted?
- No. Only letters that can be decomposed into base + accent are simplified. ß stays as ß; æ stays as æ. To handle these, do a manual replace after the accent strip.
- Are non-Latin scripts affected?
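- Combining marks in any script are removed: Greek accents, Hebrew points, and Arabic harakat are all stripped, while the base letters of those scripts are preserved. In scripts where combining marks carry the vowels, such as Devanagari, this removes information the text needs to stay readable.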
- Is this the same as ASCII transliteration?
- No — accents are removed, but characters that aren't accented stay as-is. For full transliteration (Cyrillic, Chinese, etc. → ASCII), use a library like unidecode.
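The manual replace mentioned in the first answer could look like the sketch below. The mapping is an assumption and deliberately incomplete: the "right" replacement depends on language and convention (ø is sometimes written as "oe" rather than "o"), so treat SPECIAL_LETTERS and to_plain_latin as hypothetical names to adapt.

```python
import unicodedata

# Illustrative mapping only; extend or change it to suit your data.
SPECIAL_LETTERS = str.maketrans({
    "ß": "ss",
    "æ": "ae", "Æ": "AE",
    "ø": "o", "Ø": "O",
})


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop every combining mark (categories Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.category(ch).startswith("M"))


def to_plain_latin(text: str) -> str:
    """Strip accents first, then replace letters that have no decomposition."""
    return strip_accents(text).translate(SPECIAL_LETTERS)


print(to_plain_latin("Straße, Æther, smørrebrød"))  # Strasse, AEther, smorrebrod
```

Text in other scripts (Cyrillic, Chinese, and so on) passes through this sketch untouched; that case is what a transliteration library is for.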