Remove Non-ASCII Characters
Strip every character outside the ASCII range (0x00–0x7F). Leaves only printable ASCII and control codes.
About Remove Non-ASCII Characters
Non-ASCII removal strips any character with a code point above 0x7F, leaving only the original 128 ASCII characters: letters, digits, basic punctuation, and control codes. It's useful when piping text into a system that requires strictly 7-bit clean input.
When to use it
- Sanitizing text before storing in an ASCII-only database column
- Stripping emoji and other non-ASCII content for plain-text export
- Preparing input for a legacy system that doesn't handle Unicode
- Removing zero-width and combining characters in bulk
How it works
A regex matches any character outside the range U+0000–U+007F and removes it. Accented Latin letters, emoji, CJK, and combining marks all go.
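The regex approach described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual source; the names `NON_ASCII` and `remove_non_ascii` are made up for the example.

```python
import re

# Matches any single character outside U+0000–U+007F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def remove_non_ascii(text: str) -> str:
    """Delete every character whose code point is above 0x7F."""
    return NON_ASCII.sub("", text)


# "Café naïve 🎉 résumé", written with escapes so the accented letters are
# guaranteed to be single precomposed code points.
sample = "Caf\u00e9 na\u00efve \U0001F389 r\u00e9sum\u00e9"
print(remove_non_ascii(sample))  # Caf nave  rsum
```

Only the emoji itself is deleted, so the spaces on either side of it remain and the result contains a double space.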
Examples
Accented letters and emoji are removed, while the spaces on either side of the emoji stay. Run remove-accents first to preserve the base letters.
Café naïve 🎉 résumé
Caf nave  rsum
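To see why running an accent-removal step first preserves the letters, here is a hedged sketch that uses Unicode NFD decomposition as a stand-in for remove-accents. How the remove-accents tool is actually implemented is not stated here, so treat this as an assumption; the function name `strip_accents_then_non_ascii` is hypothetical. Decomposition only helps for letters that have a canonical decomposition, so characters such as ø or ß would still be dropped.

```python
import re
import unicodedata


def strip_accents_then_non_ascii(text: str) -> str:
    # NFD splits a precomposed letter such as é into e + U+0301 (combining acute).
    decomposed = unicodedata.normalize("NFD", text)
    # The ASCII base letters survive; combining marks and emoji are removed.
    return re.sub(r"[^\x00-\x7F]", "", decomposed)


# Same sample as the example above: "Café naïve 🎉 résumé"
sample = "Caf\u00e9 na\u00efve \U0001F389 r\u00e9sum\u00e9"
print(strip_accents_then_non_ascii(sample))  # Cafe naive  resume
```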
Frequently asked questions
- Will I lose accented letters?
- Yes — accented letters are above 0x7F and get removed. For a better result with accented Latin text, run remove-accents first to convert é → e, then this tool to strip any remaining non-ASCII.
- Are control characters preserved?
- Yes. ASCII control characters (U+0000–U+001F, plus DEL at U+007F) are inside the ASCII range and stay. To remove those too, follow up with a custom regex strip.
- Is this the same as remove-emojis?
- Stronger. This removes everything non-ASCII; remove-emojis removes only the pictograph code points. Use remove-emojis if you want to keep accented letters and CJK.
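For the control-character question above, a follow-up strip might look like the sketch below. Keeping tab, line feed, and carriage return is an assumption about what most plain-text workflows want; widen the character class if those should go too. The names `CONTROL_CHARS` and `remove_control_chars` are illustrative only.

```python
import re

# ASCII control characters are U+0000–U+001F plus DEL (U+007F).
# Assumption: tab (0x09), line feed (0x0A) and carriage return (0x0D) are kept.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]")


def remove_control_chars(text: str) -> str:
    """Delete ASCII control characters except tab, LF and CR."""
    return CONTROL_CHARS.sub("", text)


# NUL, DEL and ESC are removed; the tab and newline stay.
print(repr(remove_control_chars("a\x00b\tc\nd\x7f\x1b[0m")))  # 'ab\tc\nd[0m'
```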