HTML Entity Encoder
Escape characters that have special meaning in HTML: <, >, &, ", '. Safe-by-default.
About HTML Entity Encoder
HTML entity encoding replaces characters that have special meaning in HTML markup with their named or numeric entity equivalents. The five characters that must be escaped to keep text from breaking out of element content or attribute values are <, >, &, ", and '. This is the same escaping that auto-escaping template engines apply when printing user input.
When to use it
- Making user-supplied text safe to insert into HTML
- Escaping characters before storing in an XML or HTML document
- Preparing snippets of code for display inside <pre> blocks
- Avoiding XSS issues in server-rendered output
How it works
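A single-pass regex replaces each of the five reserved characters with its entity: & becomes &amp;amp;, < becomes &amp;lt;, > becomes &amp;gt;, " becomes &amp;quot;, and ' becomes &amp;#39;. The first four are named entities; the apostrophe uses a numeric reference. Other characters are left alone. The encoding is reversible via html-entity-decode.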
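As a rough illustration of the single-pass approach described above, here is a minimal Python sketch. The names encode_html_entities, _ENTITIES, and _RESERVED are made up for this example and are not taken from the tool's actual source.

```python
import re

# Mapping for the five reserved characters. The apostrophe uses the
# numeric reference &#39;, matching the FAQ on this page.
# (Illustrative names; not the tool's real implementation.)
_ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}

# One character class covering all five reserved characters.
_RESERVED = re.compile(r"[&<>\"']")


def encode_html_entities(text: str) -> str:
    """Replace each reserved character in a single pass; leave the rest alone."""
    return _RESERVED.sub(lambda m: _ENTITIES[m.group(0)], text)


print(encode_html_entities('<a href="x">Tom & Jerry\'s</a>'))
# prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Because every input character is matched at most once, the & inside an emitted entity is never encoded a second time. A chain of separate replacements would have to handle & first to avoid double encoding. In Python, the standard library's html.unescape reverses this output.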
Examples
Input: <script>alert("xss")</script>
Output: &amp;lt;script&amp;gt;alert(&amp;quot;xss&amp;quot;)&amp;lt;/script&amp;gt;
Frequently asked questions
- Are non-ASCII characters encoded?
- No — only the five reserved characters are touched. Modern HTML accepts UTF-8 directly, so accented letters and emoji don't need encoding. For numeric entity output, use the unicode-escape tool.
- Why is ' encoded as &amp;#39; instead of &amp;apos;?
- &amp;apos; comes from XML and XHTML and was not defined in HTML 4, so older HTML processors may not recognize it. &amp;#39; is the numeric reference, which works everywhere.
- Does this prevent all XSS?
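- It handles the most common cases: text placed inside element content and quoted attribute values. It does not cover other contexts such as URLs in href or src, inline scripts, or unquoted attributes. Defense in depth still matters: use a Content Security Policy and a templating engine that auto-escapes by default.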
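To make the last answer concrete, the sketch below shows one case entity encoding alone does not handle: a javascript: URL contains none of the five reserved characters, so it passes through unchanged. Python's standard html.escape is used here as a stand-in for this tool (it emits &#x27; rather than &#39; for apostrophes), and is_safe_link is a hypothetical helper invented for this example, not part of any library.

```python
from html import escape  # stdlib stand-in for the encoder on this page
from urllib.parse import urlsplit

untrusted_url = "javascript:alert(1)"
encoded = escape(untrusted_url, quote=True)

# None of <, >, &, ", ' occur in the value, so encoding changes nothing.
print(encoded == untrusted_url)  # prints: True


def is_safe_link(url: str) -> bool:
    """Hypothetical allow-list check: accept only absolute http(s) URLs."""
    return urlsplit(url).scheme in {"http", "https"}


print(is_safe_link(untrusted_url))           # prints: False
print(is_safe_link("https://example.com/"))  # prints: True
```

Placing the encoded value into an href would still yield a script-running link, which is why a context-specific check (here, a scheme allow-list) is needed in addition to entity encoding. This is a sketch under those assumptions, not a complete URL sanitizer.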