
Detect Language

Identify the language of a text sample. Uses statistical n-gram matching against profiles for roughly 80 of the world's most-spoken languages.

About Detect Language

Language detection works by counting character n-grams (short overlapping character sequences) in your text and comparing their distribution to profiles of known languages. This tool uses franc-min, which covers roughly 80 of the world's most-spoken languages and runs entirely in your browser, so your text is not sent to a server. Longer samples yield more accurate results; a single short word is often ambiguous because it contains too few n-grams to compare.
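To make the counting step concrete, here is a minimal sketch in Python. It is not franc-min's code: the function name `trigram_counts` and the choice to lowercase and space-pad the text are assumptions made for illustration.

```python
from collections import Counter

def trigram_counts(text):
    """Count overlapping 3-character sequences in a piece of text."""
    # Assumption for this sketch: lowercase, collapse whitespace, and pad with
    # one space on each side so that word boundaries produce trigrams too.
    cleaned = " " + " ".join(text.lower().split()) + " "
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))

counts = trigram_counts("the theme")
print(counts["the"])  # 2, once in "the" and once in "theme"
```

A real detector would compare a distribution like this against stored per-language profiles rather than inspect individual counts.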

When to use it

  • Routing user-submitted content to the right localization team
  • Tagging documents by language for archival
  • Pre-flighting input before sending to a language-specific NLP tool
  • Inspecting an unfamiliar text to identify what it is
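As a rough illustration of the routing use case, the sketch below maps a detected ISO 639-3 code to a destination queue. Everything in it is hypothetical: the queue names, the `route` function, and the 0.8 threshold are placeholders, not part of this tool.

```python
# Hypothetical mapping from ISO 639-3 codes to localization queues.
TEAM_BY_LANGUAGE = {
    "fra": "french-team",
    "deu": "german-team",
    "spa": "spanish-team",
}
FALLBACK_QUEUE = "manual-review"

def route(code, confidence, threshold=0.8):
    """Pick a queue for a detection result, falling back to manual review."""
    # Undetermined or low-confidence results should not be routed automatically.
    if code == "und" or confidence < threshold:
        return FALLBACK_QUEUE
    # Languages without a dedicated team also go to manual review.
    return TEAM_BY_LANGUAGE.get(code, FALLBACK_QUEUE)

print(route("fra", 0.95))  # french-team
print(route("fra", 0.40))  # manual-review
```

Sending uncertain results to a person, rather than trusting the top guess, is usually the safer default for short or mixed-language submissions.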

How it works

The text is analyzed by franc-min, which computes trigram (3-character) frequencies and compares them against profiles for ~80 of the most-spoken languages. The top candidates are returned with confidence scores; if confidence is too low (very short input or mixed languages), 'undetermined' is shown.
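The comparison step can be sketched as follows. This is an illustrative Python version of one common technique, the out-of-place rank distance between trigram rankings; franc-min's actual scoring may differ. The two tiny training snippets, the `MIN_LENGTH` cutoff of 10, and all function names are assumptions for this sketch. Real profiles are built from large corpora.

```python
from collections import Counter

MIN_LENGTH = 10  # assumed cutoff; shorter input is reported as undetermined

# Toy training text, for illustration only.
SAMPLES = {
    "eng": "the quick brown fox jumps over the lazy dog and then "
           "the cat sat on the mat with the other animals",
    "fra": "bonjour comment allez vous je vais bien merci et vous "
           "nous allons au marché avec les enfants",
}

def ranked_trigrams(text, top=300):
    """Return the text's trigrams ordered from most to least frequent."""
    cleaned = " " + " ".join(text.lower().split()) + " "
    counts = Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))
    return [gram for gram, _ in counts.most_common(top)]

PROFILES = {code: ranked_trigrams(sample) for code, sample in SAMPLES.items()}

def distance(sample_ranks, profile_ranks):
    """Normalized out-of-place distance: 0.0 is identical, 1.0 is no overlap."""
    position = {gram: rank for rank, gram in enumerate(profile_ranks)}
    worst = max(len(sample_ranks), len(profile_ranks))
    total = 0
    for rank, gram in enumerate(sample_ranks):
        # A trigram missing from the profile costs the maximum penalty.
        total += abs(rank - position[gram]) if gram in position else worst
    return total / (worst * len(sample_ranks))

def detect(text):
    """Return (code, score) pairs, best candidate first."""
    if len(text.strip()) < MIN_LENGTH:
        return [("und", 0.0)]
    sample = ranked_trigrams(text)
    scored = sorted((distance(sample, prof), code) for code, prof in PROFILES.items())
    return [(code, 1.0 - dist) for dist, code in scored]

print(detect("Bonjour, comment allez-vous?")[0][0])  # fra
print(detect("Hi")[0][0])                            # und
```

With only two toy profiles the scores are not meaningful as probabilities; they only rank the candidates relative to each other.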

Examples

Bonjour, comment allez-vous?
French (fra) — high confidence

Frequently asked questions

How long does the input need to be?
About 50 characters or more works reliably. Very short inputs, such as a single word, often produce 'undetermined' or a wrong guess, because they contain too few n-grams for the statistics to be meaningful.
Are all languages supported?
No. franc-min covers ~80 of the most-spoken languages. The standard 'franc' package covers close to 190 and 'franc-all' covers 400+, but both are considerably larger downloads; franc-min is enough for most real-world content.
Why ISO 639-3 codes?
ISO 639-3 (three-letter codes) covers far more languages than the older two-letter ISO 639-1, so every language the detector can return has a code. This tool shows both the code and the English language name.
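For readers who need two-letter codes downstream, a small lookup table is usually enough. The sketch below is an illustrative Python example with a handful of entries; the table and the `describe` function are assumptions, not this tool's internals. Note that some ISO 639-3 codes, such as 'cmn' (Mandarin Chinese), have no individual two-letter equivalent.

```python
# ISO 639-3 code -> (ISO 639-1 code or None, English name).
ISO_639 = {
    "eng": ("en", "English"),
    "fra": ("fr", "French"),
    "deu": ("de", "German"),
    "spa": ("es", "Spanish"),
    "cmn": (None, "Mandarin Chinese"),  # no individual two-letter code
    "und": (None, "Undetermined"),
}

def describe(code):
    """Format a code the way a results list might show it, e.g. 'French (fra)'."""
    _, name = ISO_639.get(code, (None, "Unknown"))
    return f"{name} ({code})"

def two_letter(code):
    """Return the ISO 639-1 code if one exists in the table, else None."""
    return ISO_639.get(code, (None, None))[0]

print(describe("fra"))    # French (fra)
print(two_letter("cmn"))  # None
```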
