Unique Word Count
Count distinct words and compute a vocabulary-richness ratio (unique / total).
About Unique Word Count
Unique word count tells you how many distinct words your text uses. The ratio of unique to total words, often called the type-token ratio, is a basic measure of lexical diversity. Writing that repeats a small core vocabulary has a low ratio; varied writing has a high one.
When to use it
- Measuring vocabulary diversity for an academic or creative piece
- Spotting overuse of a particular word
- Comparing two drafts' lexical richness
- Estimating how many distinct terms a translation will need to cover
How it works
The text is split into words (tokenized) and lowercased so that comparison is case-insensitive. Duplicates are removed with a Set, and the number of entries left in the Set is the unique count. The ratio is (unique words / total words) × 100.
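The steps above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the tool's actual code: the function name `uniqueWordStats`, the `WordStats` shape, and the ASCII-only tokenizer regex are all assumptions made for the example.

```typescript
// Hypothetical helper: the names and the tokenizer are assumptions for illustration.
interface WordStats {
  unique: number;
  total: number;
  ratioPercent: number;
}

function uniqueWordStats(text: string): WordStats {
  // Lowercase first so 'Cat' and 'cat' compare equal.
  // Assumed tokenizer: runs of ASCII letters, digits, and apostrophes.
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  const total = tokens.length;
  // A Set keeps one copy of each distinct token.
  const unique = new Set(tokens).size;
  // Guard against division by zero for empty input.
  const ratioPercent = total === 0 ? 0 : (unique / total) * 100;
  return { unique, total, ratioPercent };
}

const stats = uniqueWordStats("The cat sat on the mat. The cat slept.");
console.log(`Unique: ${stats.unique}, Total: ${stats.total}, Ratio: ${stats.ratioPercent.toFixed(1)}%`);
// prints "Unique: 6, Total: 9, Ratio: 66.7%"
```

A real tokenizer would also need to decide how to treat accented letters, hyphenated words, and numbers; the regex here sidesteps those choices to keep the sketch short.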
Examples
The cat sat on the mat. The cat slept.
Unique: 6, Total: 9, Ratio: 66.7% ('the' appears three times and 'cat' twice, so the 9 words collapse to 6 distinct ones)
Frequently asked questions
- Is the count case-sensitive?
- No. 'Cat' and 'cat' count as the same word. The text is lowercased before deduplication.
- Does it account for word forms?
- No. 'run' and 'runs' count as different words. For lemma-aware analysis (treating 'run', 'runs', 'running' as one), use a dedicated NLP tool.
- What ratio is 'good'?
- It depends on length. Short texts naturally reach higher ratios; long texts repeat function words like 'the' and 'and', which drags the ratio down. As a rough guide, technical writing sits around 30–50%, casual prose 40–60%, poetry 60–80%. Compare ratios only between texts of similar length.
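The length effect is easy to see with a deliberately extreme case: repeating a text adds words but no new vocabulary, so the ratio falls. The sketch below assumes a hypothetical `typeTokenRatio` helper and a simple ASCII-only tokenizer; neither is taken from the tool itself.

```typescript
// Hypothetical helper; the ASCII-only tokenizer is an assumption for illustration.
function typeTokenRatio(text: string): number {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  if (tokens.length === 0) return 0;
  return (new Set(tokens).size / tokens.length) * 100;
}

const sampleText = "The cat sat on the mat. The cat slept.";
const shortRatio = typeTokenRatio(sampleText);
// Doubling the text doubles the total but leaves the unique count unchanged.
const doubledRatio = typeTokenRatio(`${sampleText} ${sampleText}`);

console.log(`${shortRatio.toFixed(1)}%`);   // prints "66.7%"
console.log(`${doubledRatio.toFixed(1)}%`); // prints "33.3%"
```

Real long-form text does not repeat itself verbatim, so the drop is more gradual, but the direction is the same.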