
Extract Hashtags

Pull #hashtags out of social-media posts, captions, and other text. Unicode-aware.


About Extract Hashtags

The hashtag extractor scans text for #-prefixed tokens that look like hashtags. The matcher accepts Unicode letters, digits, and underscores after the # — so it works for English, Latin-extended, Cyrillic, CJK, and emoji-adjacent hashtags. Use 'Unique only' to deduplicate when scraping multiple posts.

When to use it

  • Collecting hashtags from a batch of social-media captions
  • Auditing a post for tag count and reach
  • Building a hashtag library from inspiration posts
  • Counting unique tags across a content campaign
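
For the last use case, counting tags across a campaign, a minimal sketch might look like the following. It reuses the pattern given under "How it works"; the function name countTags and the choice to fold case (so #Coffee and #coffee count as one tag) are assumptions for illustration, not behavior the tool is known to have.

```javascript
// Pattern quoted from this page; everything else here is a hypothetical helper.
const CAMPAIGN_TAG_PATTERN = /#[\p{L}\p{N}_]+/gu;

function countTags(captions) {
  const counts = new Map();
  for (const caption of captions) {
    // match() with a global regex returns null when nothing matches.
    for (const tag of caption.match(CAMPAIGN_TAG_PATTERN) ?? []) {
      // Assumption: tags are compared case-insensitively.
      const key = tag.toLowerCase();
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}

const counts = countTags(["#Coffee time #wfh", "more #coffee", "no tags here"]);
console.log(counts.size); // number of unique tags
```

A Map keeps tags in first-seen order, which is convenient when the result feeds a hashtag library.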

How it works

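The regex /#[\p{L}\p{N}_]+/gu matches a # followed by one or more Unicode letters, digits, or underscores. The u flag enables the \p{L} and \p{N} property escapes, and the g flag finds every occurrence rather than only the first. A match ends at the first character outside that set, such as whitespace, punctuation other than the underscore, or an emoji.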
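
The pattern above can be tried directly in JavaScript. This is a minimal sketch rather than the tool's actual source: the function name extractHashtags and the unique option (standing in for the 'Unique only' setting) are assumptions.

```javascript
// Pattern quoted from this page.
const HASHTAG_PATTERN = /#[\p{L}\p{N}_]+/gu;

// Hypothetical helper: returns every match, optionally deduplicated.
function extractHashtags(text, { unique = false } = {}) {
  // match() with a global regex returns all matches, or null when there are none.
  const tags = text.match(HASHTAG_PATTERN) ?? [];
  // A Set drops repeats and preserves first-seen order.
  return unique ? [...new Set(tags)] : tags;
}

console.log(extractHashtags("Loving this! #coffee #morning #wfh #日本"));
// → [ '#coffee', '#morning', '#wfh', '#日本' ]

console.log(extractHashtags("#a #b #a", { unique: true }));
// → [ '#a', '#b' ]
```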

Examples

Input:
Loving this! #coffee #morning #wfh #日本

Output:
#coffee
#morning
#wfh
#日本

Frequently asked questions

Are emoji included in the tag?
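Generally no. Emoji are in neither \p{L} nor \p{N}, so the match stops at the first one: a tag like #love❤️ matches only #love. A tag that starts with an emoji right after the # is not matched at all.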
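
To see the emoji behavior concretely, here is a small sketch using the page's pattern. The helper name tagsIn is made up for illustration.

```javascript
// Pattern quoted from this page.
const EMOJI_DEMO_PATTERN = /#[\p{L}\p{N}_]+/gu;

// Hypothetical helper that returns all matches as an array.
function tagsIn(text) {
  return text.match(EMOJI_DEMO_PATTERN) ?? [];
}

// The trailing emoji are cut off; "#🎉party" yields nothing because
// the character right after # is not a letter, digit, or underscore.
console.log(tagsIn("#love❤️ #🎉party #2024🎉"));
// → [ '#love', '#2024' ]
```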
Does it handle non-Latin scripts?
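Mostly. The Unicode property \p{L} covers letters in every script, so Japanese, Arabic, Cyrillic, and similar tags are matched. One caveat: combining marks belong to \p{M}, not \p{L}, so a tag in a script that relies on them (Devanagari or Thai vowel signs, for example), or one typed with decomposed accents, can be cut short at the first mark.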
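
Combining marks (category \p{M}) are worth checking if you work with scripts that use them. The sketch below compares the page's pattern with a variant that also accepts \p{M}; the variant and both helper names are assumptions for illustration, not something the tool is known to offer.

```javascript
// Pattern quoted from this page.
const PAGE_PATTERN = /#[\p{L}\p{N}_]+/gu;
// Hypothetical variant that also accepts combining marks.
const MARK_AWARE_PATTERN = /#[\p{L}\p{M}\p{N}_]+/gu;

function pageMatch(text) {
  return text.match(PAGE_PATTERN) ?? [];
}

function markAwareMatch(text) {
  return text.match(MARK_AWARE_PATTERN) ?? [];
}

// "#हिंदी" written with escapes so the code points are explicit:
// letter, vowel sign (mark), anusvara (mark), letter, vowel sign (mark).
const hindi = "#\u0939\u093F\u0902\u0926\u0940";
console.log(pageMatch(hindi));      // stops after the first letter
console.log(markAwareMatch(hindi)); // keeps the whole tag

// A decomposed accent (e + U+0301) is also a mark. Normalizing to NFC
// first turns it into a single letter that the page's pattern accepts.
const cafe = "#cafe\u0301";
console.log(pageMatch(cafe));
console.log(pageMatch(cafe.normalize("NFC")));
```

Normalizing input to NFC before matching helps with accented Latin text, but it does not help scripts whose marks have no precomposed form, which is where a mark-aware pattern would matter.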
What about Twitter/X hashtag rules specifically?
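X allows letters, digits, and underscores, and the extractor accepts the same character set. The results are not guaranteed to be identical, though: X does not treat a number-only token such as #2024 as a hashtag, and it ignores a # that directly follows a letter or digit, while the pattern used here matches both. Treat the output as a close approximation of what X parses out of a post.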
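
If you need output closer to X's behavior, a stricter pattern is one option. The sketch below encodes two rules as assumptions about X's parsing (a tag must contain at least one letter, and the # must not directly follow a letter, digit, or underscore); it has not been verified against X's own parser, and the names are hypothetical.

```javascript
// Assumed X-like rules, layered on the page's character set:
//   (?<![\p{L}\p{N}_])    the # must not follow a letter, digit, or underscore
//   (?=[\p{N}_]*\p{L})    the tag must contain at least one letter
const X_LIKE_PATTERN = /(?<![\p{L}\p{N}_])#(?=[\p{N}_]*\p{L})[\p{L}\p{N}_]+/gu;

function extractXLikeHashtags(text) {
  return text.match(X_LIKE_PATTERN) ?? [];
}

console.log(
  extractXLikeHashtags("Top #100 picks for #web3, mail a#b, see #_ and #2024goals")
);
// → [ '#web3', '#2024goals' ]
```

The lookbehind requires a reasonably modern JavaScript engine; on older runtimes the preceding-character check would need to be done in code instead.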
