Extract URLs
Pull every http:// and https:// URL out of a block of text. Unique-only and sorted output options.
About Extract URLs
The URL extractor scans your text for http:// and https:// links and returns them as a list. Trailing punctuation (periods, commas, closing brackets) is left out of each match, so a URL at the end of a sentence or inside parentheses comes back clean. Common use cases: pulling links out of email digests, log files, or pasted webpages.
When to use it
- Collecting all links from a copy-pasted webpage
- Pulling URLs out of a chat transcript or email digest
- Auditing a document for outbound links
- Building a list of references from a research note
How it works
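The regex /\bhttps?:\/\/[^\s<>"'`]+[^\s<>"'`.,;:!?)\]}]/g matches the scheme (http or https) followed by ://, then one or more characters that are not whitespace, angle brackets, or quote characters. The final character class additionally rejects . , ; : ! ? ) ] and }, so the match backtracks past any trailing punctuation. One consequence: a URL that legitimately ends in one of those characters, such as a closing parenthesis, is returned without it. Results are emitted in document order; toggle 'Unique only' to deduplicate.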
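The behavior described above can be sketched in a few lines of JavaScript. This is a minimal sketch and not the tool's actual source: the pattern is the one quoted above, while the function name extractUrls and the option names uniqueOnly and sorted are hypothetical stand-ins for the tool's toggles.

```javascript
// Pattern quoted from the text above.
const URL_PATTERN = /\bhttps?:\/\/[^\s<>"'`]+[^\s<>"'`.,;:!?)\]}]/g;

// Hypothetical helper; the option names are assumptions, not the tool's real API.
function extractUrls(text, { uniqueOnly = false, sorted = false } = {}) {
  // With a /g regex, String.prototype.match returns every match in
  // document order, or null when nothing matches.
  let urls = text.match(URL_PATTERN) ?? [];

  if (uniqueOnly) {
    // A Set keeps insertion order, so the first occurrence of each URL wins.
    urls = [...new Set(urls)];
  }
  if (sorted) {
    // Sort a copy; the default comparison is by UTF-16 code units.
    urls = [...urls].sort();
  }
  return urls;
}

console.log(extractUrls("See https://example.com/page, or visit http://test.org."));
// [ 'https://example.com/page', 'http://test.org' ]
```

Deduplicating before sorting keeps the work proportional to the number of distinct URLs; with both options off, the list is simply the matches in the order they appear.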
Examples
Trailing commas and periods are stripped
See https://example.com/page, or visit http://test.org.
https://example.com/page
http://test.org
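To see how the pattern from "How it works" treats other kinds of trailing punctuation, the short sketch below runs it over a few extra inputs. The helper name findUrls and the sample sentences are made up for illustration; only the regex comes from this page.

```javascript
// Pattern quoted from "How it works".
const URL_PATTERN = /\bhttps?:\/\/[^\s<>"'`]+[^\s<>"'`.,;:!?)\]}]/g;

// Hypothetical helper: returns all matches, or an empty array if there are none.
const findUrls = (text) => text.match(URL_PATTERN) ?? [];

// A URL wrapped in parentheses: the closing ")" is dropped.
console.log(findUrls("(details at https://example.com/docs)"));
// [ 'https://example.com/docs' ]

// A "?" inside the URL is kept; only the final "?" is dropped.
console.log(findUrls("Is it https://example.com/a?b=1?"));
// [ 'https://example.com/a?b=1' ]

// Caveat: a URL that really ends in ")" loses that character too.
console.log(findUrls("https://example.com/wiki/Foo_(bar)"));
// [ 'https://example.com/wiki/Foo_(bar' ]
```

The last case is the trade-off of stripping punctuation by shape alone: the pattern cannot tell a closing parenthesis that belongs to the URL from one that belongs to the surrounding sentence.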
Frequently asked questions
- Are URLs without a scheme matched?
- No. The pattern requires http:// or https://. Schemeless URLs like example.com or www.example.com aren't matched — they're hard to distinguish from regular text without false positives.
- What about ftp://, mailto:, file://, etc.?
- Only http:// and https:// are matched. Other schemes turn up less often in everyday text, so they are left out; adjust the scheme part of the regex (https?) manually if you need them.
- Are URLs validated?
- Only by shape. The tool doesn't fetch or verify reachability. Use a link-checker for that.
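For the question above about ftp://, file://, and other schemes, one possible way to adjust the regex is to build it from a list of scheme names. This is a sketch under assumptions: buildUrlPattern is a hypothetical helper, not something the tool provides, and it only covers schemes written with ://. A scheme like mailto:, which has no // after the colon, would need a separate pattern.

```javascript
// Hypothetical helper: builds a variant of the page's pattern for a
// custom list of schemes that use the "scheme://" form.
function buildUrlPattern(schemes = ["http", "https"]) {
  // Escape regex metacharacters so a scheme such as "svn+ssh" matches literally.
  const alternatives = schemes
    .map((s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("|");

  // Same body and trailing-punctuation class as the original regex.
  const body = "[^\\s<>\"'`]+[^\\s<>\"'`.,;:!?)\\]}]";
  return new RegExp("\\b(?:" + alternatives + ")://" + body, "g");
}

const pattern = buildUrlPattern(["http", "https", "ftp"]);
const text = "Get ftp://files.example.com/a.zip, or https://example.com.";
console.log(text.match(pattern));
// [ 'ftp://files.example.com/a.zip', 'https://example.com' ]
```

Called with no arguments, the helper matches only http and https, the same schemes as the original pattern.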