Photon
Incredibly fast crawler designed for OSINT.
A fast Python web crawler built for OSINT that automatically extracts URLs, email addresses, API keys, JavaScript endpoints, subdomains, and DNS data from a target website.
Photon is a fast web crawler written in Python, built for OSINT (open-source intelligence). You point it at a website, and it follows links automatically, collecting information that may be useful for security research or reconnaissance.
The tool extracts a wide range of data during a crawl: URLs (both on the target site and linked externally), URLs that contain query parameters, email addresses and social media account references, files such as PDFs and images, API keys or authentication tokens left exposed in page source, JavaScript files and the API endpoints buried inside them, subdomain references, and DNS-related data. Results are saved in an organized folder structure and can also be exported as JSON.
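To make the categories above concrete, here is a minimal sketch of how links and email addresses in one page could be sorted into buckets. It is not Photon's own code: the function name `extract_intel`, the bucket names, and the simplified regular expressions are assumptions chosen for illustration.

```python
import re
from urllib.parse import urljoin, urlparse

# Simplified patterns for illustration only; real pages need more robust parsing.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r"""href=["']([^"'#]+)["']""", re.IGNORECASE)


def extract_intel(html, base_url):
    """Sort data found in `html` into internal, external, parameterized URLs, and emails.

    Hypothetical helper: the bucket names are not taken from Photon.
    """
    base_host = urlparse(base_url).netloc
    result = {"internal": set(), "external": set(), "with_params": set(), "emails": set()}
    for href in HREF_RE.findall(html):
        url = urljoin(base_url, href)  # resolve relative links against the page URL
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        bucket = "internal" if parsed.netloc == base_host else "external"
        result[bucket].add(url)
        if parsed.query:
            result["with_params"].add(url)  # URLs that carry query parameters
    result["emails"].update(EMAIL_RE.findall(html))
    return result
```

Sets are used so that a link or address repeated across the page is recorded only once.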
The --wayback option pulls historical URLs from archive.org and uses them as starting points for the crawl. Because these pages were captured in the past, this sometimes surfaces content that is no longer linked from the live site.
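As a rough illustration of where such historical URLs can come from, the sketch below builds a query for the Wayback Machine's public CDX endpoint and turns its JSON response into a de-duplicated seed list. This is an assumption about one workable approach, not necessarily the request Photon itself sends; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query URL that lists archived URLs under `domain`."""
    params = {
        "url": domain + "/*",   # every captured path under the domain
        "output": "json",       # JSON rows instead of plain text
        "fl": "original",       # return only the original URL field
        "collapse": "urlkey",   # collapse repeated captures of the same URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_response(body):
    """Return unique archived URLs, in order, from a CDX JSON response body."""
    rows = json.loads(body) if body.strip() else []
    seeds, seen = [], set()
    for row in rows[1:]:  # with output=json the first row holds the field names
        url = row[0]
        if url not in seen:
            seen.add(url)
            seeds.append(url)
    return seeds


def fetch_archived_urls(domain, timeout=10):
    """Fetch and parse archived URLs for `domain` (performs a network request)."""
    with urlopen(build_cdx_query(domain), timeout=timeout) as response:
        return parse_cdx_response(response.read().decode("utf-8"))
```

Keeping query construction and response parsing separate from the network call makes both easy to check offline; only `fetch_archived_urls` touches the network.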
Several plugins extend the core functionality: one integrates with dnsdumpster for DNS data, another connects to the Wayback Machine, and an exporter handles formatted output. The tool can also be run inside a Docker container if you prefer not to install its dependencies directly.
Photon can be configured in detail: you can set timeouts, add artificial delays between requests, provide starting seed URLs, or filter out URLs that match a pattern you want to skip. The project is licensed under GPL v3.
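The effect of two of these settings, skipping URLs that match a pattern and pausing between requests, can be sketched in a few lines. This is a simplified illustration rather than Photon's implementation: `filter_urls`, `crawl_with_delay`, and the `fetch` callable are hypothetical names, and the pattern is assumed to be a regular expression.

```python
import re
import time


def filter_urls(urls, exclude_pattern=None):
    """Drop every URL that matches `exclude_pattern` (assumed to be a regex)."""
    if not exclude_pattern:
        return list(urls)
    rx = re.compile(exclude_pattern)
    return [u for u in urls if not rx.search(u)]


def crawl_with_delay(urls, fetch, delay=0.0):
    """Call `fetch(url)` for each URL, sleeping `delay` seconds between requests.

    `fetch` is a hypothetical callable supplied by the caller, for example a
    wrapper around an HTTP client that also enforces a request timeout.
    """
    results = {}
    for index, url in enumerate(urls):
        if index and delay:
            time.sleep(delay)  # artificial pause before every request after the first
        results[url] = fetch(url)
    return results
```

A delay lowers the load placed on the target and makes the crawl less conspicuous, at the cost of a longer run.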
Where it fits
- Crawl a target website to automatically collect email addresses and external links for OSINT research.
- Use the --wayback option to pull historical URLs from archive.org and find pages no longer linked on the live site.
- Scan JavaScript files on a site to extract hidden API endpoints and exposed authentication tokens.
- Run Photon in Docker to collect DNS and subdomain data from a target domain without installing dependencies.