gitmyhub

pyWhat

Python ★ 7.2k updated 2y ago

🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

A Python command-line tool that scans any text, file, or directory and identifies what kinds of structured data it contains: emails, API keys, wallet addresses, IP addresses, and dozens of other patterns, in seconds.

Python · setup: easy · complexity 1/5

pyWhat is a Python command-line tool that takes any text, file, or directory and tells you what kinds of structured data it contains. Give it a string you do not recognize and it will tell you whether it looks like a cryptocurrency wallet address, an email, an IP address, a credit card number, a social security number, a URL, an API key, or one of dozens of other patterns. Give it a file and it scans the contents for the same patterns. Give it a directory and it walks through every file recursively.
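The idea of matching text, files, and directories against a table of named patterns can be illustrated in a few lines of Python. This is a minimal sketch and not pyWhat's actual implementation: the three regular expressions and the names PATTERNS, identify_text, and identify_path are assumptions made up for illustration, and the real project ships a far larger pattern database.

```python
import re
from pathlib import Path

# Hypothetical, deliberately simplified patterns (not pyWhat's own database).
PATTERNS = {
    "Email Address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPv4 Address": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "URL": re.compile(r"https?://[^\s\"'<>]+"),
}


def identify_text(text):
    """Return (pattern name, matched text) pairs for every match in text."""
    found = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((name, match.group(0)))
    return found


def identify_path(path):
    """Scan a single file, or every file below a directory, recursively."""
    root = Path(path)
    if root.is_file():
        files = [root]
    else:
        files = sorted(p for p in root.rglob("*") if p.is_file())
    results = {}
    for file in files:
        # Undecodable bytes are dropped so binary files do not raise.
        text = file.read_text(encoding="utf-8", errors="ignore")
        matches = identify_text(text)
        if matches:
            results[str(file)] = matches
    return results
```

Calling identify_text on a string returns the matches grouped by pattern order, while identify_path(".") returns a dictionary keyed by file path, containing only files with at least one match.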

One common use case shown in the README is analyzing network capture files (.pcap), which are recordings of network traffic captured during an investigation. Instead of reading through raw packet data manually, you can run pyWhat on the file and get back a categorized list of every email address, URL, phone number, and credential it finds, in seconds rather than minutes.
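One way to picture scanning a binary capture is to pull out runs of printable characters, much like the Unix strings utility, and match patterns against those runs. The sketch below does that for email addresses only. It is an illustration under stated assumptions: how pyWhat itself reads capture files is not described here, and the names PRINTABLE_RUN, EMAIL, and emails_in_capture are hypothetical.

```python
import re

# Runs of at least six printable ASCII bytes, similar to the `strings` utility.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# Simplified email pattern, assumed for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_in_capture(raw):
    """Return the sorted, de-duplicated email addresses found in raw bytes."""
    found = set()
    for run in PRINTABLE_RUN.finditer(raw):
        # Every byte in the run is printable ASCII, so decoding cannot fail.
        text = run.group(0).decode("ascii")
        found.update(EMAIL.findall(text))
    return sorted(found)
```

For a real capture the bytes would come from something like Path("capture.pcap").read_bytes(), where the file name is a placeholder. Payloads that are encrypted or compressed would not surface this way.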

Another use case is security research and bug bounty hunting. If you download a public repository or a copy of a website and want to check whether it accidentally exposes API keys, webhook URLs, or other secrets, pyWhat can scan the files and flag anything that matches known patterns for those data types. You can filter the output to show only results tagged as relevant to bug bounties, cryptocurrencies, AWS credentials, or any other category.

The tool supports filtering by both category tags and a rarity score, which lets you narrow results to either common or unusual matches. Results can be sorted, and output can be exported as JSON for further processing. There is also a Python API for embedding pyWhat in other tools.
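Filtering by tag and rarity, sorting, and JSON export can be sketched as follows. This is a simplified model, not pyWhat's API or data: the Pattern class, the three example patterns, their tags and rarity values, and the identify and to_json functions are all assumptions. In this sketch a higher rarity means a more specific pattern.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    regex: str
    tags: frozenset
    rarity: float  # assumed scale: 0 = very common, 1 = very specific


# Hypothetical pattern table; names, tags, and scores are illustrative only.
PATTERNS = [
    Pattern("Email Address", r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
            frozenset({"Identifiers"}), 0.5),
    Pattern("AWS Access Key ID", r"\bAKIA[0-9A-Z]{16}\b",
            frozenset({"Credentials", "AWS", "Bug Bounty"}), 1.0),
    Pattern("Phone-like Number", r"\b\d{7,12}\b",
            frozenset({"Identifiers"}), 0.1),
]


def identify(text, tags=None, min_rarity=0.0, max_rarity=1.0):
    """Return matches filtered by tag and rarity, most specific first."""
    wanted = set(tags) if tags else None
    matches = []
    for pattern in PATTERNS:
        if not (min_rarity <= pattern.rarity <= max_rarity):
            continue
        # Keep the pattern only if it shares at least one requested tag.
        if wanted is not None and not (wanted & pattern.tags):
            continue
        for match in re.finditer(pattern.regex, text):
            matches.append({
                "name": pattern.name,
                "matched": match.group(0),
                "rarity": pattern.rarity,
                "tags": sorted(pattern.tags),
            })
    matches.sort(key=lambda m: m["rarity"], reverse=True)
    return matches


def to_json(matches):
    """Serialize matches for further processing by other tools."""
    return json.dumps(matches, indent=2)
```

Passing tags={"Bug Bounty"} keeps only patterns carrying that tag, and raising min_rarity drops the noisy, low-specificity matches. The JSON output round-trips through json.loads because tags are stored as sorted lists rather than sets.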

Installation is a single pip command (pip3 install pywhat). The project is written in Python and welcomes contributions, particularly new regex patterns for recognizing additional types of data.

Where it fits