Probable-Wordlists
Version 2 is live! Wordlists sorted by probability originally created for password generation and testing - make sure your passwords aren't popular!
A huge collection of real-world passwords sorted by how common they are, built from billions of leaked credentials. It is useful for security research and for checking that your own password is not one that millions of other people use.
The wordlists in this repository are ordered by how common each password actually is, rather than alphabetically. The core idea is simple: if you know which passwords millions of real people are using, you can make sure yours is not one of them. The project was built from publicly available password leaks and is intended strictly for lawful, ethical, and educational purposes.
The author spent the better part of a year gathering nearly 1,600 files, totaling more than 350 gigabytes of leaked credentials, from sources such as SecLists, Weakpass, and Hashes.org. Each file was cleaned and deduplicated internally, and then all of them were combined into one large source. A password had to appear in at least five of the source files to make the final cut, and the number of times it appeared across all sources was treated as a measure of its popularity. The final output covers roughly two billion real passwords, sorted from most to least common.
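The selection and ranking steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual tooling: the function name `rank_by_source_count`, the in-memory `Counter`, and the alphabetical tie-break are assumptions made for the example, and a real run over hundreds of gigabytes would need a streaming or on-disk approach.

```python
from collections import Counter
from typing import Iterable, List

# Threshold taken from the description above: at least five source files.
MIN_SOURCES = 5


def rank_by_source_count(
    sources: Iterable[Iterable[str]], min_sources: int = MIN_SOURCES
) -> List[str]:
    """Return passwords seen in at least `min_sources` sources, most common first."""
    counts: Counter = Counter()
    for source in sources:
        # Deduplicate within each source so one file adds at most one count.
        counts.update(set(source))
    kept = [(pw, n) for pw, n in counts.items() if n >= min_sources]
    # Descending count; ties broken alphabetically (an assumption, for determinism).
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [pw for pw, _ in kept]


if __name__ == "__main__":
    # Toy data standing in for three source files; threshold lowered to 2.
    toy_sources = [
        ["123456", "password", "123456"],
        ["123456", "qwerty"],
        ["password", "123456"],
    ]
    print(rank_by_source_count(toy_sources, min_sources=2))
    # ['123456', 'password']
```

In the toy run, "qwerty" is dropped because it appears in only one source, which mirrors how the five-file threshold filters out rare entries.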
The repository is organized into three main sections. The first is Real-Passwords, which contains actual leaked passwords, including a subfolder of entries between 8 and 40 characters long that is useful for wireless network testing. The second is Dictionary-Style Lists, which includes general-purpose word collections, common usernames, top-level domains, and other technically useful entries. The third is Analysis Files, which holds HashCat rules and character masks generated using the PACK project, useful for people doing password recovery or security research.
This is not a code project. The repository does not contain software to run but rather serves as a reference library of files you can download selectively. A full clone is not recommended because of the size involved; the project includes a separate downloads page that helps you get only what you need.
The project has been referenced in published books on password cracking, mentioned on the Security Now podcast, and cited by tools such as Aircrack-ng and L0phtCrack. Its main value is educational: it makes visible the patterns that weak passwords follow, so anyone who wants to pick a genuinely hard-to-guess password can see what to avoid.
Where it fits
- Check if your password pattern is dangerously common by looking it up in the frequency-sorted lists
- Use the 8-to-40-character wireless subset as a wordlist for authorized Wi-Fi security testing with Aircrack-ng
- Apply the included HashCat rules and character masks to authorized password recovery research
- Study which password patterns appear most often to teach others how to pick genuinely strong passwords
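
The first two uses above can be sketched in Python as follows. This is a rough illustration under stated assumptions: the helper names `find_rank` and `wireless_subset` are hypothetical, and the sketch assumes a list with one password per line, most common first.

```python
from typing import Iterable, Iterator, Optional


def find_rank(candidate: str, lines: Iterable[str]) -> Optional[int]:
    """Return the 1-based position of `candidate` in a frequency-sorted list, or None."""
    for rank, line in enumerate(lines, start=1):
        # Strip only line endings so passwords with spaces stay intact.
        if line.rstrip("\r\n") == candidate:
            return rank
    return None


def wireless_subset(
    lines: Iterable[str], min_len: int = 8, max_len: int = 40
) -> Iterator[str]:
    """Yield only entries whose length falls in the 8-to-40-character range."""
    for line in lines:
        entry = line.rstrip("\r\n")
        if min_len <= len(entry) <= max_len:
            yield entry


if __name__ == "__main__":
    # Toy lines standing in for a real list file.
    sample = ["123456\n", "password\n", "correct horse battery\n"]
    print(find_rank("password", sample))   # 2
    print(list(wireless_subset(sample)))   # ['password', 'correct horse battery']
```

Both helpers accept any iterable of lines, so an open file handle can be passed directly and the list is never loaded into memory at once. A low rank from `find_rank` means the password is among the most common; `None` means only that it is absent from that particular list, not that it is strong.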