If you work in either forensics or penetration testing, you will absolutely come across the need to create a custom word list. You may be thinking to yourself that a custom word list is not needed because you have a number of lists that you have created or gathered over the years. I will not argue that having a bag of lists is unnecessary, because I have my own collection as well. I submit to you, however, that if you have a specific target, then understanding said target will be useful when it comes to password cracking.
For example, if your target is a big Simpsons fan, then it makes sense to create a word list built from keywords common among those fans. By taking this approach, you may find that you spend less time cracking a password; at least, that is the idea. Of course, this means you must know the target somewhat well. To gather this type of intelligence, all you need to do is turn to social media or any other internet resource. Those of you who work in security or worry about privacy already understand that individuals typically share far too much detail about themselves publicly.
Running SmeegeScrape
Jump over to GitHub and grab SmeegeScrape. The beauty of SmeegeScrape is that it is a Python script, so there is no need for an install or complex configuration. Just give the help a look and determine what it is that you need to accomplish.
    usage: SmeegeScrape.py [-h] (-f LOCALFILE | -d FILEDIRECTORY | -u WEBURL | -l WEBLIST)
                           [-r] [-o OUTPUTFILE] [-i] [-s] [-n]
                           [-min MINLENGTH] [-max MAXLENGTH]

    optional arguments:
      -h, --help        show this help message and exit
      -f LOCALFILE      Specify a local file to scrape.
      -d FILEDIRECTORY  Specify a directory to scrape all supported files.
      -u WEBURL         Specify a url to scrape page content (correct format: http(s)://www.smeegesec.com)
      -l WEBLIST        Specify a text file with a list of URLs to scrape (separated by newline).

    paramters and options:
      -r                Scan directories recursively (only applies when used with -d)
      -o OUTPUTFILE     Output filename. (Default: smeegescrape_out.txt)
      -i                Remove integers [0-9] from the final output.
      -s                Remove special characters (only alphanum allowed) from the final output.
      -n                Remove all non-printable ASCII characters from the final output.
      -min MINLENGTH    Set the minimum number of characters for each word (Default: 3).
      -max MAXLENGTH    Set the maximum number of characters for each word (Default: 30).
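To get a feel for what the filtering options in that help text do to the final list, here is a minimal Python sketch. It is not SmeegeScrape's actual code. The function name `build_wordlist` and its parameters are hypothetical, chosen to mirror `-min`, `-max`, `-i`, `-s`, and `-n`, and it assumes words are split on whitespace.

```python
import re
import string


def build_wordlist(text, min_len=3, max_len=30,
                   strip_digits=False, alnum_only=False, printable_only=False):
    """Return sorted unique words from text (hypothetical helper, not SmeegeScrape's code)."""
    words = set()
    for token in text.split():  # assumption: words are separated by whitespace
        if printable_only:      # roughly what -n describes
            token = "".join(ch for ch in token if ch in string.printable)
        if strip_digits:        # roughly what -i describes
            token = re.sub(r"[0-9]", "", token)
        if alnum_only:          # roughly what -s describes
            token = re.sub(r"[^A-Za-z0-9]", "", token)
        if min_len <= len(token) <= max_len:  # -min / -max, defaults 3 and 30
            words.add(token)
    return sorted(words)


sample = "Homer loves d0nuts, Duff & Springfield! D'oh"
print(build_wordlist(sample, min_len=4, strip_digits=True, alnum_only=True))
# → ['Duff', 'Homer', 'Springfield', 'dnuts', 'loves']
```

Notice that stripping integers turns "d0nuts" into "dnuts", which may or may not be what you want for a password list, so it is worth thinking about which filters you enable.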
For my example, I am going to generate a word list from an unnamed source, and you are free to download and use the result.
My word list resulted in 803,486 words, which in itself is not too bad. Of course, you can target multiple resources and possibly end up with millions of words.
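If you do scrape several resources, you will probably want to combine the results into one list without duplicates. A minimal sketch of that step in Python follows. The function name `merge_wordlists` and the sample words are made up for illustration, and the sketch assumes each source is simply an iterable of words, such as a list or an open text file.

```python
def merge_wordlists(*wordlists):
    """Combine several iterables of words into one list, dropping duplicates.

    Hypothetical helper: keeps the order in which words are first seen.
    """
    seen = set()
    merged = []
    for wordlist in wordlists:
        for word in wordlist:
            word = word.strip()  # drop trailing newlines when reading from files
            if word and word not in seen:
                seen.add(word)
                merged.append(word)
    return merged


site_a = ["Springfield", "Duff", "donut"]
site_b = ["Duff", "Krusty", "donut\n"]
print(merge_wordlists(site_a, site_b))
# → ['Springfield', 'Duff', 'donut', 'Krusty']
```

Because open file objects yield one line at a time, the same function should also work when you pass it the output files from separate scrapes.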
The point here is that there is always a tool for the job, and SmeegeScrape is one such tool. I would be interested in hearing from the community about which tools they prefer for generating a custom word list.