Channel: Radical Development » Penetration Testing

Metagoofil makes metadata extraction easy

Metagoofil is an information-gathering tool designed to extract metadata from public documents (pdf, doc, xls, ppt, docx, pptx, xlsx) belonging to a given target website. The tool runs a Google search to identify the documents, downloads them to local disk, and then extracts the metadata with libraries such as Hachoir and PdfMiner. From the results it generates a report listing usernames, software versions, and server or machine names, all of which help penetration testers in the information-gathering phase.

metadata

Metadata serves five purposes: resource description; information retrieval; management of information; rights management, ownership and authenticity; and interoperability and e-commerce. I can think of no better summary than Wikipedia’s explanation, which defines metadata as “data about data”. The simplest way to think about metadata is to look at the file properties of an electronic document. Here I will use the example of a PowerPoint file, which provides a wealth of additional information as long as you know where to look. Consider the following example, which contains a title, subject, author, manager, and keywords.
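To make the idea of file properties concrete, here is a minimal sketch of reading them without opening PowerPoint. It assumes a modern Office file (pptx, docx, xlsx), which is a zip archive whose core properties live in docProps/core.xml. The function name read_core_properties and the in-memory sample document are illustrative assumptions, not part of Metagoofil, and the manager field is left out because it is stored in a separate part of the archive.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Property name -> element path inside docProps/core.xml.
FIELDS = {
    "title": "dc:title",
    "subject": "dc:subject",
    "author": "dc:creator",
    "keywords": "cp:keywords",
}


def read_core_properties(source):
    """Return selected core properties from a pptx/docx/xlsx path or file object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        name: root.findtext(path, default="", namespaces=NS)
        for name, path in FIELDS.items()
    }


def build_sample_document():
    """Build a tiny in-memory archive that mimics an Office file (sample data only)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<cp:coreProperties"
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>Quarterly Review</dc:title>"
        "<dc:subject>Sales</dc:subject>"
        "<dc:creator>jdoe</dc:creator>"
        "<cp:keywords>internal, draft</cp:keywords>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, value in read_core_properties(build_sample_document()).items():
        print(f"{name}: {value}")
```

Passing the path of a real presentation instead of the sample buffer should work the same way, provided the file actually contains a docProps/core.xml part.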

PowerPoint metadata

Go grab Metagoofil and get busy

Before you get too far ahead of yourself, you need to understand that this type of intelligence gathering is active rather than passive. That said, jump over and grab the source code for Metagoofil. If you are running OS X, Subversion ships with the operating system as one of its many bundled open source projects. Open up Terminal and run the following command:

svn checkout http://metagoofil.googlecode.com/svn/trunk/ metagoofil-read-only

Otherwise you will need to install a Subversion client that fits your needs and environment.

At this point you are ready to employ the capability provided by Metagoofil. There are seven options that you may use in order to accomplish the task at hand. These options are:

  • -d: domain to search
  • -t: filetype to download (pdf,doc,xls,ppt,odp,ods,docx,xlsx,pptx)
  • -l: limit of results to search (default 200)
  • -h: work with documents in directory (use “yes” for local analysis)
  • -n: limit of files to download
  • -o: working directory (location to save downloaded files)
  • -f: output file

Now you need to select your target and options.

metagoofil.py -d microsoft.com -t pdf -l 25 -n 25 -o msdndata -f msdnreport.html
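If you plan to run the same search for several targets or file types, it can help to assemble the command programmatically. The following is only a sketch: the helper name build_metagoofil_command is hypothetical, it uses just the flags listed above, and it assumes metagoofil.py can be invoked by that name from the current directory.

```python
import shlex


def build_metagoofil_command(domain, filetype, search_limit, download_limit,
                             workdir, report, script="metagoofil.py"):
    """Return the argument list for one Metagoofil run (hypothetical helper).

    Only the options described in this article are used:
    -d domain, -t filetype, -l search limit, -n download limit,
    -o working directory, -f output file.
    """
    return [
        script,
        "-d", domain,
        "-t", filetype,
        "-l", str(search_limit),
        "-n", str(download_limit),
        "-o", workdir,
        "-f", report,
    ]


if __name__ == "__main__":
    command = build_metagoofil_command(
        "microsoft.com", "pdf", 25, 25, "msdndata", "msdnreport.html"
    )
    # Print the command for review; hand the list to subprocess.run(command)
    # only once you are authorized to scan the target.
    print(shlex.join(command))
```

Keeping the arguments as a list rather than a single string avoids shell quoting problems if a directory or report name contains spaces.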

Sit back and allow the metadata to rush in. If all goes well, you will see something similar to the following:

* Metagoofil Ver 2.2
* Christian Martorella
* Edge-Security.com
* cmartorella_at_edge-security.com
['pdf']

[-] Starting online search…

[-] Searching for pdf files, with a limit of 25
Searching 100 results…
Results: 106 files found
Starting to download 25 of them:
—————————————-

Once your search has completed not only do you have the documents you were seeking, but you also have a wealth of additional information, again the metadata, which can be very useful. To a large extent I would argue that this metadata could be damaging should black hats use it. Of course, this same metadata has valid use cases in the course of everyday business. Again it comes down to understanding the risk and balancing the needs of the business against the defined risk.

Metagoofil Report

In my example the target was Microsoft, and I obtained 65 email addresses, 10 usernames, and 31 software programs, including their version numbers. Stepping back for a moment and considering this metadata, it is not at all difficult to understand how this information may be used in a negative fashion.
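The kind of tally shown in the report can be sketched in a few lines. This is not how Metagoofil builds its report; it is an illustration that assumes each document's metadata has already been extracted into a dictionary with hypothetical keys "author", "producer", and "text", and that uses a deliberately simple regular expression for email addresses.

```python
import re
from collections import Counter

# A simple pattern for email addresses; it is not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def summarize_metadata(records):
    """Collect unique emails, usernames, and software counts from metadata records.

    Each record is a dict with optional keys "author", "producer", and "text"
    (an assumed layout for this example).
    """
    emails = set()
    usernames = set()
    software = Counter()
    for record in records:
        # Lower-case emails so the same address is not counted twice.
        emails.update(match.lower() for match in EMAIL_RE.findall(record.get("text", "")))
        if record.get("author"):
            usernames.add(record["author"])
        if record.get("producer"):
            software[record["producer"]] += 1
    return emails, usernames, software


if __name__ == "__main__":
    # Made-up sample records, not real results.
    sample = [
        {"author": "jdoe", "producer": "Acrobat Distiller 9.0", "text": "Contact JDoe@example.com"},
        {"author": "asmith", "producer": "Acrobat Distiller 9.0", "text": "jdoe@example.com, no other contact"},
        {"producer": "Microsoft Word 2010", "text": ""},
    ]
    emails, usernames, software = summarize_metadata(sample)
    print(len(emails), "email addresses,", len(usernames), "usernames,", len(software), "software programs")
```

Even this small summary shows why the data matters to an attacker: usernames hint at account naming conventions, and software versions point to potentially unpatched products.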

Conclusion

Metadata carries both dangers and benefits, and as this article demonstrates, it is up to the data owners to define how much information is available and to whom. Metagoofil is a great tool that you may be interested in adding to your penetration testing toolbox. If you want to learn more about the dangers of metadata, give the following talk by Robert Reed from the 2014 ShowMeCon event a watch.

