Automating #HASH check using Python

We all know that, at the pace technology is advancing, automation has become crucial in our daily lives. This is especially true in security, where attackers are becoming heavily automated, so it is best for organizations to adopt a more automated mindset when defending against them. Trying to defend against these attacks manually is more or less like hitting a wall: by the time you “type to deploy” something, the attack patterns have changed. In such a scenario the whole idea of automation is to save “time” and “effort”, which in turn helps any analyst strategize and come up with immediate, optimal solutions.

I will be starting a whole new series of small scripts. They might be meaningless to a few but useful to others. The idea is to promote automation, focusing on security, the field I love so much.

Let’s face it, the machines will take over eventually, but that’s a whole different topic altogether.

Any questions regarding the concepts are more than welcome in the comment section.

In this particular post I came up with the idea of checking hash values with a Python script.

Concept: This script shows how a basic search function works. In this case the script looks into every PDF file in a specific folder and reports where the specified string appears. The script is also capable of querying VirusTotal, which, if you are in the security community, you use a lot to look up hash values, files, URLs and IP addresses.

How I got the idea

Imagine you are in a SOC and you get an incident containing a list of hash values of malicious files.

Scenario 1: You have a large dump of threat intel PDF files stored somewhere and you want to look up a particular hash. Enter it in the script and voila, you get results telling you which page of which file has information related to the hash or string you entered.

Scenario 2: You don’t want to open VirusTotal in a browser every now and then to look up a hash. Run the .bat file of this script, enter the hash there, and you immediately get the results.

Future add-ons for this script

I plan on adding more functionality to this script so that it can query other sources, not just VirusTotal, and give much better results. You will be able to search more items like:

  1. Multiple hash values as input
  2. IP addresses
  3. Submit files for analysis via better GUI
  4. Look up URLs and get the output in a much better, easier to understand format.

And yes, the script is buggy at the moment, but it will be optimized in other revisions.

The complete source code is on GitHub here.


Here’s how to get this code working:

  1. Download the files from the GitHub link above.
  2. Make sure you have Python installed along with the necessary libraries mentioned in the “What else you need to know” section.
  3. Open “Hash_Check.bat” in Notepad and change the path to where you have kept the “hash_check.py” file.
  4. Open the “hash_check.py” file and change the following to where your reports are kept.

  5. Enter the location where you want the JSON output.

What else you need to know?

Getting the API key

Get your API key from VirusTotal. Once you create an account you should be able to obtain it from the settings tab.

Once you get the key you can enter it in the highlighted section below.

The PyPDF2 module

It is a Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.

PyPDF2 is not a built-in module, so you will need to install it separately. To install it, run the following at the command line:

pip install PyPDF2

More information can be found here.
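To show what Scenario 1 looks like in practice, here is an added example (my own illustration, not the code from the GitHub repository) of a folder search that reports the file name and page number of every page containing the string you are looking for. `REPORT_DIR` is a placeholder path you would replace, and `PdfReader` is the reader class name in PyPDF2 3.x (older releases called it `PdfFileReader`). The comparison is case-insensitive and tolerates whitespace inside the match, because extracted PDF text frequently upper-cases hashes or breaks them across a line, and a plain substring search would miss those. The sample pages at the bottom are constructed so the search logic can be run without any PDFs on disk.

```python
import os
import re
import sys
from typing import Dict, List, Tuple

# Placeholder folder holding the threat-intel PDFs (assumption; edit to suit).
REPORT_DIR = r"C:\ThreatIntel\Reports"


def extract_pages(pdf_path: str) -> List[str]:
    """Return the text of every page of one PDF, in page order.

    PyPDF2 is imported here rather than at module level so the pure search
    logic below can be exercised without the library installed.  Pages with
    no extractable text (scanned images) come back as empty strings.
    """
    from PyPDF2 import PdfReader  # PyPDF2 3.x name; PdfFileReader in older versions
    reader = PdfReader(pdf_path)
    return [(page.extract_text() or "") for page in reader.pages]


def load_pdf_folder(folder: str) -> Dict[str, List[str]]:
    """Map every *.pdf file name in `folder` to its list of page texts."""
    pages_by_file: Dict[str, List[str]] = {}
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".pdf"):
            pages_by_file[name] = extract_pages(os.path.join(folder, name))
    return pages_by_file


def search_pages(pages_by_file: Dict[str, List[str]], needle: str) -> List[Tuple[str, int]]:
    """Return (file name, 1-based page number) for each page containing `needle`.

    Case is ignored and any whitespace is allowed between the characters of
    the needle, so "9F86...", "9f86..." and a hash wrapped over two lines in
    the extracted text all count as hits.
    """
    needle = needle.strip()
    if not needle:
        raise ValueError("search string must not be empty")
    pattern = re.compile(r"\s*".join(re.escape(ch) for ch in needle), re.IGNORECASE)
    hits: List[Tuple[str, int]] = []
    for name, pages in pages_by_file.items():
        for number, text in enumerate(pages, start=1):
            if pattern.search(text):
                hits.append((name, number))
    return hits


# Constructed sample of what extract_pages() returns for two small reports.
# The hash is SHA-256 of the string "test", used here only as a harmless value.
SAMPLE_PAGES: Dict[str, List[str]] = {
    "apt_report_q1.pdf": [
        "Executive summary. No indicators on this page.",
        "Dropper SHA256: 9F86D081884C7D659A2FEAA0C55AD015A3BF4F1B2B0B822CD15D6C15B0F00A08",
    ],
    "phishing_wave.pdf": [
        "Attachment hash 9f86d081884c7d659a2feaa0c55ad015\n"
        "a3bf4f1b2b0b822cd15d6c15b0f00a08 seen on 3 hosts.",
        "Contact the IR team for containment steps.",
    ],
}


if __name__ == "__main__":
    needle = sys.argv[1] if len(sys.argv) > 1 else input("String or hash to look up: ")
    source = load_pdf_folder(REPORT_DIR) if os.path.isdir(REPORT_DIR) else SAMPLE_PAGES
    for name, page in search_pages(source, needle):
        print(f"{name}: page {page}")
```

Run it as `python search_reports.py <hash>`; when `REPORT_DIR` does not exist it falls back to the constructed sample so you can see the output shape. Reading every PDF on each query is slow for a large dump, so a natural next step is to extract the page texts once and cache them on disk.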

The requests module

requests isn’t a built-in module either; to install it, type the following:

pip install requests

Requests is a Python module that you can use to send all kinds of HTTP requests. It is an easy-to-use library with a lot of features, ranging from passing parameters in URLs to sending custom headers and SSL verification.

More information can be found here.

Using the API and storing results in JSON

Using requests we can pass the API key and the hash value we just received from the user as input. We assign these values through a params object, which is a plain dictionary:

params = {
    'apikey': user_api_key,
    'resource': user_hash
}

Once everything is in place you can then make the call to the API via the following:

try:
    response = requests.get('https://www.virustotal.com/vtapi/v2/file/report', params=params)
except requests.RequestException as exc:  # catch only network/HTTP errors, not every exception
    print('API connection issue:', exc)

The body of this response is JSON, so we can decode it straight into a dictionary:

jsonData = response.json()

This lets us parse the JSON data for a specific value, for example getting the value of “response_code” and storing it in a variable:

# use a distinct name so the requests response object above is not shadowed
response_code = int(jsonData.get('response_code', 0))
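Pulling the pieces above together, here is an added example (again my own, not the script from the repository) of a VirusTotal lookup that validates the hash before spending an API request, treats the HTTP status codes the v2 API uses for quota and key problems as errors, reduces the JSON report to the fields an analyst actually acts on, and writes that summary to a JSON file. `VT_API_KEY` as an environment variable is an assumption on my part; it keeps the key out of the script and out of version control. The sample report at the bottom is constructed, so the parsing part runs offline.

```python
import json
import os
import re
import sys
from typing import Any, Dict, Optional

VT_FILE_REPORT = "https://www.virustotal.com/vtapi/v2/file/report"

# Hex-digit lengths of the digests the v2 file/report endpoint accepts as "resource".
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def hash_type(value: str) -> Optional[str]:
    """Return 'md5', 'sha1' or 'sha256' for a well-formed hex digest, else None."""
    value = value.strip()
    if re.fullmatch(r"[0-9a-fA-F]+", value):
        return HASH_LENGTHS.get(len(value))
    return None


def summarize_report(json_data: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a v2 file/report body to what the analyst needs.

    response_code: 1 = report available, 0 = hash unknown to VirusTotal,
    -2 = sample is queued and has not been analysed yet.
    """
    code = int(json_data.get("response_code", 0))
    if code == 1:
        return {
            "status": "found",
            "positives": int(json_data.get("positives", 0)),
            "total": int(json_data.get("total", 0)),
            "scan_date": json_data.get("scan_date"),
            "sha256": json_data.get("sha256"),
        }
    if code == -2:
        return {"status": "queued"}
    return {"status": "not_found"}


def query_virustotal(user_hash: str, api_key: str) -> Dict[str, Any]:
    """Look up one hash and return its summary; raise on bad input, key or quota."""
    import requests  # imported here so summarize_report() is testable without it

    if hash_type(user_hash) is None:
        raise ValueError("expected an MD5, SHA-1 or SHA-256 hex digest")
    params = {"apikey": api_key, "resource": user_hash.strip()}
    response = requests.get(VT_FILE_REPORT, params=params, timeout=30)
    if response.status_code == 204:
        raise RuntimeError("public API request quota exceeded; wait a minute and retry")
    if response.status_code == 403:
        raise RuntimeError("VirusTotal rejected the API key")
    response.raise_for_status()
    return summarize_report(response.json())


def save_json(summary: Dict[str, Any], path: str) -> None:
    """Write the summary as indented JSON so it is readable and diff-friendly."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(summary, fh, indent=2)


# Constructed sample of a v2 file/report body for a known-bad sample.
# Values are made up for illustration; the hash is SHA-256 of the string "test".
SAMPLE_REPORT: Dict[str, Any] = {
    "response_code": 1,
    "verbose_msg": "Scan finished, information embedded",
    "positives": 42,
    "total": 60,
    "scan_date": "2017-05-01 10:15:00",
    "sha256": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}


if __name__ == "__main__":
    user_hash = sys.argv[1] if len(sys.argv) > 1 else input("Hash to look up: ")
    api_key = os.environ.get("VT_API_KEY")  # assumed variable; avoids a key in the source
    if not api_key:
        sys.exit("set VT_API_KEY before running")
    summary = query_virustotal(user_hash, api_key)
    save_json(summary, "vt_result.json")  # placeholder output path
    print(json.dumps(summary, indent=2))
```

Two things the example adds over the snippets above: the 204 check matters because the public v2 API answers a quota overrun with an empty 204 body, so calling `.json()` on it fails with a confusing decode error rather than a clear message; and the `positives`/`total` pair is what you read first when triaging, so keeping the summary to those fields makes the saved JSON easy to feed into whatever comes next.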

Here are some screenshots for your reference: [screenshots: directory structure, menu layout, option 1, option 2]

Hope you enjoyed reading the article.
