Welcome to pan-tort’s documentation

Overview

Palo Alto Networks ‘Test Output Research Tool’ aka pan-tort

Pan-tort is designed to help automate data capture for a list of MD5, SHA1, or SHA256 hashes for samples that may have failed a security performance test.

What pan-tort does

Instead of manually digging through various tools like Threatvault, Wildfire, or Autofocus to get contextual data about the hashes, the user can input the list of hashes and have pan-tort query Autofocus for the following items:

  • Wildfire verdict (malware, phishing, grayware, or benign)
  • Hash file type
  • Autofocus malware family, actor, and campaign tag associations
  • Wildfire/AV signature names and status
  • DNS signature names and status

The responses are output in 3 ways:

  1. Simple console stats summary when the run is complete
  2. Elasticsearch load-ready json format with index attributes
  3. Readable ‘pretty json’ output format to scan through query results

Impacts to Autofocus API rate limits

Pan-tort uses the Autofocus API which rate limits using a point system. Each lookup uses points that count against per-minute and daily totals.

The maximum number of daily points per Autofocus API key is based on the license purchased. A standard license can use up to 5,000 points per day while an unlimited license provides 100,000 points per day. All licenses are limited to 200 points per minute.

The queries are designed to be low impact, with the initial hash search for tag and file type data done as a single bulk search. Initializing a query costs 10 points and each subsequent check for results costs 1 point. Searches for signature coverage use 2 points: 1 each for the query and the response.

As an example, a typical pan-tort run with 100 hashes uses about 215 daily points. Per-minute point totals typically range from 20-30 points, depending on query-response times.
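
The daily total for a run can be estimated directly from these costs. Below is a minimal sketch of the arithmetic; the number of result-check polls per bulk search is an assumption based on typical response times.

def estimate_daily_points(num_hashes, result_checks=5, chunk_size=1000):
    """Rough estimate of Autofocus points used by one pan-tort run."""
    bulk_searches = -(-num_hashes // chunk_size)   # ceiling division: bulk searches needed
    bulk = bulk_searches * (10 + result_checks)    # 10 to initialize, ~1 point per result poll (assumed 5 polls)
    sig = num_hashes * 2                           # 1 query + 1 response point per hash
    return bulk + sig

print(estimate_daily_points(100))  # 215 points, matching the example above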

Sample output views

Summary Stats

If running from a terminal console, at the end of the run, a short summary of key stats is shown.

_images/summary_stats.png

Values in the summary stats:

  • Total samples queried: the number of hashes in the input list

  • Samples not found in Autofocus: the number of hashes with no matching sample in Autofocus

  • Verdicts: based on Wildfire analysis verdict results

  • Signature coverage for malware verdicts

    • active: There is a WF/AV sig currently loaded in the firewall
    • inactive: A signature was created but is not currently loaded in the firewall
    • no sig: no signature history for this file sample

Kibana dashboard

The Kibana dashboard provides a more interactive view of the output data.

Pan-tort includes importable json elements for Kibana. Users can then extend the visualizations and dashboards as desired using the same source data.

_images/dashboard.png

Text json output

For quick analysis or sharing, there is a pretty-format json file with all results data. This detailed data extends beyond the summary stats to include malware family and groups, file types for each sample, and signature details such as threatname, DNS domains, and create dates.

_images/pretty_json.png

Prerequisites and Installation

These instructions are for a stand-alone install from GitHub to run locally.

Prerequisites

The following requirements must be met before installing and using pan-tort.

Autofocus API Key

Ensure you have an active Autofocus subscription and API key.

Get your Autofocus API key

This key will be used below after pan-tort is installed.

Python, virtual environment, and pip

The code in pan-tort requires python 3.6 or later. The examples will use python 3.6.

The examples also show python running in a virtual environment with pip used to install required packages.

Python 3.6 virtual environment documentation

In most cases pip is already installed if using python 3.6 or later.

Checking the pip version:

$ pip --version
pip 18.0 from /Users/localuser/pan-tort/env/lib/python3.6/site-packages/pip (python 3.6)

pip information and installation instructions

Once these requirements are met you are ready to install pan-tort.

Installation

The initial steps are an overview to clone the repo and activate a python virtual environment.

$ git clone git@github.com:PaloAltoNetworks/pan-tort.git
$ cd pan-tort
$ python3.6 -m venv env
$ source env/bin/activate
(env)$ pip install -r requirements.txt

The virtual environment is named env; when active, its name is shown to the left of the command prompt. If the install succeeds, the pan-tort utility is installed and almost ready to use.

Autofocus API key

Once you have the API key, use it to create the af_api.py key file in the hash directory. Any text editor can be used to create this file.

api_key = '{your api key goes here}'

Save the file as hash/af_api.py.
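
As a quick sanity check, the key can be loaded the same way pan-tort presumably imports it, as a plain module attribute; the import style shown here is an assumption.

$ cd hash
$ python -c "from af_api import api_key; print('key loaded, %d chars' % len(api_key))"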

The hash_list.txt file

This is the list of hashes used for the pan-tort query. There is no limit to the file size. Pan-tort will segment the list automatically if more than 1,000 hashes are to be searched.

The hash file is a simple text file with one hash per line. A sample hash file to edit is in the hash directory. These are md5 hashes.
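
For illustration, a minimal hash_list.txt could look like the following. These are the well-known MD5 digests of an empty file and of the string 'hello', used here only as placeholders, not real malware samples.

d41d8cd98f00b204e9800998ecf8427e
5d41402abc4b2a76b9719d911017c592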

Editing conf.py

The conf.py file has default values for variables used in pan-tort. One value that may need to be edited is the hashtype variable. Make sure this value matches the hash type (md5, sha1, or sha256) of the samples in the hash list.

hashtype = 'md5'

Get the latest Autofocus malware tag data

The output data adds context specific to malware family tags. To get the latest tag list required for pan-tort, run the gettagdata.py utility in the hash directory.

$ python gettagdata.py

This will take less than a minute and the output will be tagdata.json in the hash directory.

Run this utility periodically to ensure pan-tort has the latest tag data.
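
One way to automate this refresh is a cron entry. The sketch below assumes a Unix host and the virtual environment created during install; adjust the repository path for your system.

# Refresh Autofocus tag data every Monday at 06:00
0 6 * * 1 cd /path/to/pan-tort/hash && /path/to/pan-tort/env/bin/python gettagdata.py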

Using pan-tort

The prior installation steps ensure pan-tort is ready to run and can communicate with Autofocus.

Running a query

To run a query with the existing hash list, run hash_data_plus.py. You will be prompted for a name to use for this query. This name is used to filter results in Kibana and is appended to the json output filenames for easy reference.

$ python hash_data_plus.py
Enter brief tag name for this data: some tag for this query

At this point pan-tort will run in 2 stages:

  • Initial Autofocus sample search to get verdict, file type, and malware data
  • Per-hash signature coverage search for each hash found in Autofocus

The initial sample search sends up to 1,000 hashes per query and captures the responses. This search should take only 1-2 minutes for smaller hash lists.

Signature coverage queries run 1 sample per lookup and take approximately 3-8 seconds each, depending on response time.

A 100-sample run takes 5-10 minutes on average.

Viewing the results

The outputs are the same as described in Sample output views above: the console summary stats shown at the end of the run, the Kibana dashboard, and the pretty-format json file.

Interpreting the Results

The purpose of pan-tort is to add context to misses seen during a security test. Similar interpretations apply to any type of data fed into pan-tort.

For this example we will use the sample data and Kibana dashboard view.

_images/dashboard.png

Sample and Verdict counts

The first level of data is samples found and file verdicts. This gives an indication of what files may or may not have signature coverage.

In the dashboard there are 100 total samples.

The verdicts in the pie chart include:

  • 89 malware
  • 7 grayware
  • 4 samples not found

The malware count of 89 is displayed under the total sample count.

At this stage the 11 non-malware misses (7 grayware + 4 not found) can be accounted for.

Non-malware samples found

For samples with a benign or grayware verdict, rather than malware or phishing, signatures are typically not available.

Therefore, any grayware or benign verdict samples are expected to be part of a test miss.

No sample found

In some cases a hash sample lookup will return no results. This means that the sample does not exist in Autofocus.

The most likely cause for this type of response is an unsupported file type. The test environment may include file types that are not captured and analyzed by Wildfire.

Malware Signature status

For samples with a malware or phishing verdict, signatures are typically created and mapped to the hash sample.

The second row of the dashboard shows a breakdown of signature status for the 89 malware samples.

  • 63 inactive signatures
  • 19 active signatures
  • 7 no signature

Inactive signatures

In real-world deployments, signatures are rotated through the system, aging out as they are replaced by signatures for newer malware threats.

This results in a set of signatures that have been created and were released in the past but are now dormant. These are the inactive signatures.

If the test environment is using outdated samples not representative of real-world threat activity, there is a high probability that these signatures have been pulled and are no longer active. This is a common occurrence.

Active signatures

A miss for active signatures may indicate an update is needed to the signature associated with the file sample.

These misses can require working with the POC or TAC teams to capture the hash and threatname values and confirm that the signature is active and valid.

No signature

Not all malware verdict samples will have an associated signature. A variety of factors, including false positive concerns, may lead to a signature not being created.

Misses here can be discussed with TAC and PM teams to determine why a signature may not exist for the sample.

Using the Kibana dashboard and Elasticsearch

Kibana is a simple visualization tool that pulls from an Elasticsearch data store.

Access Kibana

Kibana is accessed through a web interface, typically on port 5601. Accessing on a local machine:

http://localhost:5601

Importing Searches, Visualizations, and Dashboards

The pan-tort Kibana files are kept in the kibana_json folder. The 3 files can be imported into Kibana to make the data views ready to use.

Import is found under Management -> Saved Objects, using the Import button in the upper right corner.

_images/kibana_import.png

Import the 3 files in the following order to ensure no reference errors:

  1. hash_data_searches.json
  2. hash_data_visualizations.json
  3. hash_data_dashboards.json

Accessing the pan-tort dashboard

Use the Kibana menu to choose the dashboard option.

_images/kibana_menu.png

Select `pan_tort_dashboard`

The pan-tort dashboard will display.

Note

The data is time based and some test samples may be 6 years old. If the dashboard is not set to look back that far in time, update it with the time selector in the upper right hand corner. The best option is to choose `Relative` and use 6 `Years Ago` to `Now`.

Using the Search bar to filter results

When you run pan-tort, you are asked for a brief name for the query. This name is added to the data records. In the search bar (the text box above the dashboard), enter the query name.

_images/kibana_search.png

This will filter the results to include only the specific query.

To switch between queries, simply type the name in the search box and the dashboard will update.

Elasticsearch quick loads and deletes

Elasticsearch is the document storage layer used with the Kibana presentation. This is not intended as an Elasticsearch tutorial; it only gives quick commands for adding and removing bulk sets of data.

Sample data bulk load

The output of pan-tort used for Elasticsearch is in the out_estack directory. This data is json formatted, with each data line preceded by an index line to feed into Elasticsearch.
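
For illustration, each record in the file is a pair of lines in the Elasticsearch bulk (ndjson) format, similar to the sketch below. The document fields shown are assumptions, apart from query_tag, which pan-tort uses for filtering; depending on your Elasticsearch version the index line may also require a _type field.

{"index": {"_index": "hash-data"}}
{"query_tag": "pan_tort_sample", "verdict": "malware"}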

The general format to add data into Elasticsearch is:

curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary @{filename}.json -H "Content-Type: application/x-ndjson"

If in the hash directory running pan-tort, then the curl command for a filename of pan_tort_sample would be:

curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary @out_estack/pan_tort_sample.json -H "Content-Type: application/x-ndjson"

All files are loaded into the same pan-tort index and kept unique per query run by the query name.
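
If curl is not available, the same bulk load can be done from Python with the requests package. This is a minimal sketch assuming a default local Elasticsearch and the sample filename above.

import requests

# Stream the ndjson output file into the Elasticsearch bulk endpoint.
with open('out_estack/pan_tort_sample.json', 'rb') as f:
    response = requests.post(
        'http://localhost:9200/_bulk',
        data=f,
        headers={'Content-Type': 'application/x-ndjson'},
    )
response.raise_for_status()
print('bulk load errors:', response.json()['errors'])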

Elasticsearch delete by query example

In some cases data may be entered as a duplicate or by accident. Using the dev tools option in Kibana, data can be removed from the index specific to the query name.

The dev tools console is accessed from the main menu. Add the POST statement below with the updated query_tag name. Then hit the green play button to execute the POST. The results are displayed in the output window.

_images/kibana_delete.png

The example below can be cut-and-paste into the dev tools window with the query_tag updated for data to be deleted.

POST hash-data/_delete_by_query
{
    "size": 10000,
    "_source": "query_tag",
    "query": {
        "match": {
            "query_tag": "pan_tort_sample"
        }
    }
}

The delete_by_query command can be executed for each query_tag value.
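
The same delete can also be run with curl instead of the dev tools console; a sketch against a default local Elasticsearch:

curl -s -XPOST 'http://localhost:9200/hash-data/_delete_by_query' -H 'Content-Type: application/json' -d '{"query": {"match": {"query_tag": "pan_tort_sample"}}}'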

If all data is to be removed for a clean index, use the curl command below. This will delete ALL data in the hash-data index store so proceed with caution.

Delete existing data in the index:

curl -XDELETE http://localhost:9200/{{elk_index_name}}

Pan-tort uses the index name of `hash-data`.

Warning

This command deletes ALL data in the index. Use only to reset to a clean data store.


Release History

0.0

Initial release

Date: Feb 2018

  • Command line based Autofocus queries run serially per hash
  • Used panafapi.py for Autofocus queries

0.1

Date: Aug 2018

  • UI to input hashlist, hashtype, and query name values
  • Move to direct queries in python, no panafapi integration
  • Multi-query first stage and individual sig coverage lookups for faster run time
  • Enhanced data fields with malware tags/tag_groups and sig status
  • gettagdata.py to pull complete list of tags and groups from Autofocus
  • Use of query_tag attribute to isolate query runs: unique json output files and filter tag for Kibana
  • Multi-page and type=scan to support large scale input lists and query results