New features in Plaso 1.3: Hashing

Introduction

One of the features that we’ve added to Plaso in the 1.3 release is initial support for calculating file hashes. While there’s a lot more work to do to fully utilize the potential of file hashing, there are already some useful things you can do to make your forensic analysis more comprehensive or speedy.

What does it look like?

Here’s how you run log2timeline with hashing turned on:


➜  ~  log2timeline.py --status_view window --hashers all /tmp/hashing.plaso /tmp/test_directory
[INFO] Data files will be loaded from /usr/share/plaso by default.


Source path : /tmp/test_directory
Source type : directory


Processing started.


plaso - log2timeline version 1.3.0


Source path : /tmp/test_directory
Source type : directory


Identifier PID Status Events File
Collector 22026 completed
Worker_00 22024 sleeping 10 (0) OS:/tmp/test_directory/test_pe.exe
Worker_01 22025 sleeping 4 (0) OS:/tmp/test_directory/pivy.exe
StorageWriter 22023 sleeping 14 (0)


All extraction workers completed - waiting for storage.


Processing completed.

Easy! Just add --hashers all, or, if you only want some hashers (currently SHA256, SHA1 and MD5 are available), pass their names instead. I've also used the ‘window’ status view option here, which is new in version 1.3 and is especially handy for long Plaso runs.
Once the hashes have been generated and stored in the storage file, you can use some analysis plugins to enrich or distill your data.
Here’s a plugin to get a list of the unique files (with hashes) from the source:

➜  ~  psort.py --analysis file_hashes --output-format null /tmp/hashing.plaso
[INFO] Data files will be loaded from /usr/share/plaso by default.
[INFO] Starting analysis plugins.
[INFO] Plugin: [file_hashes] started.
[INFO] Output processing is done.
[INFO] Processing data from analysis plugins.
[INFO] Waiting for analysis plugin: file_hashes to complete.
[INFO] Plugin file_hashes has completed.
[INFO] All analysis plugins are now completed.
Report generated from: file_hashes
Generated on: 2015-08-20T08:53:40+00:00


Report text:
Listing file paths and hashes
OS:/tmp/test_directory/pivy.exe: md5_hash=4a7e0c6f7bf030bfc7382c7ad482b216 sha1_hash=ad11393854e6761d094213f910cd28404f03e850 sha256_hash=66696b7a51d1d7f71b17c170acef1f08e8ca7f5e73f6a2a4b37aa1b7f175c42c
OS:/tmp/test_directory/test_driver.sys: md5_hash=a714a36e71e26c7011240e22cfd9c8ae sha1_hash=f01f95c90922998c963c765dd194f4976fdaa27c sha256_hash=891141f8e30708831e6cf3d482d8491b3b3fd3971b509b1ca6005c3d25833bbf
OS:/tmp/test_directory/test_pe.exe: md5_hash=ab2e0a9184d2718995d3f41c70df7027 sha1_hash=46f83aab7d6e527b212cce2ba558901ffa96f4a4 sha256_hash=e2fef8c075ae07cf0370165accadfd8765db3797f0c523742c914f397e191d09


/usr/lib/python2.7/dist-packages/plaso/lib/storage.py:866: UserWarning: Duplicate name: u'information.dump'
 self._zipfile.writestr(stream_name, stream_data)


*********************************** Counter ************************************
           Stored Events : 28
         Events Included : 28
      Duplicate Removals : 16
           Total Reports : 1

     Report: file_hashes : 1

Note the use of the ‘null’ output plugin to suppress event output, as we’re not interested in the events themselves at this point. Also, don't mind the warning here; it just means the storage file was updated to store the report. We're working on suppressing this message.
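
If you want to feed these hashes into another tool, the report text is easy to pick apart. Here's a minimal Python sketch that turns the report lines into a dictionary. It assumes the line layout shown in the output above (a display name followed by name=value pairs), which isn't a documented format and may change. parse_file_hashes_report is a hypothetical helper for illustration, not part of Plaso.

```python
import re

# Matches pairs such as "md5_hash=4a7e...". The layout is inferred from the
# sample report output, not from a documented format (assumption).
_HASH_PAIR = re.compile(r"(\w+_hash)=([0-9a-f]+)")

SAMPLE_REPORT = """Listing file paths and hashes
OS:/tmp/test_directory/pivy.exe: md5_hash=4a7e0c6f7bf030bfc7382c7ad482b216 sha1_hash=ad11393854e6761d094213f910cd28404f03e850 sha256_hash=66696b7a51d1d7f71b17c170acef1f08e8ca7f5e73f6a2a4b37aa1b7f175c42c
OS:/tmp/test_directory/test_pe.exe: md5_hash=ab2e0a9184d2718995d3f41c70df7027 sha1_hash=46f83aab7d6e527b212cce2ba558901ffa96f4a4 sha256_hash=e2fef8c075ae07cf0370165accadfd8765db3797f0c523742c914f397e191d09
"""


def parse_file_hashes_report(report_text):
    """Return {display_name: {hash_name: hex_digest}} (hypothetical helper)."""
    results = {}
    for line in report_text.splitlines():
        first = _HASH_PAIR.search(line)
        if first is None:
            continue  # Skip headers, log lines and blank lines.
        # Everything before the first "<name>_hash=" is the display name
        # followed by ": ".
        display_name = line[:first.start()].rstrip().rstrip(":")
        results[display_name] = dict(_HASH_PAIR.findall(line))
    return results


if __name__ == "__main__":
    for name, hashes in parse_file_hashes_report(SAMPLE_REPORT).items():
        print(name, hashes["sha256_hash"])
```

In practice you would capture the psort.py output to a file and pass its contents to the function instead of the embedded sample.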

Now let’s have a look at something a little more interesting. Also new in 1.3 is an analysis plugin that checks the hashes of all Windows executable files (well, all PE files) in VirusTotal:

➜  ~  psort.py --analysis virustotal --virustotal-api-key <redacted> --output-format null /tmp/hashing.plaso
[INFO] Data files will be loaded from /usr/share/plaso by default.
[INFO] Starting analysis plugins.
[INFO] Plugin: [virustotal] started.
[INFO] Starting new HTTPS connection (1): www.virustotal.com
[INFO] Output processing is done.
[INFO] Processing data from analysis plugins.
[WARNING] virustotal may take a long time to run. It will not be automatically terminated.
[INFO] Waiting for analysis plugin: virustotal to complete.
[INFO] Plugin virustotal has completed.
[INFO] All analysis plugins are now completed.
Report generated from: virustotal
Generated on: 2015-08-20T08:54:29+00:00


Report text:
virustotal hash tagging Results
OS:/tmp/test_directory/pivy.exe: VirusTotal Detections 37
OS:/tmp/test_directory/pivy.exe: VirusTotal Detections 37
OS:/tmp/test_directory/test_driver.sys: Unknown to VirusTotal
OS:/tmp/test_directory/test_driver.sys: Unknown to VirusTotal
OS:/tmp/test_directory/test_pe.exe: Unknown to VirusTotal
OS:/tmp/test_directory/test_pe.exe: Unknown to VirusTotal


/usr/lib/python2.7/dist-packages/plaso/lib/storage.py:866: UserWarning: Duplicate name: u'information.dump'
 self._zipfile.writestr(stream_name, stream_data)


*********************************** Counter ************************************
           Stored Events : 28
         Events Included : 28
      Duplicate Removals : 16
           Total Reports : 1

      Report: virustotal : 1

Here, you can see from the report that one of the files had previously been flagged as malicious by 37 antivirus engines, and the other two are unknown to VirusTotal. VirusTotal is also a tagging plugin, though, so as well as the analysis report, you can see all the VirusTotal tags in the regular psort output:

➜  ~  psort.py /tmp/hashing.plaso
[INFO] Data files will be loaded from /usr/share/plaso by default.
datetime,timestamp_desc,source,source_long,message,parser,display_name,tag,store_number,store_index
2008-01-06T14:51:31+00:00,Creation Time,PE,PE Compilation time,PE Type: Executable (EXE) Import hash: f9ade0aa18f660a34a4fa23392e21838,pe,OS:/tmp/test_directory/pivy.exe,VirusTotal Detections 37,2,0
2015-04-21T14:53:54+00:00,Content Modification Time,PE,PE Delay Import Time,DLL name: USER32.dll PE Type: Executable (EXE) Import hash: 8d0739063fc8f9955cc6696b462544ab,pe,OS:/tmp/test_directory/test_pe.exe,Unknown to VirusTotal,2,2
2015-04-21T14:53:54+00:00,Creation Time,PE,PE Compilation time,PE Type: Driver (SYS) Import hash: d9c9c4541168665f44917e3ddc4a00d5,pe,OS:/tmp/test_directory/test_driver.sys,Unknown to VirusTotal,2,1
2015-04-21T14:53:55+00:00,Content Modification Time,PE,PE Import Time,DLL name: KERNEL32.dll PE Type: Executable (EXE) Import hash: 8d0739063fc8f9955cc6696b462544ab,pe,OS:/tmp/test_directory/test_pe.exe,Unknown to VirusTotal,2,3
2015-04-21T14:53:56+00:00,Creation Time,PE,PE Compilation time,PE Type: Executable (EXE) Import hash: 8d0739063fc8f9955cc6696b462544ab,pe,OS:/tmp/test_directory/test_pe.exe,Unknown to VirusTotal,2,4
2015-08-20T08:43:30+00:00,atime;ctime;mtime,FILE,OS atime;ctime;mtime,OS:/tmp/test_directory/test_pe.exe,filestat,OS:/tmp/test_directory/test_pe.exe,Unknown to VirusTotal,2,6
2015-08-20T08:43:37+00:00,atime;ctime;mtime,FILE,OS atime;ctime;mtime,OS:/tmp/test_directory/test_driver.sys,filestat,OS:/tmp/test_directory/test_driver.sys,Unknown to VirusTotal,2,8
2015-08-20T08:48:50+00:00,atime;mtime,FILE,OS atime;mtime,OS:/tmp/test_directory/pivy.exe,filestat,OS:/tmp/test_directory/pivy.exe,VirusTotal Detections 37,2,9
2015-08-20T08:49:20+00:00,ctime,FILE,OS ctime,OS:/tmp/test_directory/pivy.exe,filestat,OS:/tmp/test_directory/pivy.exe,VirusTotal Detections 37,2,10
2015-08-20T08:52:05+00:00,atime,FILE,OS atime,OS:/tmp/test_directory/test_driver.sys,filestat,OS:/tmp/test_directory/test_driver.sys,Unknown to VirusTotal,2,12
2015-08-20T08:52:05+00:00,atime,FILE,OS atime,OS:/tmp/test_directory/test_pe.exe,filestat,OS:/tmp/test_directory/test_pe.exe,Unknown to VirusTotal,2,13
2015-08-20T08:52:05+00:00,atime,FILE,OS atime,OS:/tmp/test_directory/pivy.exe,filestat,OS:/tmp/test_directory/pivy.exe,VirusTotal Detections 37,2,11
[INFO] Output processing is done.


*********************************** Counter ************************************
           Stored Events : 28
         Events Included : 28

      Duplicate Removals : 16


This is possible due to a couple of other new features in 1.3: pysigscan, which does faster file identification, and pefile, which extracts timestamps from Portable Executable format files.
Note that the VirusTotal plugin can be a little slow, unless you’ve got an API key which has a higher rate limit than the default of 4 hashes per minute.
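
To get a feel for what that rate limit means, here's a small Python sketch of a client-side throttle plus a rough runtime estimate. The figure of 4 lookups per minute is the default mentioned above. throttled and estimated_minutes are hypothetical helpers for illustration and are not how the VirusTotal plugin is implemented internally.

```python
import time


def throttled(items, per_minute=4, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than per_minute items per minute.

    sleep and clock are injectable so the pacing can be tested without
    actually waiting (assumption: a simple fixed-interval limiter).
    """
    interval = 60.0 / per_minute
    next_allowed = clock()
    for item in items:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        # Schedule the next slot one interval after this one.
        next_allowed = max(clock(), next_allowed) + interval
        yield item


def estimated_minutes(hash_count, per_minute=4):
    """Rough lower bound on lookup time for hash_count hashes."""
    return hash_count / per_minute


if __name__ == "__main__":
    # 1,000 unique PE files at the default limit take over four hours.
    print(estimated_minutes(1000))
```

The estimate is the useful part: at the default limit, lookups scale linearly with the number of unique hashes, so a large source can take hours.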
So that’s what the hashing subsystem looks like in action. Let’s take a look at what’s behind this, and how you can make it work for you.


What is the hashing support good for?

Mostly bad things. For the moment, Plaso’s hashing support is best for pointing out files and events that are “bad” (or notable in some other way). Down the line, we plan to use this subsystem to help filter out “good” (or irrelevant) events and files, but this is a little more complex.
If you activate hashing using the --hashers command line flag to log2timeline.py, Plaso will store file hashes for each file it processes. Once this is done, you can use an analysis plugin to look up the hashes and annotate events derived from files with tags, based on the file hash.
Version 1.3 comes with a couple of proof-of-concept plugins that do this, looking up Windows executables in VirusTotal (again, this will take a while with the default API key rate limit) and in Viper, the binary analysis and management tool.
This feature will be most useful to examiners that have their own databases or systems containing files they want to alert on. Writing hash analysis plugins is pretty straightforward, and if there’s some system you want Plaso to talk to, have a go at writing an analysis plugin! The code for the Viper plugin is the best reference. If you have a system you want to talk to, but can’t work out what’s required, feel free to reach out on the development mailing list.
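
To make the idea concrete, here's a minimal Python sketch of the core of a hash lookup: match stored file hashes against your own list and produce a tag per file. This is deliberately standalone and is not Plaso's analysis plugin interface (the Viper plugin code remains the reference for that). tag_by_hash, the known_bad mapping and the tag text are all hypothetical.

```python
def tag_by_hash(file_hashes, known_bad, hash_name="sha256_hash"):
    """Return {display_name: tag} for files whose digest is in known_bad.

    file_hashes: {display_name: {hash_name: hex_digest}}
    known_bad:   {hex_digest: tag}, e.g. loaded from your own database
                 (hypothetical structure, for illustration only).
    """
    tags = {}
    for display_name, hashes in file_hashes.items():
        digest = hashes.get(hash_name)
        if digest is None:
            continue  # This file was not hashed with the requested algorithm.
        tag = known_bad.get(digest.lower())
        if tag is not None:
            tags[display_name] = tag
    return tags


if __name__ == "__main__":
    hashes = {
        "OS:/tmp/test_directory/pivy.exe": {
            "sha256_hash": "66696b7a51d1d7f71b17c170acef1f08e8ca7f5e73f6a2a4b37aa1b7f175c42c",
        },
    }
    watchlist = {
        "66696b7a51d1d7f71b17c170acef1f08e8ca7f5e73f6a2a4b37aa1b7f175c42c": "on_watchlist",
    }
    print(tag_by_hash(hashes, watchlist))
```

A real plugin would replace the dictionary lookup with a query to whatever system you want Plaso to talk to.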

Performance impact

One of our initial concerns with adding hashing to Plaso was the performance impact. Normally, Plaso’s parsers test a fairly small amount of each file to determine if it can be parsed, but with hashing turned on Plaso needs to read all of every file to calculate the digest.
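
To illustrate why hashing means reading the whole file, here's a minimal sketch using Python's standard hashlib module. It reads the file once in chunks and feeds every chunk to each digest, so several hashes cost only one pass over the data. This is an illustration of the general technique, not Plaso's actual hasher code. hash_file and the 1 MiB chunk size are assumptions.

```python
import hashlib


def hash_file(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1024 * 1024):
    """Return {algorithm: hex_digest} for the file at path (hypothetical helper).

    The file is read once, chunk by chunk, so memory use stays bounded even
    for very large files.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as file_object:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: file_object.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__":
    import sys

    for name, digest in hash_file(sys.argv[1]).items():
        print(name, digest)
```

The extra cost compared to a normal run is mostly the additional I/O, which is why the storage medium matters so much.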
Thankfully, testing thus far shows a minimal performance impact: roughly 1 extra minute for every 30 minutes of log2timeline runtime (about 3%) in our tests simulating real-world use. Obviously, this will vary from case to case, based on how the source data is stored (SSD? Network storage?) and the number and size of files in the source. We’re very interested in feedback on the performance of this feature, so if you find it slow and annoying, or quick and delightful, please let us know.
Given these numbers, we’re planning on enabling SHA256 digest calculation by default in version 1.4, the next Plaso release, depending on the performance we observe over the next few months.

Future plans

There are tons more things we’d like to add to the hashing subsystem:
  • Top priority is tagging more "good" things, to enable filtering out less relevant events
  • More digest algorithms, in particular sdhash and ssdeep
  • More analysis plugins, for example for nsrlsvr and known-good databases
  • Extraction short-circuiting - if we’ve already extracted events from a file with this hash, skip it
  • Incorporating hash information into other analysis plugins (“this file is set up to autorun, and it’s not in the NSRL or VirusTotal - maybe look at this first”)
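
As a rough illustration of the short-circuiting idea in the list above, here's a Python sketch that skips any file whose digest has already been seen. This is only one way the idea could work and says nothing about how Plaso will eventually implement it. unique_by_digest is a hypothetical helper, and it assumes the skipped work is limited to content-derived events, since file system timestamps still differ per path.

```python
def unique_by_digest(paths_with_digests):
    """Yield each path whose digest has not been seen before.

    paths_with_digests: iterable of (path, hex_digest) pairs
    (hypothetical input shape, for illustration only).
    """
    seen = set()
    for path, digest in paths_with_digests:
        if digest in seen:
            continue  # Same content already processed under another path.
        seen.add(digest)
        yield path


if __name__ == "__main__":
    pairs = [
        ("OS:/tmp/a.exe", "aaaa"),
        ("OS:/tmp/copy_of_a.exe", "aaaa"),
        ("OS:/tmp/b.sys", "bbbb"),
    ]
    print(list(unique_by_digest(pairs)))
```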

We’ll keep iterating on this, but check this feature out, and let us know if it’s useful to you, and what additions you’d most like to see.

Comments

  1. I am trying to get the virustotal to work but am getting the following error. I am using the Windows log2timeline version 1.3.0. Everything else seems to work fine:

    C:\plaso>psort.exe --analysis virustotal --virustotal-api-key --output-format null :\temp\dianes_hashed.plaso

    [INFO] Data files will be loaded from None by default.
    [WARNING] Unable to automatically determine data location.
    [INFO] Starting analysis plugins.
    Traceback (most recent call last):
    File "", line 646, in
    File "", line 641, in Main
    File "c:\Projects\plaso\build\psort\out00-PYZ.pyz\pickle", line 748, in save_global
    pickle.PicklingError: Can't pickle : it's not found as thread.lock

    any ideas?
