Sigma in Timesketch - let's rule the sketch

 

0. Background

This article walks you through the process of getting from a bare Timesketch installation to an environment where you can develop and use Sigma rules for Timesketch. It is the follow-up to the article that covered the installation of a Timesketch development environment.


The target audience for this blog post is engineers who are familiar with the basic concepts of Timesketch and have a running Timesketch instance with running Celery workers. Some basic understanding of Sigma is helpful but not mandatory.


Hypothesis

This article explains what Sigma rules look like, how a Sigma config is structured, how to write a new Sigma rule and how to modify the Sigma config to get the expected result. As an example, we will write a Sigma rule to catch recon activity by detecting an installation of zenmap. This is outlined in MITRE ATT&CK® as Discovery; see "Discovery - An Information Security Reference That Doesn't Suck".


What is Sigma

Sigma, the "Yara for SIEMs" (Security Information and Event Management), is an “open signature format”: a pattern-matching rule definition language. The idea behind it is to have a generic specification for describing a situation a system wants to detect. Either the system itself is capable of translating the rule into its own language, or Sigma can be used to translate it into a language the system supports.


What is a Sigma rule

A Sigma rule is written in YAML and defines what to look for and where to look in system logs. Every Sigma rule also specifies metadata such as the author, a unique rule identifier (UUID) and references, e.g. a URL to additional information like a blog post. Some examples of Sigma rules can be found in the Sigma project's GitHub repository.


Each rule has the following sections:


  • title: Title of the rule

  • id: UUID that identifies the rule

  • description: free form text to further explain the context of the rule

  • references: URLs of blog posts or other reference documents

  • author: it is good practice to add a name and/or email address here, to be able to trace who initially wrote the rule

  • date: creation date

  • modified: date the rule was modified

  • level: severity level, one of “low”, “medium”, “high” or “critical”

  • logsource: scopes the search. Several combinations are possible; they are determined by the mapping in the Sigma config YAML. Usually a combination of product and service.

  • detection: one or more selectors, timeframe and conditions (best is to look for already existing rules to get an idea of what is possible)

  • falsepositives: Description field to explain which events or situations might trigger that rule
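
As a rough illustration of the sections above, the following Python sketch checks a rule that has already been loaded into a dictionary (for example by a YAML parser) for missing sections. The names `REQUIRED_SECTIONS`, `missing_sections` and `has_valid_level` are made up for this example, and which sections are treated as required is an assumption, not something taken from Sigma or Timesketch.

```python
# Sections treated as required in this sketch. The selection is an
# assumption for illustration, not a statement about the Sigma specification.
REQUIRED_SECTIONS = ("title", "id", "logsource", "detection", "level")

# Severity levels as listed in this article.
VALID_LEVELS = ("low", "medium", "high", "critical")


def missing_sections(rule):
    """Return the required sections that are absent from the rule dict."""
    return [name for name in REQUIRED_SECTIONS if name not in rule]


def has_valid_level(rule):
    """Check that the rule's level is one of the known severity levels."""
    return rule.get("level") in VALID_LEVELS


# A rule as it might look after being parsed from YAML.
rule = {
    "title": "Suspicious Installation of Zenmap",
    "id": "5266a592-b793-11ea-b3de-0242ac130004",
    "logsource": {"product": "linux", "service": "shell"},
    "detection": {
        "keywords": ["*apt-get install zmap*"],
        "condition": "keywords",
    },
    "level": "high",
}

print(missing_sections(rule))
print(has_valid_level(rule))
```

A check like this only catches structural omissions; it says nothing about whether the detection logic itself is sensible.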


What is sigma_config.yml

This is the central configuration file for the Sigma engine. Different Sigma configurations can be used for different backends. The configuration can be used to create different mappings of Sigma key names to those provided by the backend.


The Sigma project provides a set of configuration files on GitHub. The Sigma configuration for Timesketch is also available on GitHub.


Within the configuration file, there are the following sections:


  • title: Title of the configuration file

  • backends: set of backends that this configuration is supposed to serve, e.g. es-dsl for Elasticsearch, or splunk

  • logsources: each logsource has a name and the following subsections:

    • service: in most cases, again the name of the source

    • conditions: a set of conditions that will be translated


For example, consider the following combination of rule and configuration:


Rule:


...
logsource:
   product: linux
   service: shell
detection:
   keywords:
       - '*whoami*'
   condition: keywords
...


Configuration file:


backends:
  - es-dsl
...
  shell:
    service: shell
    conditions:
      data_type:
        - "bash:history:command"
        - "apt:history:line"
...


What would the resulting query look like? The backend (here es-dsl) tells Sigma in which output language the query is expected. The rule specifies service: shell, which means the mappings from the shell section of the config file will be used.

That section defines one condition, data_type, with two possible values, which appear in the query as:


(data_type:("apt\:history\:line" OR "bash\:history\:command")


This is then combined with the detection part of the rule looking for a keyword, which translates into:


AND "*whoami*"


Combined, the two parts form the final query:


(data_type:("apt\:history\:line" OR "bash\:history\:command") AND "*whoami*")


The idea is that if the locations for a specific service, e.g. shell, change because a new log file location or data_type needs to be queried, only an addition to the configuration file is needed. Every rule that refers to that service will automatically pick up the change and look in the correct places.
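
The way the two parts are combined can be sketched in a few lines of Python. This is not how Sigma implements the translation; `escape` and `build_query` are hypothetical helpers written for this article, and the escaping only covers the colon, which is the character escaped in the query shown above.

```python
def escape(value):
    """Backslash-escape colons, as seen in the query above."""
    return value.replace(":", "\\:")


def build_query(data_types, keywords):
    """Combine the config's data_type scope with the rule's keywords."""
    # Scope: any of the data_type values mapped for the service.
    scope = " OR ".join('"{}"'.format(escape(d)) for d in data_types)
    # Detection: any of the keywords listed in the rule.
    terms = " OR ".join('"{}"'.format(escape(k)) for k in keywords)
    if len(keywords) > 1:
        # Keep several keywords grouped so the AND applies to all of them.
        terms = "({})".format(terms)
    return "(data_type:({}) AND {})".format(scope, terms)


query = build_query(
    ["apt:history:line", "bash:history:command"],  # from the config file
    ["*whoami*"],                                  # from the rule
)
print(query)
```

Adding a third data_type to the first list changes every generated query without touching any rule, which is the point of keeping the mapping in the configuration file.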

How to know what to look for?

It is very hard to write rules or improve the mapping coverage without sample data. One way to get it is to parse a storage media image with Plaso, import the resulting file and look at what the events look like.


The other option to get insight into what the data would look like is to look at the tools that generate it, e.g. by reading the source code of FOSS tools.

Use existing event data

To get a test event it is essential to use a source that you know has the event you are looking for. In this post we’ll use https://github.com/sans-blue-team/DeepBlueCLI/tree/master/evtx


Let’s use Plaso to extract the information from the evtx file and import into Timesketch.

sudo docker run -v /home/jaegeral/Download/:/data log2timeline/plaso log2timeline /data/plaso/many-events-system.evtx.plaso /data/DeepBlueevtx/many-events-system.evtx


Mock the data

If there is no live data yet that the rule should trigger on, test data is needed. For this example, mocked data will be used. Mocked events can be either CSV or JSONL; the structure of those files is explained in the GitHub document CreateTimelineFromJSONorCSV.


Another method to import (mocked) data is the API client and importer client. To be able to mock data, the format, the fields and the value structure must be known. This can be researched either by parsing and importing real data to reverse-engineer the structure, or by looking directly at the tools that generate the data, e.g. by reading the source code of FOSS tools.

1. Crafting the test data

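An installation event of zenmap on a Linux machine, expressed as JSONL, could look like the following. The first three events are generic filler; the last one is the installation event: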


{"message": "A message", "timestamp": 123456789, "datetime": "2015-07-24T19:01:01+00:00", "timestamp_desc": "Write time", "extra_field_1": "foo"}
{"message": "Another message", "timestamp": 123456790, "datetime": "2015-07-24T19:01:02+00:00", "timestamp_desc": "Write time", "extra_field_1": "bar"}
{"message": "Yet more messages", "timestamp": 123456791, "datetime": "2015-07-24T19:01:03+00:00", "timestamp_desc": "Write time", "extra_field_1": "baz"}
{"message": "Install: zmap:amd64 (1.1.0-1) [Commandline: apt-get install zmap]", "timestamp": 123456791, "datetime": "2015-07-24T19:01:03+00:00", "timestamp_desc": "foo", "command": "Commandline: apt-get install zmap", "data_type": "apt:history:line", "display_name": "GZIP:/var/log/apt/history.log.1.gz", "filename": "/var/log/apt/history.log.1.gz", "packages": "Install: zmap:amd64 (1.1.0-1)", "parser": "apt_history"}

In this post we will not use the timestamp. If you plan to write more complex Sigma rules with time correlation, the timestamps of your test events should be mocked in a similar way. 
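
If you prefer to generate such a file instead of writing it by hand, a minimal Python sketch using only the standard library could look like this. The helper name `to_jsonl` is an assumption for this example; the field values are the ones from the mocked installation event above.

```python
import json

# The mocked apt installation event from above, as a Python dictionary.
events = [
    {
        "message": "Install: zmap:amd64 (1.1.0-1) "
                   "[Commandline: apt-get install zmap]",
        "timestamp": 123456791,
        "datetime": "2015-07-24T19:01:03+00:00",
        "timestamp_desc": "foo",
        "command": "Commandline: apt-get install zmap",
        "data_type": "apt:history:line",
        "display_name": "GZIP:/var/log/apt/history.log.1.gz",
        "filename": "/var/log/apt/history.log.1.gz",
        "packages": "Install: zmap:amd64 (1.1.0-1)",
        "parser": "apt_history",
    },
]


def to_jsonl(events):
    """Serialize events as JSONL: exactly one JSON object per line."""
    return "".join(json.dumps(event) + "\n" for event in events)


print(to_jsonl(events), end="")
```

Writing the returned string to a file (for example one named zmap_install.jsonl, a name chosen only for illustration) gives you something you can upload as a timeline.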

Import this file into your Timesketch instance.

2. Create a new Sigma rule

Create the rule file

There is a general guide on how to write Sigma rules by Florian Roth, one of the original developers of Sigma: "How to Write Sigma Rules". There is also a wiki article in the Sigma project and an article from SANS ISC about Sigma rules.


For the purpose of this exercise, we will create a new file called lnx_susp_zenmap.yml in data/sigma/rules/linux.


The rule needs an identifier (UUID), preferably a random UUID (version 4), which can be generated with an online UUID generator or, on several operating systems, with:


uuidgen -r
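
On systems without uuidgen, the Python standard library offers an equivalent. The sketch below uses `uuid.uuid4()`, which generates a random (version 4) UUID; the variable name `rule_id` is just for illustration.

```python
import uuid

# Generate a random (version 4) UUID to use as the rule id.
rule_id = uuid.uuid4()

print(rule_id)          # the value to paste into the rule's "id" field
print(rule_id.version)  # version number of the generated UUID
```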


The corresponding Sigma rule file has the following content:


title: Suspicious Installation of Zenmap
id: 5266a592-b793-11ea-b3de-0242ac130004
description: Detects suspicious installation of Zenmap
references:
    - https://rmusser.net/docs/ATT&CK-Stuff/ATT&CK/Discovery.html
author: Alexander Jaeger
date: 2020/06/26
modified: 2020/06/26
logsource:
    product: linux
    service: shell
detection:
    keywords:
        # Generic suspicious commands
        - '*apt-get install zmap*'
    condition: keywords
falsepositives:
    - Unknown
level: high


Please note that all rules nested in data/sigma/rules are used for the Sigma analyzer.


Verify the rule file

If you installed Sigma, you can also verify the written rule with sigmac:


python3 sigmac -t es-qs --config ../../sigma_config.yaml ../data/sigma/rules/linux/lnx_susp_zenmap.yml


Run the Sigma analyzer

If you now run the analyzer from the Web UI and watch it run, you will see the following in the Celery output of e.g. your Docker instance:


[2020-06-26 10:16:35,473: INFO/ForkPoolWorker-15] [sigma] Reading rules from /usr/local/src/timesketch/data/linux/lnx_susp_zenmap.yml

[2020-06-26 10:16:35,478: INFO/ForkPoolWorker-15] [sigma] Generated query (data_type:("shell\:zsh\:history" OR "bash\:history\:command") AND "*apt\-get\ install\ zmap*")


The important thing to note here is the generated Timesketch query:


(data_type:("shell\:zsh\:history" OR "bash\:history\:command") AND "*apt\-get\ install\ zmap*")


As you see, it looks for two specific data types in combination with the term we put in the rule.


If you do not see a Tag assigned to the event, try running the query directly in your Timesketch UI.

Adjust field mapping

In this case, the "data_type":"apt:history:line" was not in the sigma_config.yml so the query would not match the event. To add the data_type, modify the sigma_config.yml:


shell:
    service: shell
    conditions:
      data_type:
        - "shell:zsh:history"
        - "bash:history:command"
        - "apt:history:line"
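
To see why the additional data_type matters, the following sketch mimics the scoping logic in plain Python. It is a simplified model only: `would_match` is a hypothetical helper, and matching the keyword against the message field with `fnmatchcase` is an assumption that stands in for the search the backend actually performs.

```python
from fnmatch import fnmatchcase

# Relevant fields of the mocked installation event.
event = {
    "message": "Install: zmap:amd64 (1.1.0-1) "
               "[Commandline: apt-get install zmap]",
    "data_type": "apt:history:line",
}


def would_match(event, data_types, keywords):
    """Model the query: data_type must be in scope AND a keyword must hit."""
    if event.get("data_type") not in data_types:
        return False
    message = event.get("message", "")
    return any(fnmatchcase(message, keyword) for keyword in keywords)


keywords = ["*apt-get install zmap*"]
before = ["shell:zsh:history", "bash:history:command"]
after = before + ["apt:history:line"]

print(would_match(event, before, keywords))  # False: data_type out of scope
print(would_match(event, after, keywords))   # True: data_type now mapped
```

The keyword is identical in both calls; only the scope changes, which is why the rule itself did not need to be edited.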


Running the Sigma analyzer (second iteration)

If you run the analyzer from the UI again and watch it run, you will see the following:


...
[2020-06-26 10:19:29,909: INFO/ForkPoolWorker-15] [sigma] Reading rules from /usr/local/src/timesketch/data/linux/lnx_susp_zenmap.yml
[2020-06-26 10:19:29,913: INFO/ForkPoolWorker-15] [sigma] Generated query (data_type:("shell\:zsh\:history" OR "bash\:history\:command" OR "apt\:history\:line") AND "*apt\-get\ install\ zmap*")
[2020-06-26 10:19:30,137: INFO/ForkPoolWorker-15] [sigma_linux] result: Applied 1 tags
* recon_commands: 0
* lnx_susp_zenmap: 1
* reverse_shell: 0


If you refresh the UI, you will see the Tag attached to the event as expected.


Extending the rule

Now if we want to extend the rule to cover other tools, we could do that by adding more rows like:


title: Suspicious Installation of Zenmap
id: 5266a592-b793-11ea-b3de-0242ac130004
description: Detects suspicious installation of Zenmap
references:
    - https://rmusser.net/docs/ATT&CK-Stuff/ATT&CK/Discovery.html
author: Alexander Jaeger
date: 2020/06/26
modified: 2020/06/26
logsource:
    product: linux
    service: shell
detection:
    keywords:
        # Generic suspicious commands
        - '*apt-get install zmap*'
        - '*apt-get install nmap*'
    condition: keywords
falsepositives:
    - Unknown
level: high


In that case, the title of the rule should of course be changed to something more general.


3. Re-playing above

The rule and the test event are both available in the Timesketch GitHub repository. There is also a dedicated sketch on the Timesketch demo server that hosts the event data. Use the username "demo" and the password "demo" to access the sketch called "Sigma analyser demo".


Before placing new rules into production systems, it is advisable to test them against test data or historical data to reduce noise and improve coverage of edge cases.

4. Questions

This post covered the basics of developing and using Sigma rules for Timesketch. For more information, or as a refresher, also see "Sigma Analyser in Timesketch". It explains more about Sigma in Timesketch, e.g. where to put the rules and how to verify, before putting rules into production, that they do not cause exceptions.


For any questions about Timesketch and/or the Sigma usage of Timesketch, visit the Timesketch channel in the Open Source DFIR Slack community or raise an issue at https://github.com/google/timesketch/issues


For further reading on Sigma, please visit the Github repository of the project: https://github.com/Neo23x0/sigma.
