Sigma in Timesketch - let's rule the sketch
This article walks you through the process of getting from a bare Timesketch installation to an environment where you can develop and use Sigma rules for Timesketch. It is the follow-up to the article that covered the installation of a Timesketch development environment.
The target audience for this blog post is engineers who are familiar with the basic concepts of Timesketch and have a running Timesketch instance with running Celery workers. Some basic understanding of Sigma is helpful but not mandatory.
This article explains what Sigma rules look like, how a Sigma config is structured, and how to write a new Sigma rule and modify the Sigma config to get the expected result. As an example, we will write a Sigma rule to catch recon activity by detecting an installation of zenmap. This activity is outlined in MITRE ATT&CK® as Discovery (see "Discovery - An Information Security Reference That Doesn't Suck").
What is Sigma
Sigma, the "Yara for SIEMs" (security information and event management systems), is an "open signature format", i.e. a pattern matching rule definition language. The idea behind it is to have a generic specification for describing a situation a system wants to detect. Either the system itself is capable of translating the rule into its own language, or Sigma can be used to translate it into a language the system supports.
What is a Sigma rule
A Sigma rule is written in YAML and defines what to look for in system logs and where to look for it. Every Sigma rule also specifies metadata such as the author, a unique rule identifier (UUID) and references, e.g. a URL to additional information like a blog post. Some examples of Sigma rules can be found in the Sigma project's GitHub repository.
Each rule has the following sections:
title: Title of the rule
id: UUID that identifies the rule
description: free form text to further explain the context of the rule
references: URLs of blog posts or other reference documents
author: it is good practice to add a name and/or email address here to be able to trace back who initially wrote the rule
date: creation date
modified: date the rule was modified
level: severity level, one of “low”, “medium”, “high” or “critical”
logsource: used to scope the searches. Several combinations are possible and are determined by the mapping in the Sigma config YAML file. Usually a combination of product and service.
detection: one or more selectors, a timeframe and conditions (it is best to look at existing rules to get an idea of what is possible)
falsepositives: Description field to explain which events or situations might trigger that rule
What is sigma_config.yml
This is the central configuration file for the Sigma engine. Different Sigma configurations can be used for different backends. The configuration can be used to create different mappings of Sigma key names to those provided by the backend.
Within the configuration file, there are the following sections:
title: Title of the configuration file
backends: set of backends that this configuration is supposed to serve, e.g. es-dsl for Elasticsearch, or splunk.
logsources: each logsource has a name and the following subsections.
service: in most cases, again the name of the source
conditions: a set of conditions that will be translated
As an example, consider how a rule and a configuration combine into a query.
The selected backend tells Sigma in which output language the query is expected. The rule declares service: shell as its logsource, which means that the mappings from the service: shell section of the config file are used.
In that section, a condition on data_type is defined with two options. In the query, this becomes a data_type clause in which the two values are joined with OR.
The detection part of the rule looks for a keyword, which translates into a quoted search term.
Both parts are combined with AND to form the final query, as can be seen in the query generated in the "Run the Sigma analyzer" section further down.
The idea is that if the locations for a specific service, e.g. shell, change because a new log file location or data_type needs to be queried, only an addition to the configuration file is needed. Every rule that refers to that service will automatically pick up the change and look in the correct places.
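The translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code Sigma or Timesketch actually runs: the dictionaries mirror the YAML layout of the config and the rule, the config title is a placeholder, and escape_term() is a simplified stand-in for the escaping done by the real backend.

```python
# Sketch: combine a logsource mapping and a rule keyword into a query string.
# SIGMA_CONFIG and RULE mirror the YAML structure; they are not loaded from
# real files, and escape_term() only approximates the backend's escaping.

SIGMA_CONFIG = {
    "title": "Example Sigma config",  # placeholder title
    "backends": ["es-dsl"],
    "logsources": {
        "shell": {
            "service": "shell",
            "conditions": {
                "data_type": ["shell:zsh:history", "bash:history:command"],
            },
        },
    },
}

RULE = {
    "logsource": {"service": "shell"},
    "detection": {
        "keywords": ["*apt-get install zmap*"],
        "condition": "keywords",
    },
}


def escape_term(term: str) -> str:
    """Backslash-escape ':', '-' and spaces; leave '*' wildcards intact."""
    return "".join("\\" + ch if ch in ":- " else ch for ch in term)


def quoted_or(values: list) -> str:
    """Quote and escape every value, then join the values with OR."""
    return " OR ".join('"{}"'.format(escape_term(value)) for value in values)


def build_query(rule: dict, config: dict) -> str:
    """Join the logsource conditions and the rule keywords with AND."""
    service = rule["logsource"]["service"]
    # Pick the logsource whose service matches the one named in the rule.
    logsource = next(
        entry
        for entry in config["logsources"].values()
        if entry["service"] == service
    )
    data_types = quoted_or(logsource["conditions"]["data_type"])
    keywords = quoted_or(rule["detection"]["keywords"])
    if len(rule["detection"]["keywords"]) > 1:
        keywords = "({})".format(keywords)
    return "(data_type:({}) AND {})".format(data_types, keywords)


print(build_query(RULE, SIGMA_CONFIG))
```

Adding a third entry to the data_type list changes the query for every rule that refers to service: shell, without touching any rule file.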
How to know what to look for?
It is very hard to write rules or improve the mapping coverage without sample data. One way to get such data is to parse a storage media image with Plaso, import the resulting file and look at what the events look like.
The other option to get insight into what the data would look like is to look directly at the tools that generate it.
Use existing event data
To get a test event it is essential to use a source that you know has the event you are looking for. In this post we’ll use https://github.com/sans-blue-team/DeepBlueCLI/tree/master/evtx.
Let’s use Plaso to extract the information from the evtx file and import it into Timesketch.
Mock the data
If there is no live data yet that the rule should trigger on, test data is needed. For this example, mocked data will be used. Mocked events can be either CSV or JSONL; the structure of those files is explained in the CreateTimelineFromJSONorCSV document on GitHub.
Another method to import (mocked) data is the API client and importer client. To be able to mock data, the format, the fields and the value structure must be known. This can be researched either by parsing and importing real data to reverse-engineer the structure, or by looking directly at the tools that generate the data, e.g. by reading the source code of FOSS tools.
1. Crafting the test data
An installation event of zenmap on a Linux machine as a JSONL could look like:
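A minimal sketch of producing such a line with Python is shown here. The field names message, timestamp, datetime and timestamp_desc follow the import format described in CreateTimelineFromJSONorCSV, and the data_type apt:history:line is the one discussed later in this article. The message text, the date and the timestamp_desc value are made up for illustration, and treating timestamp as microseconds since the epoch is an assumption.

```python
import json
from datetime import datetime, timezone

# Hypothetical point in time for the mocked installation event.
when = datetime(2020, 6, 26, 12, 0, 0, tzinfo=timezone.utc)

# Mocked event. The message is invented; it only needs to contain the term
# the rule will search for.
event = {
    "message": "Commandline: apt-get install zmap",
    "timestamp": int(when.timestamp()) * 1_000_000,  # assumed: microseconds
    "datetime": when.isoformat(),
    "timestamp_desc": "Event logged",  # made-up description
    "data_type": "apt:history:line",
}


def to_jsonl(events: list) -> str:
    """Serialize the events as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(entry) for entry in events) + "\n"


jsonl = to_jsonl([event])
print(jsonl, end="")
```

Writing the string to a file with a .jsonl extension gives a file that can be imported like any other JSONL timeline.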
In this post we will not use the timestamp. If you plan to write more complex Sigma rules with time correlation, the timestamps of your test events should be mocked in a similar way.
Import this file into your Timesketch instance.
2. Create a new Sigma rule
Create the rule file
There is a general guide on how to write Sigma rules by Florian Roth, who is one of the original developers of Sigma: How to Write Sigma Rules. There is also a wiki article in the Sigma project and an article from SANS ISC about Sigma rules.
For the purpose of this exercise, we will create a new file in data/sigma/rules/linux called lnx_susp_zenmap.yml.
The rule needs an identifier (UUID), preferably a random UUID (version 4), which can be generated using an online UUID generator or on several operating systems using:
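As one platform-independent option, Python's standard uuid module can generate a version 4 UUID. A minimal sketch:

```python
import uuid

# uuid4() returns a random (version 4) UUID.
rule_id = uuid.uuid4()
print(rule_id)
```

The printed value can be pasted into the id field of the rule.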
The corresponding Sigma rule file has the following content:
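A sketch of what such a rule file could contain, assembled from the details given in this article: the shell service as logsource and the search term that is visible in the generated query further down. The title, description, author, level and reference URL are placeholders, and product: linux is an assumption. The snippet renders the YAML from Python and generates the UUID on the fly.

```python
import uuid
from datetime import date

# Template for the rule file. Everything except the logsource service and the
# keyword is a placeholder chosen for illustration.
RULE_TEMPLATE = """\
title: Suspicious Installation of Zenmap
id: {rule_id}
description: Detects the installation of zenmap via apt-get
references:
    - https://example.com/placeholder-reference
author: Jane Doe
date: {today}
modified: {today}
logsource:
    product: linux
    service: shell
detection:
    keywords:
        - '*apt-get install zmap*'
    condition: keywords
falsepositives:
    - Unknown
level: medium
"""


def render_rule() -> str:
    """Fill in a fresh version 4 UUID and today's date."""
    return RULE_TEMPLATE.format(
        rule_id=uuid.uuid4(),
        today=date.today().strftime("%Y/%m/%d"),
    )


rule_text = render_rule()
print(rule_text, end="")
```

The keyword is quoted because an unquoted leading asterisk has a special meaning in YAML.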
Please note that all rules nested in data/sigma/rules are used for the Sigma analyzer.
Verify the rule file
If you installed Sigma, you can also verify the written rule with sigmac:
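A sketch of such a call, built in Python so that it can be reused in scripts. It uses the -t (target) and -c (config) options of sigmac. The es-qs target and the location of sigma_config.yml relative to the working directory are assumptions; adjust both to your setup.

```python
import shlex


def build_sigmac_command(rule_path: str, config_path: str, target: str = "es-qs") -> list:
    """Return the sigmac argument list for converting a single rule."""
    return ["sigmac", "-t", target, "-c", config_path, rule_path]


command = build_sigmac_command(
    "data/sigma/rules/linux/lnx_susp_zenmap.yml",
    "sigma_config.yml",  # assumed to be in the working directory
)

# Print the command line so that it can be copied into a shell.
print(shlex.join(command))

# To run it directly from Python instead (requires sigmac to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

If the rule is valid, sigmac prints the translated query; otherwise it reports an error.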
Run the Sigma analyzer
If you now run the analyzer from the Web UI and watch it run, you will see the following in the Celery output of e.g. your Docker instance:
The important thing to note here is the generated Timesketch query:
(data_type:("shell\:zsh\:history" OR "bash\:history\:command") AND "*apt\-get\ install\ zmap*")
As you see, it looks for two specific data types in combination with the term we put in the rule.
If you do not see a Tag assigned to the event, try running the query directly in your Timesketch UI.
Adjust field mapping
In this case, the "data_type":"apt:history:line" of the test event was not in the sigma_config.yml, so the query did not match the event. To add the data_type, modify the sigma_config.yml:
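The effect of the change can be sketched with a Python dict that mirrors the YAML layout of the logsources section. Only the shell entry is shown, and adding the new data_type to that entry (rather than to a separate logsource) is an assumption based on the rule using service: shell.

```python
# Sketch of the mapping change; the dict mirrors the YAML structure.
config = {
    "logsources": {
        "shell": {
            "service": "shell",
            "conditions": {
                "data_type": ["shell:zsh:history", "bash:history:command"],
            },
        },
    },
}

# Mocked event, reduced to the field that matters here.
event = {"data_type": "apt:history:line"}


def is_covered(config: dict, service: str, event: dict) -> bool:
    """Return True if the event's data_type is mapped for the service."""
    mapped = config["logsources"][service]["conditions"]["data_type"]
    return event["data_type"] in mapped


before = is_covered(config, "shell", event)

# The actual fix: add the missing data_type to the shell logsource.
config["logsources"]["shell"]["conditions"]["data_type"].append(
    "apt:history:line"
)

after = is_covered(config, "shell", event)
print(before, after)  # False True
```

In the YAML file, the same change is one additional list entry under the data_type condition of the shell logsource.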
Running the Sigma analyzer (second iteration)
If you run the analyzer from the UI again and watch it run, you will see the following:
If you refresh the UI, you will see the Tag attached to the event as expected:
Extending the rule
Now if we want to extend the rule to cover other tools, we could do that by adding more rows like:
But of course then the name of the rule should be changed to be more general.
3. Re-playing above
The rule and the test event are both available in the Timesketch GitHub repository. There is also a dedicated sketch on the Timesketch demo server that hosts the event data. Use the username "demo" and the password "demo" to access the sketch called "Sigma analyser demo".
Before placing new rules into production systems, it is advisable to test them against test data or historical data to reduce noise and improve coverage of edge cases.
This post covered the basics of developing and using Sigma rules for Timesketch. For more information, or as a refresher, also see Sigma Analyser in Timesketch. It will help you learn more about Sigma in Timesketch, e.g. where to put the rules and how to verify that rules do not cause exceptions before putting them into production.
For any questions about Timesketch and/or the Sigma usage in Timesketch, visit the Timesketch channel in the Open Source DFIR Slack community or raise an issue at https://github.com/google/timesketch/issues.
For further reading on Sigma, please visit the GitHub repository of the project: https://github.com/Neo23x0/sigma.