Set up a development environment for Timesketch

Background

This article walks you through the process of creating an environment in which you can develop Timesketch.


The target audience for this blogpost is an engineer who is familiar with Python, git and GitHub and has some basic understanding of operating systems as well.

What is Timesketch?


Timesketch is an open source tool for collaborative (digital) forensic timeline analysis. Those timelines can be from separate systems and investigated by multiple analysts in parallel. Timesketch is written in Python 3. Elasticsearch is used as the storage backend together with a SQL database to store additional attributes and metadata.


One of the benefits of open source digital forensic and incident response (OSDFIR) software is that users can write code to extend its capabilities and to match their own workflow. This article explains how to set up a development environment and how to contribute to the Timesketch project (also referred to as upstream).

System requirements

  • Linux / macOS (Windows is not yet supported)

  • Minimum of 8 GB RAM (more is preferred)

  • Docker


NOTE: It is not recommended to try to run on a system with less than 8 GB of RAM.

Install software

1. Install Docker (depends on OS)

Docker is used to easily deploy mostly self-contained environments without the need to change the host environment, which saves a lot of time in making sure you have a working build/run environment.


Timesketch provides pre-configured Docker containers for production and development purposes. To install Docker, follow the official instructions here.


Test your Docker installation by running:


sudo docker run hello-world

Hello from Docker!

This message shows that your installation appears to be working correctly.


If this is not the case, please go back and troubleshoot your Docker installation and make sure Docker is working.

Adjust memory settings for Docker

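The Docker containers that will run are very memory intensive. Elasticsearch in particular relies heavily on memory-mapped files, so it is recommended to raise the kernel limit on the number of memory map areas a process may use (vm.max_map_count):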


sudo sysctl -w vm.max_map_count=262144

sysctl -n vm.max_map_count

262144


NOTE: If your output does not match the above, please do not continue, as it might result in the Elasticsearch Docker container not running as expected.


For different operating systems, please have a look here.
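
The check above can also be scripted. The following is a minimal Python sketch, assuming a Linux host where the kernel exposes the setting at /proc/sys/vm/max_map_count. The function names are illustrative and not part of Timesketch, and treating any value of at least 262144 as sufficient is an assumption (the article only checks for that exact value).

```python
from pathlib import Path

# Value used in this article.
REQUIRED_MAX_MAP_COUNT = 262144

# Assumption: a Linux host, where the sysctl value is exposed at this path.
SYSCTL_PATH = Path("/proc/sys/vm/max_map_count")


def is_map_count_sufficient(value, required=REQUIRED_MAX_MAP_COUNT):
    """Return True if the given vm.max_map_count value is high enough."""
    return int(value) >= required


def read_max_map_count(path=SYSCTL_PATH):
    """Read the current value, or return None if it is not available."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_max_map_count()
    if current is None:
        print("Could not read vm.max_map_count on this system.")
    elif is_map_count_sufficient(current):
        print(f"vm.max_map_count is {current}: OK")
    else:
        print(f"vm.max_map_count is {current}: raise it to {REQUIRED_MAX_MAP_COUNT}")
```

On systems without that path (for example macOS), the script only reports that the value could not be read.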

2. Install Docker Compose

Docker Compose is used to manage the Timesketch development Docker container. It allows developers to easily update the Docker build/run environment configuration if needed.


To install Docker Compose, follow the official instructions here.


Test your installation of docker-compose by running:


docker-compose --version

docker-compose version 1.25.4, build 8d51620a 


If this is not the case, please go back and troubleshoot your installation of Docker Compose and make sure Docker Compose is working.
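
If you want to check the installed version from a script, the version line can be parsed. This is a small sketch that assumes the output format shown above; the helper name parse_compose_version is hypothetical, and other Docker Compose releases may format the line differently.

```python
import re

# Matches the version line format shown in this article, e.g.
# "docker-compose version 1.25.4, build 8d51620a".
VERSION_PATTERN = re.compile(r"version v?(\d+)\.(\d+)\.(\d+)")


def parse_compose_version(output):
    """Return (major, minor, patch) from the version output, or None."""
    match = VERSION_PATTERN.search(output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


sample = "docker-compose version 1.25.4, build 8d51620a"
print(parse_compose_version(sample))  # (1, 25, 4)
```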

3. Create a personal fork of Timesketch

A personal fork of Timesketch on GitHub is needed to contribute code to the upstream Timesketch project. The flow is as follows: you push code to your personal fork and, once you are happy with it, you create a pull request asking the Timesketch maintainers to add your changes to the upstream project.

 

  1. Log in to GitHub

  2. Navigate to https://github.com/google/timesketch

  3. Click the “Fork” button in the upper right corner.

  4. Go to https://github.com/YOURUSERNAME/timesketch

4. Clone your personal Timesketch fork on your local system

To be able to make changes to the Timesketch code, make a clone of your personal Timesketch fork on your local system by running:


git clone https://github.com/<YOUR GITHUB USERNAME>/timesketch.git

5. Configure git to sync upstream changes

Adding the upstream repo to your local repository will allow you to easily update your local repo (pull changes) with upstream at a later point.


cd timesketch

git remote add upstream https://github.com/google/timesketch.git


To pull changes from upstream and apply local changes on top of them:


git fetch upstream && git pull --rebase upstream master
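
To confirm that the upstream remote is configured as described, you can inspect the output of git remote -v. The sketch below parses that output in Python; the function names are illustrative, and the sample text uses the YOURUSERNAME placeholder from this article rather than real output.

```python
UPSTREAM_URL = "https://github.com/google/timesketch.git"


def parse_remotes(remote_output):
    """Map each remote name to its fetch URL from `git remote -v` output."""
    remotes = {}
    for line in remote_output.splitlines():
        parts = line.split()
        # Each line looks like "<name> <url> (fetch)" or "<name> <url> (push)".
        if len(parts) == 3 and parts[2] == "(fetch)":
            remotes[parts[0]] = parts[1]
    return remotes


def has_upstream(remote_output, expected_url=UPSTREAM_URL):
    """Return True if a remote named 'upstream' points at the expected URL."""
    return parse_remotes(remote_output).get("upstream") == expected_url


# Placeholder input in the format printed by `git remote -v`.
sample = (
    "origin\thttps://github.com/YOURUSERNAME/timesketch.git (fetch)\n"
    "origin\thttps://github.com/YOURUSERNAME/timesketch.git (push)\n"
    "upstream\thttps://github.com/google/timesketch.git (fetch)\n"
    "upstream\thttps://github.com/google/timesketch.git (push)\n"
)
print(has_upstream(sample))  # True
```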


Deploy and start containers

It is now time to deploy your development environment! Docker Compose will set up four containers:


  • Elasticsearch

  • PostgreSQL

  • Redis

  • Timesketch


Once those four containers are started, some services inside the Timesketch container have to be run manually to get Timesketch running. Timesketch relies on a web server to serve the application and on workers to import uploaded data and run analyzers on that data.


Security remark: The datastore containers don't expose any ports by default because they communicate using the internal Docker network. The only port that is exposed is the Timesketch web server at 127.0.0.1:5000.

1. Deploy containers

The following command will deploy and start your different containers.


cd docker/dev

sudo /usr/local/bin/docker-compose up -d


To verify all worked well run the following. 


sudo docker container list

CONTAINER ID        IMAGE                                                 COMMAND                  CREATED             STATUS              PORTS                      NAMES

9f9da2048f2b        dev_timesketch                                        "/docker-entrypoint.…"   20 seconds ago      Up 18 seconds       127.0.0.1:5000->5000/tcp   dev_timesketch_1

31c965426f5d        postgres                                              "docker-entrypoint.s…"   21 seconds ago      Up 19 seconds       5432/tcp                   dev_postgres_1

cb16075eab09        docker.elastic.co/elasticsearch/elasticsearch:7.6.0   "/usr/local/bin/dock…"   21 seconds ago      Up 19 seconds       9200/tcp, 9300/tcp         dev_elasticsearch_1

197f9f7e5591        redis                                                 "docker-entrypoint.s…"   21 seconds ago      Up 19 seconds       6379/tcp                   dev_redis_1



NOTE: If your output does not list all four containers with the status "Up", please do not continue; troubleshoot the containers first, as Timesketch will not run as expected otherwise.
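
The same verification can be automated by listing only the container names, for example with sudo docker container list --format '{{.Names}}', and checking that each expected service is present. The following is a sketch under the assumption that container names contain the service name, as in the listing above; the function name is hypothetical.

```python
# Services this article expects Docker Compose to start.
EXPECTED_SERVICES = ("elasticsearch", "postgres", "redis", "timesketch")


def missing_services(names_output, expected=EXPECTED_SERVICES):
    """Return the expected services that have no matching container name.

    names_output is the text printed by
    `docker container list --format '{{.Names}}'` (one name per line).
    """
    names = [line.strip() for line in names_output.splitlines() if line.strip()]
    return [svc for svc in expected if not any(svc in name for name in names)]


# Container names taken from the listing in this article.
sample = "dev_timesketch_1\ndev_postgres_1\ndev_elasticsearch_1\ndev_redis_1\n"
print(missing_services(sample))  # []
```

An empty list means every expected container is running; any names in the list point at the service to troubleshoot.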


2. Start Timesketch

When all containers are running, it is time to start the development server and the ingestion workers. You want to run these manually, as you need to see STDOUT/STDERR while developing.


  1. First you need to know the container identifier (CONTAINER_ID) of the Timesketch container:


CONTAINER_ID="$(sudo docker container list -f name=dev_timesketch -q)"

echo $CONTAINER_ID

88850baaaaa7


  2. Now run:


sudo docker logs -f $CONTAINER_ID


This command might take a while to complete as it will pull several containers and install software in those various Docker containers, so have a cup of tea or coffee while waiting.


  3. After that you should see:


Timesketch development server is ready!


If that is not the case, please have a look at the messages in the terminal. Occasionally software cannot be installed, usually because network problems prevent Docker from accessing the internet.
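
If you would rather have a script wait for the ready message than watch the log yourself, the log lines can be scanned as they arrive. This is a minimal sketch; wait_for_ready is a hypothetical helper, and the demonstration input is a placeholder list rather than real log output.

```python
# The message printed by the development container, as shown in this article.
READY_MESSAGE = "Timesketch development server is ready!"


def wait_for_ready(lines, ready_message=READY_MESSAGE):
    """Consume log lines until the ready message appears.

    Returns True as soon as the message is seen and False if the
    stream of lines ends first.
    """
    for line in lines:
        if ready_message in line:
            return True
    return False


# Placeholder input; in practice the lines would come from `docker logs -f`.
sample_lines = ["(placeholder log line)", READY_MESSAGE]
print(wait_for_ready(sample_lines))  # True
```

One way to feed it real data is to start sudo docker logs -f $CONTAINER_ID with Python's subprocess.Popen in text mode and pass the process's stdout to the function.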


For the next steps you need two parallel shells.

3. Start the webserver (first shell)

Start the development web server. Whenever you change the Python code in your repository, the server restarts automatically.


# Shell one

CONTAINER_ID="$(sudo docker container list -f name=dev_timesketch -q)"

echo $CONTAINER_ID

sudo docker exec -it $CONTAINER_ID gunicorn --reload -b 0.0.0.0:5000 --log-file - --timeout 120 timesketch.wsgi:application


As you can see, we set the CONTAINER_ID variable again, because shell variables are not carried over to new shells. Also note that the web server runs in the foreground: if you close the shell, the web server stops and you need to start it again in a new shell.


After a few moments, you should see something like:


[2020-05-06 14:36:30 +0000] [71] [INFO] Starting gunicorn 19.10.0

[2020-05-06 14:36:30 +0000] [71] [INFO] Listening at: http://0.0.0.0:5000 (71)

[2020-05-06 14:36:30 +0000] [71] [INFO] Using worker: sync

[2020-05-06 14:36:30 +0000] [80] [INFO] Booting worker with pid: 80


That means your webserver is running. Congratulations! =)
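
A quick way to confirm from the host that the web server is reachable is to try a TCP connection to the exposed port. The sketch below assumes the default address 127.0.0.1:5000 used in this article; is_port_open is an illustrative helper, and a successful connection only shows that something is listening, not that Timesketch is fully functional.

```python
import socket


def is_port_open(host="127.0.0.1", port=5000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or unreachable host.
        return False


if __name__ == "__main__":
    print("web server reachable:", is_port_open())
```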

4. Start the ingestion workers

Start the background ingestion workers (Celery) if you plan to upload timelines or develop new analyzers. NOTE: Celery does not auto-reload when you make changes; you have to press CTRL-C and restart it manually while you develop.


# Shell two

CONTAINER_ID="$(sudo docker container list -f name=dev_timesketch -q)"

echo $CONTAINER_ID

sudo docker exec -it $CONTAINER_ID celery -A timesketch.lib.tasks worker --loglevel info


After a few moments, you should see the following:


/usr/local/lib/python3.6/dist-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is

absolutely not recommended!


Please specify a different user using the --uid option.


User information: uid=0 euid=0 gid=0 egid=0


  uid=uid, euid=euid, gid=gid, egid=egid,

 

 -------------- celery@e1a719982a37 v4.4.0 (cliffs)

--- ***** ----- 

-- ******* ---- Linux-4.19.76-linuxkit-x86_64-with-Ubuntu-18.04-bionic 2020-05-06 14:39:16

- *** --- * --- 

- ** ---------- [config]

- ** ---------- .> app:         timesketch:0x7fdf2ec45128

- ** ---------- .> transport:   redis://redis:6379//

- ** ---------- .> results:     redis://redis:6379/

- *** --- * --- .> concurrency: 2 (prefork)

-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)

--- ***** ----- 

 -------------- [queues]

                .> celery           exchange=celery(direct) key=celery

                


[tasks]

  . timesketch.lib.tasks.run_csv_jsonl

  . timesketch.lib.tasks.run_email_result_task

  . timesketch.lib.tasks.run_index_analyzer

  . timesketch.lib.tasks.run_plaso

  . timesketch.lib.tasks.run_sketch_analyzer

  . timesketch.lib.tasks.run_sketch_init


[2020-05-06 14:39:16,587: INFO/MainProcess] Connected to redis://redis:6379//

[2020-05-06 14:39:16,613: INFO/MainProcess] mingle: searching for neighbors

[2020-05-06 14:39:17,656: INFO/MainProcess] mingle: all alone

[2020-05-06 14:39:17,724: INFO/MainProcess] celery@e1a719982a37 ready.


NOTE: The warning about running a worker with superuser privileges can be ignored, as we do not plan to expose the system to anything other than localhost (127.0.0.1, ::1). If you plan to expose Timesketch to any other IP address, please read up on securing Docker containers first.

Using Timesketch

  1. Open your browser and visit: http://127.0.0.1:5000


You should see the Timesketch login page.


  2. The default credentials for the development Docker container are:

Username: dev

Password: dev 

Add your first data

  1. Click on "create sketch"

  2. Give a name and a description for your sketch

  3. Click Save to create your sketch

  4. Your sketch should now show up.

  5. Now you can add your timelines.

Make changes

Open another shell. If you prefer a code editor, your workflow may differ; for demonstration purposes, this article does everything in the shell.


  • Navigate to your Timesketch folder.


Most of the code you will want to modify is in the timesketch/timesketch/ folder.

Make a branch

It is recommended to have one branch per feature you are working on, to keep your changes separated from each other. To achieve this, we will create a new branch called demo:


git checkout -b demo

Switched to a new branch 'demo'

Change code

For demonstration purposes, we will modify the code of the file "timesketch/views/auth.py"


vi timesketch/views/auth.py



And change from:


@auth_views.route('/login/', methods=['GET', 'POST'])

def login():

    """Handler for the login page view.


To


@auth_views.route('/login/', methods=['GET', 'POST'])

def login():

    """Handler for the login page view.

    There are three ways of authentication.

    1) Google Cloud Identity-Aware Proxy.

    2) If Single Sign On (SSO) is enabled in the configuration and the

       environment variable is present, e.g. REMOTE_USER then the system will

       get or create the user object and setup a session for the user.

    3) Local authentication is used if SSO login is not enabled. This will

       authenticate the user against the local user database.

    Returns:

        Redirect if authentication is successful or template with context

        otherwise.

    """

    print("Timesketch rocks")


This prints "Timesketch rocks" to the web server log every time the login view handles a request.

Save the file.


  • Go back to your browser

  • Click logout

  • Log in again


Everything seems the same, but if you now go to your shell where you started the webserver, you will see:


[2020-05-06 15:08:26 +0000] [99] [INFO] Worker reloading: /usr/local/src/timesketch/timesketch/views/auth.py modified

[2020-05-06 15:08:26 +0000] [99] [INFO] Worker exiting (pid: 99)

[2020-05-06 15:08:26 +0000] [103] [INFO] Booting worker with pid: 103

Timesketch rocks

Timesketch rocks


Let's look at the output step by step:

  • The worker detects a change in the auth.py file.

  • The worker exits.

  • A new worker starts.

  • You logged in, and the message was printed each time the login view handled a request.


It is important to understand that you do not have to restart anything yourself. The web server picks up code changes on its own; if a change breaks something, error messages are shown in the web server shell.


Note: If you change or add code to an analyzer, you need to restart your Celery worker in your second shell.

Commit changes to your branch

Once you are happy with your changes, run:


git status

On branch demo

Changes not staged for commit:

  (use "git add <file>..." to update what will be committed)

  (use "git restore <file>..." to discard changes in working directory)

modified:   timesketch/views/auth.py


So you can now either commit the changed auth.py or restore the original one. For demonstration purposes, let's assume we want to commit.


To commit your changes with a commit message, run


git commit timesketch/views/auth.py -m "for demo purposes only"

[demo 8bbaf58] for demo purposes only

 1 file changed, 1 insertion(+)


Push your changes to your fork


git push --set-upstream origin demo

Enumerating objects: 9, done.

Counting objects: 100% (9/9), done.

Delta compression using up to 4 threads

Compressing objects: 100% (5/5), done.

Writing objects: 100% (5/5), 444 bytes | 88.00 KiB/s, done.

Total 5 (delta 4), reused 0 (delta 0), pack-reused 0

remote: Resolving deltas: 100% (4/4), completed with 4 local objects.

remote: 

remote: Create a pull request for 'demo' on GitHub by visiting:

remote:      https://github.com/<YOUR GITHUB USERNAME>/timesketch/pull/new/demo

remote: 

Create a pull request

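By creating a pull request you are contributing to our code base. So before pressing the button, have a look at https://github.com/google/timesketch/blob/master/CONTRIBUTING.md to understand what it means to contribute code.


To create a pull request, visit the URL shown in the output of git push above (https://github.com/<YOUR GITHUB USERNAME>/timesketch/pull/new/demo) and follow the instructions on GitHub.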


Conclusion

Congratulations, you now have a working setup of Timesketch to develop and add functionality. 

For the future we plan a blog post on creating tests for your Timesketch code, so stay tuned for that.


If you run into problems, take a look at the issues page in the Timesketch GitHub repository to see if other people have seen the issue before. If nothing there helps, ask for help on the Open Source DFIR Slack or open an issue on the tracker.
