Let's talk about time
Goal
This article explains the importance and challenges of time in digital forensics and incident response. You will learn how time is handled in various open source tools and get practical tips on managing time in your environment.
What is time?
Wikipedia defines time as "... the indefinite continued progress of existence and events that occur in an apparently irreversible succession from the past, through the present, into the future." It is foundational to almost every interaction in modern society. It is also essential to modern information technology’s function and interactions with humans as well as other systems.
How is time measured
The question of how time is measured could alone fill hundreds of pages. It kept Galileo, Newton and Einstein busy for a lifetime. For the sake of this article we will simplify the consideration of time to the unit of measurement we call a “second” on planet Earth.
A second can be defined as: "the duration of 9,192,631,770 [cycles] of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium 133 atom". It is possible to break up a second into smaller chunks, like milliseconds or microseconds, as well as group it into bigger chunks, like minutes, hours, days, and years.
How to express time
Children at around the age of two begin to learn the concept of time, initially as a relative quality. For example, they’re told that they can play for another 10 minutes before it is bedtime. They also learn the concept of tomorrow, yesterday and other relative time expressions.
Eventually there is a need to indicate an exact moment. This is where it starts to get tricky. In some countries, bedtime might be written as 20:10 and breakfast as 07:30. In other countries the same times would be written as 08:10 PM (“post meridiem” or after noon) and 07:30 AM (“ante meridiem” or before noon).
Then there is the concept of specific weekdays: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday. Furthermore there are months and years. To express an exact day all below are valid and point to the same day:
05.04.2021
Monday April 5th 2021
Time zones
Let's assume a child lives in Zurich and has relatives abroad in Los Angeles, and they want to schedule a video call after dinner, but before bedtime. Since countries around the globe want to express the time of the day relative to the sun (meaning for example that 10 AM should be in the mid-morning), the concept of time zones was created. The video call appointment could be 07:30 PM for the child in Zurich, but 10:30 AM for the relatives in their respective local time.
Some countries practice a time shift (one hour in most cases) twice a year to adjust to longer and shorter days. This concept is called daylight saving time, or daylight time.
To coordinate times using different clock times, you need to include time zone information. So the video call appointment would be 07:30 PM CEST / 10:30 AM PDT. CEST is the Central European Summer Time and PDT the Pacific Daylight Time.
Local time zones and daylight savings can both change over the course of history. A country might decide to abandon daylight savings after a certain year. Iceland for example has not used daylight saving time since 1968.
Since different time zones can introduce ambiguity (especially in combination with daylight saving time), Coordinated Universal Time (UTC) was established as a common reference. If you want to be less ambiguous, use UTC where possible so that everyone can calculate their corresponding local time.
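The video call example can be sketched in a few lines of Python. This is a minimal illustration, assuming fixed offsets for CEST (UTC+2) and PDT (UTC-7) and an arbitrarily chosen summer date. Real code would more likely use a time zone database so that daylight saving rules are applied automatically.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the two abbreviations used in the text (assumed for this sketch).
CEST = timezone(timedelta(hours=2), "CEST")   # Central European Summer Time
PDT = timezone(timedelta(hours=-7), "PDT")    # Pacific Daylight Time

# The date is hypothetical; only the time of day comes from the example.
call_zurich = datetime(2021, 7, 1, 19, 30, tzinfo=CEST)

# The same instant expressed in Los Angeles time and in UTC.
call_los_angeles = call_zurich.astimezone(PDT)
call_utc = call_zurich.astimezone(timezone.utc)

print(call_zurich.strftime("%I:%M %p %Z"))
print(call_los_angeles.strftime("%I:%M %p %Z"))
print(call_utc.isoformat())
```

All three values describe the same moment; only the representation changes.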
Time in digital systems
In computing, there is a distinction between wall clock time, CPU time and user time. Wall clock time is the actual elapsed time taken to perform a job, as observed by a human. CPU time is the time the CPU spent on all tasks related to the user code; it does not account for time spent waiting, for example on file I/O. User time is the part of that time during which the actual code from the user was running. For the most part: wall time > CPU time > user time
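The difference between these clocks can be observed directly. The following is a small sketch using Python's standard library; the function name busy_then_idle and its parameters are made up for illustration.

```python
import os
import time

def busy_then_idle(n=200_000, pause=0.2):
    """Burn some CPU, then wait without using the CPU."""
    total = sum(i * i for i in range(n))
    time.sleep(pause)
    return total

wall_start = time.perf_counter()   # wall clock (elapsed) time
cpu_start = time.process_time()    # user + system CPU time of this process
user_start = os.times().user       # user CPU time only

busy_then_idle()

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start
user_elapsed = os.times().user - user_start

print(f"wall: {wall_elapsed:.3f}s  cpu: {cpu_elapsed:.3f}s  user: {user_elapsed:.3f}s")
```

Because the sleep consumes wall clock time but almost no CPU time, the wall clock value should come out clearly larger than the other two.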
Most computers and mobile devices represent time as a number: the count of seconds (or smaller fractions of a second, called ticks) elapsed since a fixed reference date and time. One example is POSIX / UNIX time, which counts the number of seconds since 1 January 1970 00:00:00 UTC.
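As a short illustration, Python's datetime module can convert between POSIX time and a calendar date. The date used here is only an example.

```python
from datetime import datetime, timezone

# POSIX time 0 is the reference point itself.
epoch = datetime.fromtimestamp(0, tz=timezone.utc)

# Convert an example date to POSIX seconds and back again.
moment = datetime(2021, 4, 5, tzinfo=timezone.utc)
posix_seconds = int(moment.timestamp())
round_trip = datetime.fromtimestamp(posix_seconds, tz=timezone.utc)

print(epoch.isoformat())
print(posix_seconds)
print(round_trip.isoformat())
```

Passing an explicit tz avoids silently interpreting the number in the local time zone of the machine running the code.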
Special values
One special value is "infinity". It can be set in various systems to represent either a value greater than all other timestamps (positive infinity / +infinity) or a value lower than all other timestamps (negative infinity / -infinity).
Some systems have a function for now(). This will return the current time in the system's time zone.
To get the current day midnight, use today().
It is also worth reading the documentation for specific tools you might use to understand how they measure time. For example, PostgreSQL has a special time input value called allballs, which stands for 00:00:00.00 UTC. "Balls" is a military and NASA way of referring to zeros (which look like balls). Among USAF personnel it is common practice to refer to aircraft whose tail number is a single number preceded by multiple zeros as "Balls" and the last number of its tail number. When all the digits are zero, it's "all balls". There is also a device for signalling the time, called a time ball.
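Not every environment offers these special values. Python, for instance, has no infinite timestamp. The sketch below uses datetime.min and datetime.max as stand-ins, which is an assumption made for illustration and not a convention taken from any particular tool. It also pins "now" and "today" to UTC explicitly rather than relying on the system time zone.

```python
from datetime import datetime, timezone

# Stand-ins for -infinity / +infinity: the smallest and largest datetime Python can represent.
NEGATIVE_INFINITY = datetime.min.replace(tzinfo=timezone.utc)
POSITIVE_INFINITY = datetime.max.replace(tzinfo=timezone.utc)

# Equivalent of now(), expressed in UTC instead of the system time zone.
now_utc = datetime.now(timezone.utc)

# Equivalent of today(): midnight at the start of the current UTC day.
today_midnight_utc = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)

print(NEGATIVE_INFINITY < now_utc < POSITIVE_INFINITY)
print(today_midnight_utc.isoformat())
```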
Time in DFIR
File date and time values
Files can have multiple date and time values, depending on the operating system and its version. For example on Windows, every file or folder has three date and time values:
Date Created
Date Modified
Date Accessed
These date and time values are typically stored as timestamps, for example the number of seconds since January 1, 1970 00:00:00 UTC (a POSIX timestamp). Each of the file date and time values can be of use in a digital forensic investigation. For example, in a malware investigation the Date Created might tell the investigation team when a system was compromised.
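To see which values your own system reports, a quick check with Python's os.stat can help. This is a minimal sketch; the helper name file_times_utc is hypothetical, and the meaning of st_ctime differs between platforms as noted in the comment.

```python
import os
import tempfile
from datetime import datetime, timezone

def file_times_utc(path):
    """Return the timestamps reported by the operating system for path, in UTC."""
    st = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # On Windows st_ctime has traditionally been the creation time;
        # on Unix it is the time of the last metadata change.
        "changed_or_created": to_utc(st.st_ctime),
    }

# Demonstrate on a throwaway file.
with tempfile.NamedTemporaryFile() as handle:
    times = file_times_utc(handle.name)

for name, value in times.items():
    print(name, value.isoformat())
```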
Recommendation: Look at the most common operating systems in your environment and find out which timestamps (e.g. on NTFS) are collected per file, and then verify that your tools can handle all of them correctly.
Log date and time values
Logs can be kept on the host system, in a central repository, or both. Depending on the setup and configuration, logs collected on a local system can be forwarded to a central repository, either in full or as a subset. Windows Event Logs are an example: often only certain logs are sent to a central repository while the complete logs remain on the host for a period of time.
Without information about when something happened, a log entry would only say "something happened", which would make an investigation much harder. Logs should therefore contain time information, and usually do. Some logs carry a timestamp that records when the entry was written. In some cases a second timestamp is added when the entry was received, for example at a central collection point. This is useful because the difference between the two timestamps gives you the delivery delay.
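Calculating that delay is straightforward once both timestamps carry an offset. The log entry below, including its field names, is entirely hypothetical.

```python
from datetime import datetime

# Hypothetical log entry: written on a host in UTC+2, received by a collector logging in UTC.
entry = {
    "written": "2021-04-05T09:15:02.120+02:00",
    "received": "2021-04-05T07:15:04.870+00:00",
}

written = datetime.fromisoformat(entry["written"])
received = datetime.fromisoformat(entry["received"])

# Subtracting two offset-aware datetimes compares the underlying instants,
# so the differing offsets do not distort the result.
delay = received - written

print(delay.total_seconds())
```

If either timestamp lacked its offset, the same subtraction would be off by two hours in this example, or would fail outright.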
Recommendation: Go to your central log collection / SIEM tool and review what time information you have for each entry. Another recommendation is to store date and time values in UTC, especially with a central server. Storing local time is a useful addition.
Events in the real world
Besides digital artifacts and logs, real world events that happen outside of systems can be of relevance to an investigation. For example, a remediation event happened at a certain time. If you write that down in your toolset, every suspicious event after that time should be treated with higher urgency as it could indicate an aspect of a compromise that hasn’t been remediated.
Another example of a real world event could be the publication of a news article. Such an event is usually not captured in IT security tools, but it could lead opportunistic attackers who are unhappy with the topic of the article to start attacking your environment. One example is the blog of Brian Krebs, who suffered a DDoS attack and assumed it was a reaction to his previous reporting. Being able to capture such events will enable you to tell a tangible story once your investigation is concluded.
Time division
In DFIR, seconds are often not sufficiently granular to describe events that occurred on a computer. For example, malware might execute several system calls within 100 milliseconds and rename and delete files relevant for analysis. If your tools only present timestamps with one-second granularity, it will be significantly harder (and in some cases impossible) to reconstruct the order in which events occurred. On modern computers several million timestamps can be updated in the same second.
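The effect of truncation can be shown with three made-up events that all fall within the same second. This is only a sketch; the event names and timestamps are invented.

```python
from datetime import datetime

# Hypothetical events, all within the same second.
events = [
    ("file renamed", "2021-04-05T10:00:00.120000"),
    ("file deleted", "2021-04-05T10:00:00.480000"),
    ("process started", "2021-04-05T10:00:00.015000"),
]

parsed = [(datetime.fromisoformat(ts), name) for name, ts in events]

# With sub-second precision the order can be reconstructed.
precise_order = [name for _, name in sorted(parsed)]

# Truncated to whole seconds, all three timestamps collapse into one value.
truncated = {ts.replace(microsecond=0) for ts, _ in parsed}

print(precise_order)
print(len(truncated))
```

Once the timestamps are truncated, nothing in the data tells you whether the process started before or after the file was deleted.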
Recommendation: Run your normal forensics workflow on one known piece of evidence and compare the timestamps in the original piece of evidence (e.g. E01 image) with the output (e.g. csv). In this test, pay attention to both the datetime storage granularity (how many decimal places, 0.###, the value can store) and the datetime value granularity (how many decimal places are actually used by the software that wrote the value).
Manipulating timestamps
Manipulating timestamps is a well-known technique used by adversaries to hide evidence or interfere with investigations. For example, an attacker could place a file in a folder and then manipulate its “Date Created” to match the legitimate files in that folder.
This technique is known as timestomping (ID: T1070.006 in the MITRE ATT&CK® framework). To learn more about timestomping, have a look at the forensicswiki article, which also contains a practical example.
Recommendation: Let someone (maybe your red team) try to manipulate the host time and see if you have rules in place to detect it and how your logs reflect it. That goes for the time manipulating event itself, as well as the following events collected.
Time in DFIR tools
DFIR tools face three challenges when it comes to time:
Finding time information
Parsing time
Correlating events
Finding time information
In some cases, the presence of time information might be obvious, either by name or by documentation. One example of such documentation is the Microsoft MSDN article about File Times, which gives a detailed overview of file times on Microsoft-based systems.
Other time information is either not documented or encoded. One example is the following URL that directs to a tweet:
https://twitter.com/alexanderjaeger/status/1289184031737667586
The number at the end of the URL, 1289184031737667586, does not look like a timestamp. However, running the URL through unfurl reveals more information.
Just by looking at the number, it would be nearly impossible to know that there is time information embedded. The piece of information here is documented as Twitter snowflake.
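The embedded time can also be extracted by hand. The sketch below relies on two assumptions taken from the public description of Twitter snowflake IDs rather than from this article: the upper bits above the lowest 22 hold a millisecond counter, and that counter starts at the epoch constant shown. The function name is made up for illustration.

```python
from datetime import datetime, timezone

# Assumed snowflake epoch: milliseconds between the Unix epoch and the snowflake start.
TWITTER_EPOCH_MS = 1288834974657

def snowflake_to_datetime(snowflake_id):
    """Extract the embedded creation time from a snowflake ID as a UTC datetime."""
    # Drop the lowest 22 bits (worker and sequence data), keep the millisecond counter.
    milliseconds = (snowflake_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)

tweet_time = snowflake_to_datetime(1289184031737667586)
print(tweet_time.isoformat())
```

For the tweet above, this yields a time on 31 July 2020 (UTC).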
Parsing time
Once your tool has detected something as time information, the next challenge is to parse it. The same timestamp (Wed 01 Apr 2020 00:00:00 UTC) can be represented by many different strings.
If you are writing a tool, depending on the language, there are built-in functions to parse timestamps. For example, Python has datetime. The challenge, however, is to detect the format.
Such representations also highlight an issue: for an international audience, it is not obvious from 04-01-2020 what the month is, in contrast to 2020-04-01.
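This ambiguity is easy to demonstrate by trying several candidate formats against the same string. The helper parse_all and the list of formats are illustrative choices, not a complete format detector.

```python
from datetime import datetime

def parse_all(text, formats):
    """Return every interpretation of text that one of the formats accepts."""
    results = {}
    for fmt in formats:
        try:
            results[fmt] = datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format does not match the string
    return results

CANDIDATES = ["%m-%d-%Y", "%d-%m-%Y", "%Y-%m-%d"]

ambiguous = parse_all("04-01-2020", CANDIDATES)
clear = parse_all("2020-04-01", CANDIDATES)

print(ambiguous)
print(clear)
```

The first string parses successfully as both 1 April and 4 January, so a tool has to rely on context to pick one. The second string matches only a single format.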
If you are writing a piece of software that produces time information, use one unambiguous format (for example ISO 8601), document it and stick to it. If you need to change the format, make sure to deprecate the old one with enough lead time, documentation, and error messaging.
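In Python, for example, producing and re-reading such a format takes one call each. This is a small sketch; fixing the precision with timespec is a design choice for consistent output, not a requirement of the standard.

```python
from datetime import datetime, timezone

moment = datetime(2020, 4, 1, 0, 0, 0, tzinfo=timezone.utc)

# Always emit the offset and a fixed number of fractional digits.
iso_text = moment.isoformat(timespec="microseconds")

# The same string can be parsed back without guessing the format.
restored = datetime.fromisoformat(iso_text)

print(iso_text)
print(restored == moment)
```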
Correlating events
Once you have different pieces of information from different sources, you want to correlate these. Luckily there are already tools that help you with that.
Examples are log2timeline / plaso ("super timeline all the things") and Timesketch analyzers, which automate the process of evidence handling and normalise the information across the dataset.
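The underlying idea of normalising and merging can be sketched in a few lines. This is not how plaso or Timesketch are implemented; the sources, events, and the normalise helper are hypothetical and only illustrate why a common time reference matters.

```python
from datetime import datetime, timezone

# Hypothetical events from two sources that log in different time zones.
web_log = [
    ("2021-04-05T12:00:03+02:00", "web", "login from new IP"),
]
host_log = [
    ("2021-04-05T10:00:01+00:00", "host", "suspicious.exe created"),
    ("2021-04-05T10:00:05+00:00", "host", "outbound connection"),
]

def normalise(events):
    """Parse each timestamp and convert it to UTC."""
    return [
        (datetime.fromisoformat(ts).astimezone(timezone.utc), source, message)
        for ts, source, message in events
    ]

# One timeline across all sources, ordered by the actual instant.
timeline = sorted(normalise(web_log) + normalise(host_log))

for when, source, message in timeline:
    print(when.isoformat(), source, message)
```

Sorting the raw strings instead would place the web event last, even though it happened between the two host events.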
Recommendations
Verify your tools and software behavior
Sometimes software behaves differently from what you would expect. For example, in certain versions of Microsoft Office Outlook, OLE compound file attachments are temporarily written to disk (in the SecureTemp folder). You would expect the file write date to point to the time when the attachment was opened. In reality, the dates and times stored inside the Office document attached to the email are used to set the dates and times of the file system entry.
Time server (NTP)
To make the most of time as evidence in investigations, it is recommended to set up a reference time server in your environment and have all devices use it. NTP has its own challenges, including clock drift, correction and polling delays. If you are just getting started with NTP, read the man page of your NTP server (e.g. debian man page).
IETF has published a very good Memo on Network Time Protocol Best Current Practices which covers setup, operation and securing the NTP setup in your network.
Time zone settings
It is recommended to set all devices, where possible, to write logs in UTC. If that is not possible or is inconvenient, ensure that all logs produced and all artifacts you collect contain the time zone information.
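For software you write yourself, this can often be done in the logging configuration. The following is a minimal sketch for Python's logging module; the logger name and the message are made up.

```python
import logging
import time

# Render the timestamp in UTC and mark it with a trailing "Z".
formatter = logging.Formatter(
    fmt="%(asctime)sZ %(levelname)s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
formatter.converter = time.gmtime  # use UTC instead of the local time zone

handler = logging.StreamHandler()
handler.setFormatter(formatter)

logger = logging.getLogger("dfir_example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("collection started")
```

Without the converter line, the same configuration would write local time, and the trailing "Z" would then be misleading.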
Time information from Third parties
If you receive information from systems or people outside your organisation, double check all points above. If the provided information shows a callback to a malicious domain in PST (UTC-8) instead of UTC, your analyst might look at the wrong dataset when trying to confirm or find the actual infected host, and wrongly close the case.
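The size of such a mistake is easy to quantify. The sketch below uses a hypothetical callback time reported in PST and compares the correct conversion with what happens when the value is mistakenly read as UTC.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")

# Hypothetical callback time as reported by a third party, in PST.
reported = datetime(2021, 1, 15, 22, 30, tzinfo=PST)

# Correct conversion to UTC.
as_utc = reported.astimezone(timezone.utc)

# The mistake: treating the reported clock time as if it were already UTC.
wrong_guess = reported.replace(tzinfo=timezone.utc)

print(as_utc.isoformat())
print(as_utc - wrong_guess)
```

In this example the analyst would search eight hours too early, and on the wrong calendar day.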