Windows Container Forensics
Linux container security has been covered in a number of blog posts and conference presentations, including our previous post about Container Forensics with Docker Explorer. However, when we came across Windows containers during an investigation, we noticed that their implementation was quite different and not well documented from a forensics perspective. Although we found some details about containerised Windows Registry hives on Maxim Suhanov’s blog, dfir.ru, not much had been written about how Windows implements container filesystems.
This post will detail the research process and useful findings about Windows containers. It primarily focuses on the filesystem layers and does not cover containerised registry hives.
One of the most informative resources we found on how Windows containers work was a DockerCon talk from 2016 titled "Windows Server & Docker - The Internals Behind Bringing Docker & Containers to Windows - Black Belt". To summarise, containers traditionally rely on Linux features such as namespaces and cgroups, which are not present in Windows. To work around this, some changes were made to Windows kernel space components to create similar functionality, such as:
Extending job objects to include a "silo" concept to provide resource isolation.
Namespace virtualisation including separate object namespaces.
One of the crucial differences from a forensics perspective is how Windows containers handle filesystems. The DockerCon talk mentions that it is difficult to build a full union filesystem like those used for Linux containers because Windows applications expect certain NTFS features to be present. Instead, Microsoft came up with a hybrid model involving a virtual block device and NTFS partition per container.
While the Windows container API is not publicly documented, Microsoft has provided language bindings in:
Go - under the hcsshim project. This is how the Docker project interfaces with the Windows container API.
Windows containers also offer two different "isolation" modes:
Process isolation, where container processes run on the host kernel.
Hyper-V isolation, where containers run in a minimal Hyper-V virtual machine.
Microsoft also distinguishes between Windows Containers on Windows (WCOW) and Linux Containers on Windows (LCOW); the latter involves running Linux containers under a Hyper-V VM or Windows Subsystem for Linux (WSL). This post mostly focuses on WCOW.
Inspecting Docker Artifacts
Installing Docker and pulling an image
For research purposes we started with a fresh Windows Server 2019 VM, installed Docker by following Microsoft's "Get started - Prep Windows for containers" guide, and then pulled down a nanoserver container image:
On Windows, the Docker root is under c:\ProgramData\docker, shown below with irrelevant directories omitted:
The windowsfilter directory contains container filesystems and will be of interest for forensics. Looking at the directory layout above:
Files: Contains the read-only files for the image layer.
Hives: Contains the base registry hives used for containerised registry hives.
UtilityVM: Files related to the VM for Hyper-V isolation containers.
blank-base.vhdx/blank.vhdx: These are related to the "virtual block device per container" as alluded to in the DockerCon presentation.
layerchain.json: Null in this case, but for a container it references the next layer.
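As a rough illustration of how layerchain.json could be read during an examination, the Python sketch below returns the parent layers recorded for a layer directory. It assumes the file holds either JSON null (as seen for the base image above) or a JSON array of parent layer directory paths; that array format and the function name read_layer_chain are assumptions made for this example, not something taken from Docker documentation.

```python
import json
from pathlib import Path


def read_layer_chain(layer_dir):
    """Return the parent layer paths recorded in <layer_dir>/layerchain.json.

    Assumption: the file contains JSON null for a base layer, or a JSON
    array of parent layer directory paths for anything built on top of it.
    """
    chain_file = Path(layer_dir) / "layerchain.json"
    if not chain_file.exists():
        return []
    # utf-8-sig tolerates an optional byte order mark at the start of the file.
    parents = json.loads(chain_file.read_text(encoding="utf-8-sig"))
    # JSON null is decoded as None; normalise it to an empty list.
    return parents or []
```

Calling read_layer_chain on each subdirectory of an exported windowsfilter directory would give a quick map of which directories are base layers and which point at a parent.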
Next, we ran a container, and created a file for later inspection:
Reviewing the windowsfilter directory:
As expected there is a new subdirectory 5da3305682... for the created container with newly created files sandbox.vhdx and layerchain.json, which references the parent directory:
Based on the DockerCon talk and what we know so far, it appears that Windows containers use differencing vhdx disks to manage the writable "scratch" layer for containers with:
Each container having a writable differencing disk sandbox.vhdx
The parent disk set to the upper layer's blank-base.vhdx
blank-base.vhdx just contains an NTFS volume with an empty WcSandboxState directory. As this disk doesn't contain any layer-related files, it appears that the relationship between these files is more to reduce the size of the per-container sandbox.vhdx rather than to manage any kind of container/image layer relationships.
blank.vhdx is just a blank differencing vhdx disk with its parent set to blank-base.vhdx. When a new container is created from this image, this file is copied and renamed to sandbox.vhdx in the container's directory. We confirmed this by creating a new container and checking that the hash of sandbox.vhdx matches the hash of blank.vhdx in the upper layer:
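A comparison like this can be sketched in a few lines of Python. The helper names below are hypothetical and the choice of SHA-256 is ours; any cryptographic hash would serve. The files are read in chunks so that large disk images do not need to fit in memory.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hash to the same SHA-256 value."""
    return sha256_file(path_a) == sha256_file(path_b)
```

For example, same_content could be pointed at the sandbox.vhdx of a freshly created container and the blank.vhdx of its image layer. Once the container has written data, the hashes would be expected to diverge.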
The vhdx specification shows that the parent disk indicator is contained in the metadata of the disk. Running strings against the disk confirms that sandbox.vhdx's parent is set to blank-base.vhdx from the parent image:
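Where the strings utility is not available, a similar check can be approximated in Python. The sketch below does not parse the vhdx metadata region; it simply extracts runs of printable UTF-16LE characters from the start of the file and keeps those ending in ".vhdx". The function names and the 8 MiB scan window are assumptions chosen for illustration, so the window may need to be enlarged for some disks.

```python
import re


def utf16le_strings(data, min_chars=6):
    """Return runs of printable ASCII characters stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def candidate_parent_paths(vhdx_path, scan_bytes=8 * 1024 * 1024):
    """List UTF-16LE strings near the start of a file that end in '.vhdx'.

    This is a heuristic search, not a parser for the vhdx parent locator.
    """
    with open(vhdx_path, "rb") as handle:
        data = handle.read(scan_bytes)
    return [s for s in utf16le_strings(data) if s.lower().endswith(".vhdx")]
```

Any paths returned are only candidates and should be confirmed against the vhdx specification or a proper vhdx parser.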
Adding support to Docker Explorer for mounting these container filesystems requires a way to mount differencing vhdx files. Although this is trivial on Windows using built-in tools, at the time of writing it was unsupported on Linux.
The short-term workaround was a Python script that merges the two vhdx files and outputs a raw image that can be mounted. A standalone tool, merge_vhdx.py, has been added to the docker-explorer GitHub repository and is invoked by docker-explorer to output a raw container image:
So far we've figured out how Windows containers use differencing vhdx files to manage the writable container layer, with the sandbox.vhdx->blank-base.vhdx relationship serving primarily to reduce disk usage rather than to manage container layers.
There is still something missing from our understanding, namely how the filesystem layer relationship works. The previous section hinted at this with the “unsupported reparse point” message. Viewing a few more files in our mounted image shows more of these reparse points:
These appear to be files in the parent layer not modified within our container. Testing this with an unmodified file:
Then for a container where the same file has been modified, it is now present without a reparse point:
Reviewing the MFT reparse attribute (192-3) for this file (inode 353):
According to Microsoft documentation this is a REPARSE_DATA_BUFFER data element, of which the first four bytes are the reparse tag, here 0x80000018 (accounting for endianness). This tag corresponds to IO_REPARSE_TAG_WCI: "Used by the Windows Container Isolation filter. Server-side interpretation only, not meaningful over the wire."
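The header decoding can be sketched in Python as follows. Per the REPARSE_DATA_BUFFER definition, the attribute starts with a 4-byte little-endian ReparseTag, a 2-byte ReparseDataLength and a 2-byte Reserved field, followed by the tag-specific data. The function name is hypothetical, and the input is assumed to be the raw content of the reparse attribute as extracted by a forensic tool.

```python
import struct

IO_REPARSE_TAG_WCI = 0x80000018


def parse_reparse_header(attribute_data):
    """Split raw reparse attribute content into (tag, tag_specific_data)."""
    if len(attribute_data) < 8:
        raise ValueError("reparse attribute is shorter than its 8-byte header")
    # ReparseTag (uint32), ReparseDataLength (uint16), Reserved (uint16),
    # all little-endian.
    tag, data_length, _reserved = struct.unpack_from("<IHH", attribute_data, 0)
    return tag, attribute_data[8:8 + data_length]
```

Because the tag is stored little-endian, the on-disk bytes 18 00 00 80 decode to 0x80000018, which can then be compared against IO_REPARSE_TAG_WCI.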
While the data layout for IO_REPARSE_TAG_WCI appears to be undocumented by Microsoft, Ladislav Zezula has provided a definition in the FileTest tool. Using this definition for our attribute we end up with:
WciName obviously corresponds to the file in the parent image and LookupGuid is an identifier for the next layer which in this case is expected to be ebf46384a2e8[...]. After further review of the hcsshim project, it appears that this GUID is generated by calling vmcompute!nametoguid with the layer name as an argument.
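To make the layout concrete, here is a minimal Python sketch that decodes the tag-specific data following the 8-byte REPARSE_DATA_BUFFER header. The field order and sizes are an assumption based on our reading of the FileTest definition (a 4-byte Version, a 4-byte Reserved field, a 16-byte LookupGuid, a 2-byte WciNameLength in bytes, then the UTF-16LE WciName), and the function name is hypothetical. As the structure is not documented by Microsoft, treat the result as a best-effort interpretation.

```python
import struct
import uuid


def parse_wci_reparse_data(data):
    """Decode WCI reparse data (the bytes after the 8-byte reparse header).

    Assumed layout, following the FileTest definition:
      offset 0   uint32  Version
      offset 4   uint32  Reserved
      offset 8   GUID    LookupGuid (Windows mixed-endian byte order)
      offset 24  uint16  WciNameLength, in bytes
      offset 26  WCHAR[] WciName, UTF-16LE, not NUL-terminated
    """
    if len(data) < 26:
        raise ValueError("WCI reparse data is shorter than its fixed fields")
    version, _reserved = struct.unpack_from("<II", data, 0)
    lookup_guid = uuid.UUID(bytes_le=bytes(data[8:24]))
    (name_length,) = struct.unpack_from("<H", data, 24)
    wci_name = bytes(data[26:26 + name_length]).decode("utf-16-le")
    return {"version": version, "lookup_guid": lookup_guid, "wci_name": wci_name}
```

The returned lookup_guid can then be compared against the GUIDs derived for known layer names to work out which parent layer a placeholder file points to.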
Windows containers are more complex than their Linux counterparts in order to provide features expected by Windows applications. Rather than a simple union file system they use a combination of virtual block devices (vhdx files) and NTFS reparse points to manage container layers.
Currently Docker Explorer can mount the virtual block device but does not properly handle the NTFS reparse points. This will still allow the writable layer of a container to be mounted which is likely to be of most interest during an investigation, and then any unmodified files can be found in the parent image manually.
dfir.ru blog: "Containerized registry hives in Windows" https://dfir.ru/2020/08/15/containerized-registry-hives-in-windows/
DockerCon '16 Presentation: "Windows Server & Docker - The Internals Behind Bringing Docker & Containers to Windows" https://www.youtube.com/watch?v=85nCF5S8Qok
GitHub: Docker Explorer Tool https://github.com/google/docker-explorer
GitHub: FileTest - IO_REPARSE_TAG_WCI definition https://github.com/ladislav-zezula/FileTest/blob/master/WinSDK.h#L658
GitHub: Microsoft hcsshim Project https://github.com/microsoft/hcsshim
GitHub: Moby Project Windows Graph Driver https://github.com/moby/moby/blob/master/daemon/graphdriver/windows/windows.go
Microsoft: Containers on Windows documentation https://aka.ms/containers
Microsoft: Get started - Prep Windows for containers https://docs.microsoft.com/en-us/virtualization/windowscontainers/quick-start/set-up-environment?tabs=Windows-Server
Microsoft: REPARSE_DATA_BUFFER definition https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c3a420cb-8a72-4adf-87e8-eee95379d78f
Microsoft: Technet Article - "Introducing the Host Compute Service (HCS)" https://techcommunity.microsoft.com/t5/containers/introducing-the-host-compute-service-hcs/ba-p/382332