Container image formats under the hood
Agata Krajewska
November 18, 2020
Over the last few years, following Docker's release, containers have increasingly become the standard mechanism for software delivery.
We see a growing number of container-based solutions, and while innovation in the space is obviously welcome, there is a need to establish certain standards around format and runtime.
Because of the rapid growth of the Docker project, Docker images became a de facto standard for many purposes, but there is undoubtedly widespread interest in a single, open container specification that is:
not bound to higher-level constructs such as a particular client or orchestration stack,
not tightly associated with any particular commercial vendor or project,
portable across a wide variety of operating systems, hardware, CPU architectures, public clouds, etc.
OCI (Open Container Initiative) is a Linux Foundation project to design open standards for operating-system-level virtualization, most importantly Linux containers.
Various open-source build tools support the OCI image format now, including:
BuildKit: an optimized rewrite of Docker's build engine
Podman: an alternative implementation of Docker's command-line tool
Buildah: a command-line alternative to writing Dockerfiles
Snyk supports all of the common container image formats, and because of that, we can integrate with a wide range of tools from across the ecosystem. In this blog post, we'll look at some of these, and show how Snyk works with them!
Docker archive
First, let's take a look at the Docker archive format. This is part of the deprecated v1 of the Docker image specification, and you can find the full specification here. When you run docker save, this is the default format that the Docker client will output, which is kept for backward compatibility with other tools. Container registries use a newer format to store images.
If you're running Docker locally and using the Snyk CLI or any of our container integrations, we will use your local Docker instance to pull & save the image to the filesystem and start the analysis.
Let's explore what that archive looks like, and what in there is interesting for Snyk!
I'll be using the ubuntu:bionic image in this post. First, let's save the archive to the local filesystem:
docker save --output ubuntu-docker.tar ubuntu:bionic
The bionic image has three layers, which we'll take a closer look at.
After we've unpacked the tar archive, we can inspect what's inside:
[ubuntu-docker] ll
total 24
drwxr-xr-x 5 agatakrajewska staff 160B 25 Sep 23:33 46076a325f0de3f745254638b8b0f0de343685b34e7ca6ec5cd0b6b7930eb7fa
drwxr-xr-x 5 agatakrajewska staff 160B 25 Sep 23:33 468327b5cd7ce539db695bd0ef05dae8a4ff77b02870a8e823ed74dedad4bd55
-rw-r--r-- 1 agatakrajewska staff 3.3K 25 Sep 23:33 56def654ec22f857f480cdcc640c474e2f84d4be2e549a9d16eaba3f397596e9.json
drwxr-xr-x 5 agatakrajewska staff 160B 25 Sep 23:33 8bf067b107a6f7444876e33c6ed85652355f679ac98ebab97ab3ebad63f0dff3
-rw-r--r-- 1 agatakrajewska staff 356B 1 Jan 1970 manifest.json
-rw-r--r-- 1 agatakrajewska staff 89B 1 Jan 1970 repositories
Let's have a look at manifest.json:
[ubuntu-docker] cat manifest.json | jq
[
{
"Config": "56def654ec22f857f480cdcc640c474e2f84d4be2e549a9d16eaba3f397596e9.json",
"RepoTags": [
"ubuntu:bionic"
],
"Layers": [
"8bf067b107a6f7444876e33c6ed85652355f679ac98ebab97ab3ebad63f0dff3/layer.tar",
"468327b5cd7ce539db695bd0ef05dae8a4ff77b02870a8e823ed74dedad4bd55/layer.tar",
"46076a325f0de3f745254638b8b0f0de343685b34e7ca6ec5cd0b6b7930eb7fa/layer.tar"
]
}
]
The file points us at the container config, where we can find useful information such as the architecture, configuration, and root filesystem layers. Also, and probably most interesting from Snyk's point of view, the Layers property points us at the actual image layers, which are just directories in the archive. If we change into one of these directories, we can see what it contains:
[8bf067b107a6f7444876e33c6ed85652355f679ac98ebab97ab3ebad63f0dff3] ll
total 128160
-rw-r--r-- 1 agatakrajewska staff 3B 25 Sep 23:33 VERSION
-rw-r--r-- 1 agatakrajewska staff 401B 25 Sep 23:33 json
-rw-r--r-- 1 agatakrajewska staff 63M 25 Sep 23:33 layer.tar
Let's dig a little deeper and unpack layer.tar:
[layer] ll
total 0
drwxr-xr-x 87 agatakrajewska staff 2.7K 21 Sep 18:17 bin
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 boot
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:17 dev
drwxr-xr-x 68 agatakrajewska staff 2.1K 21 Sep 18:17 etc
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 home
drwxr-xr-x 8 agatakrajewska staff 256B 23 May 2017 lib
drwxr-xr-x 3 agatakrajewska staff 96B 21 Sep 18:16 lib64
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 media
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 mnt
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 opt
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 proc
drwx------ 4 agatakrajewska staff 128B 21 Sep 18:17 root
drwxr-xr-x 5 agatakrajewska staff 160B 21 Sep 18:14 run
drwxr-xr-x 68 agatakrajewska staff 2.1K 21 Sep 18:17 sbin
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 srv
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 sys
drwxrwxrwt 2 agatakrajewska staff 64B 21 Sep 18:17 tmp
drwxr-xr-x 10 agatakrajewska staff 320B 21 Sep 18:14 usr
drwxr-xr-x 13 agatakrajewska staff 416B 21 Sep 18:17 var
As you can see above, layer.tar is just a filesystem changeset for the image layer, with all the dependencies and other binaries, depending on the image content. After we extract and analyze the contents of those layers, we can show you the list of vulnerable paths.
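If you'd rather not unpack everything by hand, the walk through the archive above can be sketched in a few lines of Python. This is only an illustration of the archive layout, not a description of how Snyk is implemented. The function name read_docker_archive is made up for this example, and the sketch assumes the archive holds a single image whose layers fit in memory.

```python
import io
import json
import tarfile


def read_docker_archive(archive_path):
    """Return (config_name, layer_paths, files_per_layer) for a `docker save` tarball."""
    with tarfile.open(archive_path) as archive:
        # manifest.json is a JSON array with one entry per saved image.
        manifest = json.load(archive.extractfile("manifest.json"))
        # Assumption: the archive contains exactly one image.
        image = manifest[0]

        files_per_layer = {}
        for layer_path in image["Layers"]:
            # Each layer is a tar archive nested inside the outer one.
            # Reading it fully into memory keeps the sketch simple.
            layer_bytes = archive.extractfile(layer_path).read()
            with tarfile.open(fileobj=io.BytesIO(layer_bytes)) as layer:
                files_per_layer[layer_path] = layer.getnames()

    return image["Config"], image["Layers"], files_per_layer
```

Calling read_docker_archive("ubuntu-docker.tar") on the archive saved earlier should return the config file name, the three layer.tar paths from manifest.json, and the file names found in each layer.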
Now let's test the saved Docker archive with Snyk. We can test a local archive by specifying the docker-archive: prefix:
snyk container test docker-archive:ubuntu-docker.tar
or scan the image from your local Docker repository:
snyk container test ubuntu:bionic
Both will produce exactly the same output:
Testing ubuntu:bionic...
✗ Low severity vulnerability found in tar
Description: Loop with Unreachable Exit Condition ('Infinite Loop')
Info: <https://snyk.io/vuln/SNYK-UBUNTU1804-TAR-312298>
Introduced through: tar@1.29b-2ubuntu0.1, pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2
From: tar@1.29b-2ubuntu0.1
From: pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2 > debconf@1.5.66ubuntu1 > perl/perl-base@5.26.1-6ubuntu0.3 > dpkg@1.19.0.5ubuntu2.3 > tar@1.29b-2ubuntu0.1
✗ Low severity vulnerability found in tar
Description: NULL Pointer Dereference
Info: <https://snyk.io/vuln/SNYK-UBUNTU1804-TAR-559435>
Introduced through: tar@1.29b-2ubuntu0.1, pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2
From: tar@1.29b-2ubuntu0.1
From: pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2 > debconf@1.5.66ubuntu1 > perl/perl-base@5.26.1-6ubuntu0.3 > dpkg@1.19.0.5ubuntu2.3 > tar@1.29b-2ubuntu0.1
# vuln list continues in here
...
Organization: example-org
Package manager: deb
Project name: docker-image|ubuntu
Docker image: ubuntu:bionic
Licenses: enabled
Tested 90 dependencies for known issues, found 31 issues.
Pro tip: use `--file` option to get base image remediation advice.
Example: $ snyk test --docker ubuntu:bionic --file=path/to/Dockerfile
OCI image
In the last few months, we've also added OCI archive scanning to the Snyk CLI.
Let's have a look at the spec now and see what we can find in the archive! I'm going to use ubuntu:bionic for this section as well.
If you'd like to inspect an OCI archive tarball yourself, you can run the following command:
skopeo copy --override-os linux docker://ubuntu:bionic oci-archive:ubuntu.tar
I'm using Skopeo here to save the image. It is a command-line utility that makes performing various operations on images easy.
All we have to do to save an ubuntu image in the OCI format is add the oci-archive: prefix to our tar output.
Notice how I'm also using the --override-os flag. That's because I am on macOS, and the official Docker Hub ubuntu images are only available for Linux. We will also talk about multi-architecture images below.
Once we unpack the tar archive, we can inspect the contents of it:
[ubuntu] ll
total 16
drwxr-xr-x 3 agatakrajewska staff 96B 28 Oct 11:18 blobs
-rw-r--r-- 1 agatakrajewska staff 186B 28 Oct 11:18 index.json
-rw-r--r-- 1 agatakrajewska staff 31B 28 Oct 11:18 oci-layout
We've got an interesting index.json at the root. Let's inspect its contents:
[ubuntu] cat index.json | jq
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:afa93a8ce255ca452ca8c88f4b5c821a466cf0a3e0148a31d0d97dfdb91d9aef",
"size": 658
}
]
}
This is our (optional) higher-level manifest, which points us to specific image manifests. It contains information about a set of images that can span a variety of architectures and operating systems. If you're keen to learn more, here's the full spec.
Let's now inspect our image-specific manifest, which index.json pointed us at, in the blobs/sha256 directory:
[sha256] cat afa93a8ce255ca452ca8c88f4b5c821a466cf0a3e0148a31d0d97dfdb91d9aef | jq
{
"schemaVersion": 2,
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:33a51d09088285451e7a7525d4bd64fc15563264afe5a91ef84a8b3042018899",
"size": 2426
},
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:171857c49d0f5e2ebf623e6cb36a8bcad585ed0c2aa99c87a055df034c1e5848",
"size": 26701612
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:419640447d267f068d2f84a093cb13a56ce77e130877f5b8bdb4294f4a90a84f",
"size": 852
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:61e52f862619ab016d3bcfbd78e5c7aaaa1989b4c295e6dbcacddd2d7b93e1f5",
"size": 162
}
]
}
We're starting to see some interesting contents, like the container config, where we'll again find lots of useful information about the image's architecture, base image layers, config, etc. We also have the image layer info, pointing us to the layers in our ubuntu:bionic image.
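The chain we just followed by hand, from index.json to the image manifest to the layer digests, can be sketched in Python. Treat this as an illustration under stated assumptions: it works on an already unpacked archive directory, it takes the first entry in index.json (fine for a single-image archive like ours), and the helper names blob_path and oci_layers are invented for this example.

```python
import json
from pathlib import Path


def blob_path(layout_dir, digest):
    """Map a digest such as 'sha256:afa9...' to its file under blobs/."""
    algorithm, _, hex_value = digest.partition(":")
    return Path(layout_dir) / "blobs" / algorithm / hex_value


def oci_layers(layout_dir):
    """Follow index.json to the image manifest; return (config_digest, layer_digests)."""
    index = json.loads((Path(layout_dir) / "index.json").read_text())
    # Assumption: a single-image archive, so the first manifest entry is the one we want.
    manifest_digest = index["manifests"][0]["digest"]
    manifest = json.loads(blob_path(layout_dir, manifest_digest).read_text())
    layer_digests = [layer["digest"] for layer in manifest["layers"]]
    return manifest["config"]["digest"], layer_digests
```

For the unpacked ubuntu directory, oci_layers("ubuntu") should return the config digest and the three layer digests listed in the manifest above, and blob_path turns each of them into a file you can open.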
As the listing below shows, the first layer's content is exactly the same as in the Docker archive above.
[171857c49d0f5e2ebf623e6cb36a8bcad585ed0c2aa99c87a055df034c1e5848] ll
total 0
drwxr-xr-x 87 agatakrajewska staff 2.7K 21 Sep 18:17 bin
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 boot
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:17 dev
drwxr-xr-x 68 agatakrajewska staff 2.1K 21 Sep 18:17 etc
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 home
drwxr-xr-x 8 agatakrajewska staff 256B 23 May 2017 lib
drwxr-xr-x 3 agatakrajewska staff 96B 21 Sep 18:16 lib64
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 media
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 mnt
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 opt
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 proc
drwx------ 4 agatakrajewska staff 128B 21 Sep 18:17 root
drwxr-xr-x 5 agatakrajewska staff 160B 21 Sep 18:14 run
drwxr-xr-x 68 agatakrajewska staff 2.1K 21 Sep 18:17 sbin
drwxr-xr-x 2 agatakrajewska staff 64B 21 Sep 18:14 srv
drwxr-xr-x 2 agatakrajewska staff 64B 24 Apr 2018 sys
drwxrwxrwt 2 agatakrajewska staff 64B 21 Sep 18:17 tmp
drwxr-xr-x 10 agatakrajewska staff 320B 21 Sep 18:14 usr
drwxr-xr-x 13 agatakrajewska staff 416B 21 Sep 18:17 var
You can scan your local OCI archive by running:
snyk container test oci-archive:ubuntu.tar
Various platforms & architectures
When we were inspecting both the Docker archive and the OCI image manifests, we noted that the container config they point to carries architecture information. Let's go back and have another look at the container config in the ubuntu-docker directory, where we unpacked our Docker archive:
[ubuntu-docker] cat 56def654ec22f857f480cdcc640c474e2f84d4be2e549a9d16eaba3f397596e9.json | jq
{
...
"architecture": "amd64",
"os": "linux",
...
}
Among other pieces of information, we can see the architecture and OS, which in my case are amd64 and linux. What if our image is an arm64-based image, or we are on the windows platform?
We're in luck: Snyk also supports other platforms when you pass the --platform flag to the Snyk CLI.
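As a small illustration, the os and architecture fields of a parsed container config can be combined into the os/architecture form (for example linux/arm64) that is used with --platform later in this post. The function names below are hypothetical, and the sketch deliberately ignores CPU variants.

```python
def config_platform(config):
    """Build an 'os/architecture' string from a parsed container config."""
    return f"{config['os']}/{config['architecture']}"


def matches_platform(config, wanted):
    """True when the config describes the platform given as 'os/architecture'."""
    return config_platform(config) == wanted
```

With the config excerpt above, config_platform gives linux/amd64, so a check against linux/arm64 would tell us that the saved image is not the variant we are after.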
We've talked about the docker save archive format. Now let's have a look at the format in which images are stored in Docker Hub: the v2 image manifest, which is very interesting under the hood. This is the second version of the v2 schema, and it was created to support two primary goals. According to the Docker docs:
The first is to allow multi-architecture images, through a “fat manifest” which references image manifests for platform-specific versions of an image. The second is to move the Docker engine towards content-addressable images, by supporting an image model where the image’s configuration can be hashed to generate an ID for the image.
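To make "content-addressable" concrete: a blob is identified by the hash of its own bytes, which is why the files under blobs/sha256 in the OCI archive above are named after their digests. The sketch below assumes SHA-256, as the sha256: prefixes in the manifests suggest. The function names are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_digest(data: bytes) -> str:
    """Return the digest of some bytes in the 'sha256:<hex>' form used by manifests."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def blob_is_intact(blob_file) -> bool:
    """Check that a blob under blobs/sha256/ is named after the hash of its own content."""
    path = Path(blob_file)
    return sha256_digest(path.read_bytes()) == "sha256:" + path.name
```

Running blob_is_intact on any file in blobs/sha256 is a quick way to confirm that a blob has not been altered since the manifest that references it was written.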
Using our local Docker, we can first inspect the "fat manifest" for the image we're interested in, to check which platform variants are available for us. As explained above, that means that a single repository can house multiple images for different architectures.
docker manifest inspect ubuntu:bionic
As a result we will see an array of manifests:
{
...
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 943,
"digest": "sha256:45c6f8f1b2fe15adaa72305616d69a6cd641169bc8b16886756919e7c01fa48b",
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 943,
"digest": "sha256:e80b8affb2361dc632c1fa8fcbf6b6514f750eb6ef99b7e7f825a55f849bfd89",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v7"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 943,
"digest": "sha256:01a2038b20d165ab7df81934f9849bdfbc59bd6f6322c5d11e341504f66ec266",
"platform": {
"architecture": "arm64",
"os": "linux",
"variant": "v8"
}
},
...
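Picking the right entry out of a manifest list like this one is a simple lookup. Here is a minimal sketch, assuming the parsed JSON from docker manifest inspect is already available as a dictionary. The function name select_manifest is made up, and real clients apply more elaborate matching rules (for example around variants) than this first-match loop.

```python
def select_manifest(manifest_list, platform):
    """Return the digest of the first manifest matching 'os/arch' or 'os/arch/variant'."""
    parts = platform.split("/")
    wanted_os, wanted_arch = parts[0], parts[1]
    wanted_variant = parts[2] if len(parts) > 2 else None

    for entry in manifest_list["manifests"]:
        entry_platform = entry["platform"]
        if entry_platform["os"] != wanted_os:
            continue
        if entry_platform["architecture"] != wanted_arch:
            continue
        # Only compare the variant when the caller asked for a specific one.
        if wanted_variant is not None and entry_platform.get("variant") != wanted_variant:
            continue
        return entry["digest"]
    return None  # no image published for this platform
```

Given the manifest list above, asking for linux/arm64 would return the digest starting with sha256:01a2038b, while a platform that is not listed yields None.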
Now that we've established that the ubuntu image comes in variants for different platforms, let's see how to scan the ubuntu image for the arm64 architecture.
Make sure you have experimental features enabled in Docker on your host, then run:
snyk container test --platform=linux/arm64 ubuntu:bionic
Let's have a look at the output:
Testing ubuntu:bionic...
✗ Low severity vulnerability found in tar
Description: Loop with Unreachable Exit Condition ('Infinite Loop')
Info: <https://snyk.io/vuln/SNYK-UBUNTU1804-TAR-312298>
Introduced through: tar@1.29b-2ubuntu0.1, pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2
From: tar@1.29b-2ubuntu0.1
From: pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2 > debconf@1.5.66ubuntu1 > perl/perl-base@5.26.1-6ubuntu0.3 > dpkg@1.19.0.5ubuntu2.3 > tar@1.29b-2ubuntu0.1
✗ Low severity vulnerability found in tar
Description: NULL Pointer Dereference
Info: <https://snyk.io/vuln/SNYK-UBUNTU1804-TAR-559435>
Introduced through: tar@1.29b-2ubuntu0.1, pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2
From: tar@1.29b-2ubuntu0.1
From: pam/libpam-runtime@1.1.8-3.6ubuntu2.18.04.2 > debconf@1.5.66ubuntu1 > perl/perl-base@5.26.1-6ubuntu0.3 > dpkg@1.19.0.5ubuntu2.3 > tar@1.29b-2ubuntu0.1
# vuln list continues in here
...
Organization: example-org
Package manager: deb
Project name: docker-image|ubuntu:bionic
Docker image: ubuntu:bionic
Platform: linux/arm64
Licenses: enabled
Tested 90 dependencies for known issues, found 31 issues.
Pro tip: use `--file` option to get base image remediation advice.
Example: $ snyk test --docker ubuntu:bionic --file=path/to/Dockerfile
We can see all the vulnerable paths discovered, and also 'Platform' information in the output of our scan.
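To recap, the invocations used in this post differ only in the target prefix and the optional platform flag. If you script your scans, a tiny helper can assemble the argument list. This is a sketch: the helper name is hypothetical, it only builds the arguments shown in this post, and it does not run the CLI itself.

```python
def snyk_container_test_args(target, archive_format=None, platform=None):
    """Build the argument list for `snyk container test` as used in this post."""
    args = ["snyk", "container", "test"]
    if platform:
        # For example "linux/arm64".
        args.append(f"--platform={platform}")
    if archive_format:
        # "docker-archive" or "oci-archive", prefixed to a local tar file.
        target = f"{archive_format}:{target}"
    args.append(target)
    return args
```

The resulting list can be passed to subprocess.run, for example snyk_container_test_args("ubuntu.tar", archive_format="oci-archive") for the OCI archive saved earlier.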
In this blog post, we've explored the two main container image formats, Docker and OCI, and shown how Snyk can work with both of them, even across multiple architectures.
To wrap up, keep up with security best practices for building optimal Docker images for Node.js and Java applications:
10 Docker Security Best Practices — details security practices that you should follow when building docker base images and when pulling them too, as it also introduces the reader to docker content trust.
Are you a Java developer? You’ll find this resource valuable: Docker for Java developers: 5 things you need to know not to fail your security
10 best practices to containerize Node.js web applications with Docker - If you’re a Node.js developer you are going to love this step by step walkthrough, showing you how to build secure and performant Docker base images for your Node.js applications.
All this functionality is available in Snyk for free — sign up for an account.