Intro
The Sysbox container runtime improves two key aspects of containers:
It enhances container isolation (e.g., by using the Linux user-namespace).
It enables containers to act as “virtual-machine-like” environments capable of seamlessly running
software such as systemd, Docker, and Kubernetes. We call these system containers,
and they provide an alternative to virtual machines (VMs) in many scenarios.
One question that often arises is about Sysbox's performance: does this extra
isolation add overhead, and how do system containers compare to virtual machines
(VMs)?
This blog post sheds light on this by comparing the performance of workloads in 4 scenarios:
1) A container on bare metal.
2) A system container on bare metal.
3) A container inside a system container.
4) A container inside a virtual machine (VM).
The figure below illustrates each of these:
The comparison between (1) and (2) yields the overhead of a system container vs a regular Docker container.
The comparison between (1) and (3) yields the overhead of running a container inside a system container (i.e., nested).
The comparison between (1) and (4) yields the overhead of running a container inside a VM.
And the comparison between (3) and (4) yields the overhead of system containers vs. VMs.
As you’ll see, system containers deployed with Sysbox have minimal overhead compared
to regular Docker containers, but are much more efficient than VMs.
This means that using Sysbox to enhance a container’s isolation comes with no
performance penalty, while using Sysbox to deploy containers that replace VMs
yields much better use of the underlying hardware and can reduce infrastructure
costs significantly.
Contents
Performance Test Setup
Workload Performance
Memory Utilization
Storage Utilization
Provisioning Time
Server Consolidation
Conclusion
Performance Test Setup
Before diving into the results, let’s describe the setup we used to obtain them.
The performance test was conducted on a physical server with the following characteristics:
Processor Type: Dual Socket Intel E5-2630 v3 (Intel Haswell)
Number of CPU threads: 32
CPU frequency: 1.2 GHz
Memory: 128 GB
Storage: 1 TB (SSD)
OS: Ubuntu Linux 20.04 (Focal)
To obtain the results, we launched several test runs. In each test run, we launched
a number of “instances” in parallel, where the instances are all of the same
type and correspond to one of the 4 scenarios shown in the prior section.
For example, in one test run we launched several instances of regular containers on
bare metal, each of which runs a benchmark. In another test run we repeated this
but using system containers. And so on.
The regular containers are launched with Docker. The system containers are
launched with Docker + Sysbox. And the VMs are launched with Vagrant + libvirt +
KVM + QEMU.
To ensure an apples-to-apples comparison, the VM instances are Ubuntu Linux
20.04 (Focal) and the system container instances are also based on a Ubuntu
Focal Docker image.
Each instance is assigned 2 CPUs and 2GB of RAM. In the case of the VM, this is
done by configuring the VM’s virtual hardware. In the case of the container
instances, it’s done via the docker run --cpus=2 -m 2G options.
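For reference, here’s a minimal sketch of how each type of instance can be launched. The image names are illustrative (not necessarily the exact ones used in the test), the Sysbox runtime is assumed to be registered with Docker as sysbox-runc, and the VM’s CPU/RAM are set in the Vagrantfile rather than on the command line:

```bash
# (1) Regular container on bare metal, limited to 2 CPUs and 2GB of RAM
docker run -d --cpus=2 -m 2G ubuntu:20.04 sleep infinity

# (2) System container on bare metal (Docker + Sysbox); an image bundling
#     systemd and Docker lets us run containers inside it
docker run -d --runtime=sysbox-runc --cpus=2 -m 2G nestybox/ubuntu-focal-systemd-docker

# (3) Container nested inside the system container, via the inner Docker daemon
docker exec <syscont-id> docker run -d ubuntu:20.04 sleep infinity

# (4) VM instance via Vagrant + libvirt + KVM + QEMU
vagrant up --provider=libvirt
```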
The workloads running inside each instance are from the Phoronix Test Suite,
a well-known performance test suite for Linux.
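For those who want to reproduce the runs, here’s a rough sketch of what a benchmark invocation inside an instance looks like. The exact test profiles behind each chart aren’t listed in this post; pts/compress-gzip is shown purely as an illustrative CPU-bound example:

```bash
# Inside an Ubuntu Focal instance: install the Phoronix Test Suite
apt-get update && apt-get install -y phoronix-test-suite

# Run a test profile, e.g., the Gzip compression benchmark (CPU-bound)
phoronix-test-suite benchmark pts/compress-gzip
```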
Workload Performance
The figure below shows the performance of a CPU-bound workload (Gzip). The
Y-axis is the performance metric (throughput), while the X-axis is the number of
instances of a given type (i.e., corresponding to the benchmark scenarios
described above).
As shown, the performance of running a container on bare metal, inside a system
container, or inside a VM is very similar, except when the machine’s CPU is
overcommitted (i.e., when the instances are assigned more CPUs than are actually
available in the system). At that point, the performance degrades quickly, in
particular for VMs (likely due to the overhead of context switching VMs across CPUs).
The next figure shows the performance of a disk IO bound workload (i.e., a workload
that performs many reads and writes of files).
As shown, for disk IO bound workloads, the performance of running a container on
bare metal or inside a system container is very similar. In contrast, running a
container inside a VM is up to 40% slower. This is due to the fact that the VM
carries a full guest OS and emulates hardware in software, meaning that each
access to storage goes through two IO stacks: one in the guest OS and one in the
host OS.
The next figure shows the performance of a network IO workload (using the well-known
“iperf” benchmark). These results were obtained by configuring a number of
instances as iperf servers and directing traffic to them from a similar number
of iperf clients.
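A minimal version of this setup looks like the following (a sketch using iperf3; the post doesn’t specify the exact iperf version or flags, and <instance-ip> is a placeholder):

```bash
# Inside each instance: start an iperf server
iperf3 -s

# From a client machine: drive traffic at the instance for 60 seconds
iperf3 -c <instance-ip> -t 60
```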
As you can see, the performance of a container or system container on bare metal
is the same. However, running a container inside a system container does incur
a performance hit of 17% (as the inner container’s packets must now go through
an extra Docker bridge).
But notice that if we run that same container in a VM, the performance
hit is an additional 65%. This is due to the fact that the packets of the
container inside the VM must go through the Docker bridge inside the VM, the
guest OS, the virtualized NIC, and the host OS network stack.
Memory Utilization
The figure below shows a comparison of the host machine’s memory utilization:
As you can see, memory utilization for running containers on bare-metal
or inside a system container is quite low.
However, for VMs it’s much higher. The reason is that when processes inside the
VM allocate and free memory, this memory is allocated at the host level but
never freed. This is because the hypervisor knows when the guest OS is
allocating pages, but does not know when the guest OS has freed them. As a
result, you end up with a situation where memory is allocated at host level even
though it’s not necessarily in use by the VM’s guest OS.
Some hypervisors are capable of automatically reclaiming unused guest OS memory
via “memory ballooning” inside the VM’s guest OS. But this technique is
imperfect and can be computationally expensive. As a result, it’s generally
recommended that host machines running VMs have enough physical RAM to satisfy
the combined total of all memory assigned to the VMs. For example, if you have a
host machine where you plan to run 16 VMs each with 8GB of RAM, you generally
need 128GB of RAM.
This inefficiency does not exist when running workloads in regular containers or
system containers. As processes inside the containers allocate and later free
memory, that memory is immediately available to other processes on the
host. This means you use memory more efficiently and thus can achieve the same
performance with less memory on the host machine. For example, as the diagram
shows, running 16 system container instances with this particular workload
consumes only 2GB of RAM, while using VMs consumes 32GB of RAM (a 16x
increase). Depending on the workload this savings factor changes, but it’s clear
that containers are much more efficient in this regard.
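If you want to eyeball this on your own host, memory utilization can be sampled with standard tools (a simple approach; the exact metric behind the chart above isn’t specified here):

```bash
# Host-wide memory usage; the "used" column excludes reclaimable buffers/caches
free -h

# Per-container memory usage for the container-based scenarios
docker stats --no-stream
```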
Storage Utilization
The figure below shows a comparison of the host machine’s storage utilization.
As you can see, the storage overhead for containers is very low, but it’s
significantly higher for VMs.
This is because the copy-on-write (COW) technique used by Docker containers is
much more efficient than the COW used by VMs on their virtual disks. In this test,
each VM instance adds ~650MB of storage overhead, even though all of them are
based on the same “master” image. In contrast, containers add only a few MBs per
instance.
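The per-instance storage overhead can be inspected with standard tooling (a sketch; the libvirt pool path shown is the default and may differ on your setup):

```bash
# Space consumed by container images and each container's writable layer
docker system df -v

# Size of each VM's disk image in the default libvirt storage pool
ls -lh /var/lib/libvirt/images/
```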
One caveat: this experiment used the Sysbox Enterprise Edition (Sysbox EE)
to deploy the system containers. Sysbox EE has an optimization that makes it much
more efficient than the Sysbox Community Edition (Sysbox CE)
when running Docker containers inside a system container. If we had used Sysbox
CE, the storage overhead for scenario (3) (the gray bar) would have risen to
around 40% of that of the VMs. This is not shown in the diagram.
Provisioning Time
The next figure compares the time it takes to provision VMs versus system
containers. That is, how long it takes for each instance to be up and running
and ready to do work.
As you can see, the system containers provision 10x faster than VMs. This is
because VMs carry the overhead of instantiating the VM and booting the guest OS,
while emulating hardware in software. In contrast, with system containers the
setup is much simpler: create the container’s abstraction (namespaces, chroot,
cgroups) and start the container init process.
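A rough way to approximate this measurement is to simply time each launch (note that docker run -d returns once the container has started, while vagrant up waits until the guest is SSH-reachable, so a careful comparison would wait for an equivalent readiness check in both cases; the image name is illustrative):

```bash
# System container: time from launch until the container has started
time docker run -d --runtime=sysbox-runc nestybox/ubuntu-focal-systemd-docker

# VM: time from launch until the guest has booted and is reachable
time vagrant up --provider=libvirt
```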
This means system containers give developers a much more agile way to create
virtual host environments compared to VMs, not to mention the superior
portability since they are lighter weight and are not tied to a specific
hypervisor, but rather to Linux.
Server Consolidation
Our performance analysis indicates that for IO-intensive workloads, it’s
possible to run twice as many system containers per host as VMs while getting
the same performance. For CPU-intensive workloads, it’s possible to run at
least 30% more system containers per host than VMs.
And in both cases, system containers incur only a fraction of the memory and
storage overhead of VMs (as shown in the prior sections).
This means that replacing VMs with system containers as a way of deploying
isolated OS environments can reduce your hardware costs by up to 50%. For
example, if you needed 10 servers to run the VMs, you only need 5 or 6 servers
when using system containers.