Saturday, December 17, 2022

Container Memory Usage

How efficient is it to run multiple containers with the same code, serving different data?  I am most familiar with a “shared virtual hosting” setup, with a few applications behind a single Web frontend and PHP-FPM pool.  How much would I lose, trying to run each app in its own container?

To come up with a first-order approximation of this, I made a pair of minimal static sites and used the nginx:mainline-alpine image (ID 1e415454686a) to serve them.  The overall question was whether the image's layers would be shared between multiple containers in the Linux memory system, or whether each container would end up with its own copy of everything.
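Concretely, each site got its own container built on that image, along these lines (container names, ports, and host paths here are illustrative, not the exact values used):

# Illustrative only; the Docker tests take the same arguments with
# "docker run".  /usr/share/nginx/html is the image's document root.
podman run -d --name site1 -p 8081:80 \
    -v /srv/site1:/usr/share/nginx/html:ro \
    nginx:mainline-alpine
podman run -d --name site2 -p 8082:80 \
    -v /srv/site2:/usr/share/nginx/html:ro \
    nginx:mainline-alpine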

Updated 2022-12-19: This post has been substantially rewritten and expanded, because the numbers did not cleanly reproduce on Ubuntu, which required deeper investigation.

Test Setups

The original test ran on Debian 11 (Bullseye) using Podman, rootless Podman, the “docker.io” package, and uncontained nginx-mainline.  The first three are from the Debian 11 package repository.  Because the VM guests were installed on separate days, the Podman VM had kernel 5.10.0-19, while the Docker and uncontained VMs had kernel 5.10.0-20.  The Debian VMs were configured with 2 CPUs, 1024 MiB of RAM, and a 1024 MiB swap partition (which went unused).

The Ubuntu test ran on Ubuntu 22.10 (Kinetic) with Podman, rootless Podman, and “docker.io” only; the uncontained test was not reproduced.  Ubuntu also used the fuse-overlayfs package, which was not installed on Debian, so rootless Podman shows different sharing behavior in the lsof test.

The following versions were noted on the Ubuntu installations:

docker.io:  docker.io 20.10.16-0ubuntu1, containerd 1.6.4-0ubuntu1.1, runc 1.1.2-0ubuntu1.1
podman:     podman 3.4.4+ds1-1ubuntu1, fuse-overlayfs 1.9-1, golang-github-containernetworking-plugin-dnsname 1.3.1+ds1-2, slirp4netns 1.2.0-1

In an attempt to increase measurement stability, the ssh, cron, and unattended-upgrades services were all stopped and disabled on Ubuntu.
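In practice that is just a couple of systemctl commands (assuming the stock Ubuntu unit names):

systemctl stop ssh.service cron.service unattended-upgrades.service
systemctl disable ssh.service cron.service unattended-upgrades.service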

Procedures

The test cycle involved cold-booting the appropriate VM, logging into the console, checking free, and starting two prepared containers.  (The containers had previously been created with a bind mount from the host to the container’s document root.)  I accessed the Web pages using links to be sure that they were working properly, and then alternately checked free and stopped each container.  I included a sleep 1 between stopping a container and checking the memory, to give it a chance to exit fully.
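In shell terms, one cycle looked roughly like this (container names and the URL are illustrative; the Docker runs substitute docker for podman):

free -m                         # baseline right after the cold boot
podman start site1 site2        # containers were created in an earlier session
links http://localhost:8081/    # spot-check each site in the text browser
free -m                         # both containers running
podman stop site1; sleep 1
free -m                         # one container left
podman stop site2; sleep 1
free -m                         # back to idle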

On a separate run from the one used for the memory numbers, I also used lsof to investigate what the kernel reported as open files for the containers.  In particular, lsof provides a “NODE” column with the file’s inode number.  If the inode numbers differ between containers for the same file in the image, then the containers are not sharing that file.
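The check looked roughly like this (the pgrep filter is my reconstruction; any way of gathering the containers' nginx PIDs would do):

# List open files for every nginx process; the NODE column is the inode.
# Matching inodes across the two containers mean the same underlying
# files are in use, and can therefore be shared in memory.
lsof -p "$(pgrep -d, nginx)"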

The uncontained test is similar: boot, log in on the console, check RAM, start nginx.service, access the pages, check the memory, stop nginx.service, and check the memory again.  The lsof research does not apply, since there is only a single nginx instance to examine.

Due to the memory instability observed in the first round of Ubuntu testing, the tests were repeated with ps_mem restricted to the PIDs associated with the containers, to get a clearer view of the RAM usage of the containers themselves.

Finally, a separate round of tests was done with ps_mem again, to get the breakdown by process with both containers running.
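Roughly, the ps_mem measurements looked like the following (the pgrep pattern is an assumption; the exact set of processes varies between the Podman and Docker setups):

# ps_mem takes an explicit PID list via -p, so it can be limited to the
# container-related processes instead of measuring the whole system.
sudo ps_mem -p "$(pgrep -d, -f 'conmon|nginx|slirp4netns|rootlessport')"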

Debian Results

Limitation: I used free -m, which is not terribly precise.

The Podman instance boots with 68-69 MiB of RAM in use, while the Docker instance takes 122-123 MiB for the same state (no containers running).

Rootless Podman showed different inode numbers in lsof, and consumed the most memory per container: shutting things down dropped the used memory from 119 to 96 to 72 MiB.  Those are drops of 23 and 24 MiB.

Podman in its default (rootful) mode shows the same inode numbers, and consumes the least memory per container: the shutdown sequence went from 77 to 75 to 73 MiB, dropping 2 MiB each time.

Docker also shows the same inode numbers when running, but falls in between on memory per container: shutdown went from 152 to 140 to 129 MiB, which are drops of 12 and 11 MiB.

In the uncontained test, for reference, the changes were near the limit of what free -m could resolve.  On the final run, free -m reported 68 MiB used after booting, 70 MiB while nginx was running, and 67 MiB after nginx was stopped.  This is reasonable, since the nginx instance shares the host’s dynamic libraries, especially glibc.

Ubuntu Results

In the interest of being transparent about the quality of the methodology, the discredited free numbers are reported here anyway.

Rootless Podman

        (KiB) used    free     shared
boot          184360  1501076  1264
1 container   183144  1494032  1336
2 containers  205780  1470908  1400
1 container   177096  1493032  1356
final         190840  1479180  1312

Note that overall memory usage goes “down” after starting the first container, and “up” when stopping the second container.

The ps_mem results for slirp4netns, containers-rootlessport(-child), conmon, and the nginx processes:

2 containers  64.3 MiB RAM
1 container   46.0 MiB

Matching the Debian results, rootless Podman adds significant memory overhead in this test: 18.3 MiB for the second container, or 39.8% of the single-container figure.
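For clarity, the overhead figure is the second container's cost relative to the single-container footprint; the same formula produces the 37.9% and 31.6% figures in the later sections:

64.3 MiB - 46.0 MiB = 18.3 MiB
18.3 / 46.0 ≈ 39.8%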

With fuse-overlayfs installed, lsof showed the same inode numbers being used between the two containers, but on different devices.  Previously, on Debian, they appeared on the actual SSD device, but with different inode numbers.  The “same inodes, different device” pattern matches the results when running the containers in rootful mode on Ubuntu.  I did not pay attention to the device numbers in rootful mode on Debian.

The two-container breakdown (note again, this is a separate boot from the previous report, so it does not match the 64.3 MiB total shown above):

Private  Shared   Sum      Processes
708.0 K  434.0 K  1.1 M    conmon
664.0 K  561.0 K  1.2 M    slirp4netns
2.5 M    5.4 M    7.9 M    nginx
15.4 M   12.2 M   27.6 M   podman
15.3 M   12.6 M   27.9 M   exe
                  65.6 MiB total

“podman” corresponds to containers-rootlessport-child in the output of ps, and “exe” is containers-rootlessport.

Rootful Podman

        (KiB) used    free     shared
boot          174976  1503404  1268
1 container   183880  1478800  1352
2 containers  194252  1467796  1420
1 container   164008  1497780  1372
final         184480  1477396  1324

Here, the measurement problem was even more dramatic.  Memory usage plummeted to “lower than freshly booted” levels after stopping one container, then bounced back up after stopping the second container.  Neither of these fits expectations.

Rootful Podman needs only the conmon and nginx processes, which leads to the following ps_mem result:

2 containers  9.1 MiB RAM
1 container   6.6 MiB

The overhead remains high at 37.9%, but it is only 2.5 MiB due to the much lower starting point.

Here’s the breakdown with both containers running:

Private  Shared   Sum      Processes
708.0 K  519.0 K  1.2 M    conmon
2.6 M    5.4 M    8.0 M    nginx
                  9.2 MiB total

Without the containers-rootlessport infrastructure, memory usage is vastly lower.

Docker

        (KiB) used    free     shared
boot          192088  1430660  1104
1 container   213896  1355964  1192
2 containers  246264  1322052  1280
1 container   245940  1322052  1192
final         194276  1373788  1104

Calculating the deltas would suggest 21.3 MiB and 31.6 MiB to start the containers, but then 0.32 MiB and 50.4 MiB released when shutting them down.
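Spelled out from the “used” column:

213896 - 192088 = 21808 KiB ≈ 21.3 MiB   (starting the first container)
246264 - 213896 = 32368 KiB ≈ 31.6 MiB   (starting the second container)
246264 - 245940 =   324 KiB ≈  0.32 MiB  (stopping the first container)
245940 - 194276 = 51664 KiB ≈ 50.4 MiB   (stopping the second container)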

Testing with ps_mem across all the container-related processes (docker-proxy, containerd-shim-runc-v2, and the nginx main+worker processes), I got the following:

2 containers 25.4 MiB RAM
1 container  19.3 MiB

That suggests that the second container added 31.6% overhead (6.1 MiB) to start up.

The breakdown for 2-container mode:

Private Shared Sum     Processes
2.8 M   1.7 M   4.4 M  docker-proxy
2.8 M   5.3 M   8.0 M  nginx
5.0 M   7.9 M  12.9 M  containerd-shim-runc-v2
               25.4 MiB total

We see that containerd-shim-runc-v2 is taking just over half of the memory here.  Of the rest, about a third goes to docker-proxy, leaving less than one-third of the total allocation for the nginx processes inside the containers.
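As fractions of the 25.4 MiB total:

12.9 / 25.4 ≈ 51%   containerd-shim-runc-v2
 4.4 / 25.4 ≈ 17%   docker-proxy
 8.0 / 25.4 ≈ 31%   nginx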

Uncontained

I only collected stats for ps_mem this time:

Private Shared Sum    Processes
1.4 M   1.6 M  3.0 M  nginx

This configuration is two document roots served by one nginx setup, rather than two separate nginx setups, so there is even less isolation than simply running uncontained.  However, it represents a lower bound on what memory usage could possibly be.
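For reference, that shared setup amounts to one nginx instance with two server blocks, along the lines of the following sketch (server names and paths are assumptions, not the actual configuration):

# Minimal sketch of "one nginx, two document roots"; not the real config.
server {
    listen 80;
    server_name site1.example;
    root /srv/site1;
}
server {
    listen 80;
    server_name site2.example;
    root /srv/site2;
}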

Conclusions

Running Podman rootless costs quite a bit of memory, but running it in rootful mode beats Docker's consumption.

Both container managers can share the common base layers in memory between running containers, but Podman may require fuse-overlayfs to do so when running rootless.

For every answer, another question follows.  It’s not that the project is finished; I simply quit working on it.
