Sunday, June 20, 2021

What is @system-service?

Because it's subject to change, systemd does not officially publish what makes up "@system-service".  However, as of June 20th, 2021, the allowed system calls are defined in src/shared/seccomp-util.c in the systemd git repository.  Along with a number of individual system calls, @system-service (SYSCALL_FILTER_SET_SYSTEM_SERVICE) currently includes the following groups:

@aio, @basic-io, @chown, @default, @file-system, @io-event, @ipc, @keyring, @memlock, @network-io, @process, @resources, @setuid, @signal, @sync, and @timer.

It may be more interesting to consider what it doesn't include:

@clock, @cpu-emulation, @debug, @module, @mount, @obsolete, @pkey, @privileged, @raw-io, @reboot, and @swap.

This is not a list of what it "excludes," because system calls may appear on multiple lists; "chown" is permitted by @chown, and the non-inclusion of @privileged doesn't revoke that permission.
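
To make the distinction concrete, here is a minimal sketch of how set membership resolves.  It is a toy model: the set contents are abbreviated and illustrative, not the real definitions from seccomp-util.c, and the expand() helper is my own name, not anything systemd provides.

```python
# Toy model of how set membership resolves. The contents are abbreviated and
# illustrative, not the real definitions from seccomp-util.c.
SETS = {
    "@chown": {"chown", "fchown", "lchown"},
    "@privileged": {"@chown", "acct"},
    "@system-service": {"@chown", "read", "write"},
}


def expand(name, sets=SETS):
    """Recursively expand a set name into plain system call names."""
    calls = set()
    for entry in sets[name]:
        if entry.startswith("@"):
            calls |= expand(entry, sets)  # nested set reference
        else:
            calls.add(entry)
    return calls


# Allow-listing @system-service alone: chown is allowed via @chown, even
# though @privileged (which also contains it) was never mentioned.
allowed = expand("@system-service")
print("chown" in allowed)  # True
print("acct" in allowed)   # False

# Explicitly denying @privileged is different: it removes every call in that
# set, including the ones that arrived through @chown.
restricted = allowed - expand("@privileged")
print("chown" in restricted)  # False
```

In other words, leaving a set out of the allow list does nothing to calls that another set already brought in; only an explicit deny takes them away.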

Some further observations:

  • @aio includes io_uring.
  • @raw-io covers port-based I/O and configuration of PCI MMIO.
  • @basic-io is read/write and related calls; @file-system is a much larger group that covers access, stat, chdir, open, and so on.
  • @io-event is evented I/O, that is, select/poll/epoll.

Essentially, @system-service is meant to permit a fairly wide range of operations.  This makes it easier to adopt (it is less likely to break things), at the cost of potentially leaving the gates open wider than necessary.

As for my own systems, I may start revoking @privileged and @memlock from daemons that run as a non-root user from the start.  Other than that, this exercise didn't turn out to be very informative.
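
A drop-in for that might look roughly like the following.  This is a sketch, not something I have deployed: the file path, the unit name, and the "example" user are all hypothetical.  It relies on the documented SystemCallFilter= behavior where the first line sets up an allow list and a later line prefixed with ~ removes calls from it.

```ini
# /etc/systemd/system/example.service.d/hardening.conf  (hypothetical path)
[Service]
# Hypothetical unprivileged account; the daemon never runs as root.
User=example
# Start from the broad allow list...
SystemCallFilter=@system-service
# ...then take @privileged and @memlock back out of it.
SystemCallFilter=~@privileged @memlock
```

Note that, as far as I can tell, explicitly denying @privileged also removes the calls it shares with other sets (chown, for instance), so this only suits daemons that truly never need them.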

Update on 2021-06-25: The command systemd-analyze syscall-filter shows all of the built-in sets; systemd-analyze syscall-filter @privileged shows the contents of a specific set (in this example, the "@privileged" list of calls).
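
If you want to work with that output programmatically, a small parser is enough.  The sketch below assumes the layout I see on my system: an unindented set name, then indented lines holding a "#" description and one entry each.  The function names and the sample text are mine, and the format could change between systemd versions.

```python
import subprocess


def parse_syscall_filter(text):
    """Parse `systemd-analyze syscall-filter` output into {set_name: [entries]}."""
    sets = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and "#" description/comment lines
        if not line[0].isspace():
            current = stripped  # unindented line starts a new set
            sets[current] = []
        elif current is not None:
            sets[current].append(stripped)  # indented line is a set entry
    return sets


def load_sets(*names):
    """Run systemd-analyze (requires systemd on the host) and parse the result."""
    result = subprocess.run(
        ["systemd-analyze", "syscall-filter", *names],
        capture_output=True, text=True, check=True,
    )
    return parse_syscall_filter(result.stdout)


# Abbreviated sample in the assumed output format, so the parser can be tried
# without systemd installed.
SAMPLE = """\
@swap
    # Enable/disable swap devices
    swapoff
    swapon
"""

print(parse_syscall_filter(SAMPLE))
```

On a machine with systemd, load_sets("@privileged") should return the entries of that one set, nested set names such as @chown included.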