Friday, August 27, 2021

Our brief use of systemd's TemporaryFileSystem feature

Late last year, we started having problems where services did not appear to pick up fresh data after a reload.  I have a hypothesis about why.

We have a monolithic server build, where several services end up running under one Apache instance.  I thought it would be nice to make each one think it was the only service running.  Instead of enumerating badness, I switched to using TemporaryFileSystem= to hide all directories, then allowed access to the one directory each service needed.
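As a sketch of the idea (the drop-in name and paths are hypothetical; the real build's layout isn't shown here), a unit drop-in like this mounts an empty tmpfs over the parent directory and binds one service's directory back in:

```
# /etc/systemd/system/apache2.service.d/isolate.conf (hypothetical paths)
[Service]
# Replace the whole parent directory with an empty tmpfs...
TemporaryFileSystem=/srv/www
# ...then expose only the one directory this service needs.
BindReadOnlyPaths=/srv/www/service-one
```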

The problem is, I don’t think this gets reset or updated during a reload.  Deployment renames the working directory aside, then renames an all-new directory into place at the original name.  It’s possible that even though the reload happens, the processes never change their working directory, and are therefore still looking at the original (now renamed) one.

Things that have never failed were failing; in particular, Template Toolkit was still using stale template files.  I did some digging and found that it does have a cache by default, but it only holds entries for up to a second. Furthermore, because I was very strict about not having unexpected global state, we create a new Template object per request, so the cache shouldn't even be shared between requests.

Anyway, systemd does not actually document what happens to TemporaryFileSystem= on reload, so I can’t be sure.  I ended up abandoning TemporaryFileSystem= entirely.  I only have so much time to mess around, and there’s only so much “risk” that would be mitigated by improved isolation. It’s almost guaranteed that the individual applications have defects of their own, and an attacker wouldn’t need a multi-app exploit chain.

(Not in the sense that I know of any specific defects; in the sense that all software has defects, and some proportion of those can be leveraged into a security breach.)

Friday, August 20, 2021

More Fun with Linux I/O Stats

I did not disclose boot times or time spent reading with my previous post. This is because the Linux drive is, in fact, on SATA 2 and not SATA 3.  My motherboard only has a single SATA 3 (6 Gbps) port, and Windows was already on it.  I installed a second drive for Linux, which means that the new SSD went on SATA 2 instead.

However, I did temporarily swap the cables, and take some numbers for Linux booting with SATA 3 instead.  This occurred on Ubuntu Studio 21.04, which is the version that I have installed on the bare metal, rather than inside a virtual machine.

The boot time for Ubuntu Studio from Grub-to-desktop was at most 1 second faster on SATA 3, although Linux did report half the time spent reading in /proc/diskstats.

From this, I have to conclude that the disk is not a limiting factor: whether the disk is spending 5.5 or 11 seconds to return data leads to boot times of 13.6 vs 14.5 seconds.  Cutting the bandwidth in half only raises the overall boot time by 6.6%.

In fact, the last boot before I uninstalled VirtualBox, but on SATA 3, recorded 40914 I/O requests, 1958574 sectors, and 10917 ms waiting for the disk, for a 16.0 second boot.  After merely uninstalling VirtualBox and rebooting, there were 16217 requests (60% fewer), 1256806 sectors (36% fewer), 5698 ms waiting (48% less), and 13.8 seconds wall time (14% less).  For what it’s worth, “used memory” reported just after the /proc/diskstats check dropped from 673 MB to 594 MB (12% less).

In other words, the interface is less relevant than whether VirtualBox is installed.

(The virt-manager + Cinnamon experiment was caused by the removal of VirtualBox, but happened after I/O numbers were finished.  Also undisclosed previously: the drive itself is a Samsung 870 EVO 250 GB, so its performance should be fairly close to the limits of SATA 3.)

The conclusion remains the same: more speed on the disk is not going to help my current PC.  My next PC (whichever gives way first, Windows 10 or the hardware) will support NVMe, but I probably won't use it unless there are a couple of M.2 slots.  Being able to clone a drive has been incredibly useful to me.

Friday, August 13, 2021

Making Sense of PHP HTTP packages

There are, at the moment, several PSRs (PHP Standard Recommendations) involving HTTP:

  1. PSR-7: HTTP messages.  These are the Request, Response, and Uri interfaces.
  2. PSR-17: HTTP message factories.  These are the interfaces for “a thing that creates PSR-7 messages.”
  3. PSR-18: HTTP clients.  These are the interfaces for sending out a PSR-7 request to an HTTP server, and getting a PSR-7 response in return.
  4. PSR-15: HTTP handlers.  These are interfaces for adding “middleware” layers in a server-side framework. The middleware sits “between” application code and the actual Web server, and can process, forward, and/or modify requests and responses.

I will avoid PSR-15 from here on; the inspiration for this post was dealing with the client side of the stack, not the server.

As for the rest, the two ending in 7 are related: PSR-17 is about creating PSR-7 objects, without the caller knowing the specific PSR-7 classes. However, this is a kind of recursive problem: how does the caller know which specific PSR-17 object to use?

There's also some confusion caused by having “Guzzle” potentially refer to multiple packages.  There’s the guzzlehttp/guzzle HTTP client, and there’s a separate guzzlehttp/psr7 HTTP message implementation.  Unrelated packages may use guzzle in their names without being immediately clear about which exact package they work with.

  • php-http/guzzle7-adapter is related to the client.
  • http-interop/http-factory-guzzle provides PSR-17 factories for the Guzzle PSR-7 classes, and has nothing to do with the client.

Additionally, whether these packages are needed has changed somewhat over time.  guzzlehttp/psr7 version 2 has PSR-17 support built in, and guzzlehttp/guzzle version 7 is compatible with PSR-18.  Previous major versions of these packages lacked those features.

Discovery

The problem of finding PSR-compatible objects is solved by an important non-PSR library from the HTTPlug (HTTP plug) project: php-http/discovery (and its documentation.)

  • It lets code ask for a concrete class by interface, and Discovery takes care of finding which class is available, constructing it, and returning it.
  • It includes its own list of PSR-17 factories and PSR-18 clients, and can return those directly, where applicable. When Guzzle 7 is installed, Discovery (of a recent enough version) can return the actual \GuzzleHttp\Client when asked to find a PSR-18 client.
  • It has additional interfaces for defining and finding asynchronous HTTP clients, where code is not required to wait for the response before processing continues.

At its most basic, Discovery can find PSR-17 factories and PSR-18 HTTP clients.  These would be loaded through the Psr17FactoryDiscovery and Psr18ClientDiscovery classes.  For the more advanced features like asynchronous clients, the additional adapter packages are required.

For example, to use Guzzle 7 asynchronously, php-http/guzzle7-adapter is required.  At that point, it can be loaded using HttpAsyncClientDiscovery::find().  This method then returns an adapter class, which implements php-http's asynchronous client interface, and passes the actual work to Guzzle.

In any case, library code itself would only require php-http/discovery in its composer.json file; a project making use of the library chooses the concrete implementations in its own composer.json.

An Important Caveat

Discovery happens at run time.  Since Discovery supports many ways to find a number of packages, it doesn't depend on them all, and it doesn't even have a hard dependency on, say, the PSR-17 interfaces themselves.  This means Composer may install everything without complaint, even though the requirements aren't fully met for all of it to actually be usable.

To be sure the whole thing will actually work in practice, it's important to make some simple, safe HTTP request.  In my case, I use the Mailgun client to fetch recent log events.

When code using Discovery fails, the error message may suggest installing “one of the packages” from a list, and provide a link to Packagist.  That link may even include the name of a package that is installed, so why doesn’t it work? The installed version is probably the culprit.  If guzzlehttp/psr7 version 1 is installed, but not http-interop/http-factory-guzzle, then the error is raised because there is genuinely no PSR-17 implementation available among the installed versions. However, Packagist will show the guzzlehttp/psr7 package as providing the necessary support, because the latest version does, indeed, support PSR-17.

Things Have Changed a Bit Over Time

As noted above, prior to widespread support for PSR-17 and PSR-18, using the php-http adapters was crucial for having a functioning stack.  So was installing http-interop/http-factory-guzzle to get a PSR-17 adapter for the guzzlehttp/psr7 version 1 code.

For code relying only on PSR-17 and PSR-18, and using the specific Discovery classes for those PSRs, the latest Guzzle components should not need any other packages installed to work.

However, things can be different if another library in use forces the older version of guzzlehttp/psr7.  This happens for me: the AWS SDK for PHP specifically depends on guzzlehttp/psr7 version 1, so I need to include http-interop/http-factory-guzzle as well for Mailgun to coexist with it.

One Last Time

If you’re writing a library, use and depend on php-http/discovery.

If you’re writing an application, you must also depend on specific components for Discovery to find, such as guzzlehttp/guzzle.  Depending on how the libraries you are using fit together, you may also need http-interop/http-factory-guzzle for an older guzzlehttp/psr7 version, or a package like php-http/guzzle7-adapter if a simple PSR-18 client isn’t suitable.
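As a concrete illustration (version constraints are illustrative, and you would include only the pieces your situation actually calls for, per the above), an application's composer.json for a Guzzle-based stack might combine:

```
{
    "require": {
        "guzzlehttp/guzzle": "^7.0",
        "http-interop/http-factory-guzzle": "^1.0",
        "php-http/guzzle7-adapter": "^1.0"
    }
}
```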

There are alternatives to Guzzle, but since my projects are frequently AWS-based, and their SDK depends on Guzzle directly, that’s the one I end up having experience with.  I want all of the libraries to share dependencies, where possible.

Friday, August 6, 2021

Cinnamon with Accelerated Graphics in libvirt

While exploring virt-manager and Linux Mint Cinnamon Edition, I kept getting the warning that software rendering was in use (instead of hardware acceleration.) The warning asks to open the Driver Manager, which then says no drivers are needed.

The open-source stack for doing accelerated graphics in this case is SPICE, but virt-manager had configured a different graphics stack by default for the “Ubuntu 20.04” OS setting I had used.

For the guest to use SPICE, it needed the spice-vdagent package to be installed.  After that, the guest was shut down, so that its hardware could be reconfigured.

First, the “video” element needed to be changed from QXL to the virtio driver, which includes a 3D acceleration option.  Once that option’s checkbox was ticked, I clicked Apply to save the configuration.

Next, the “display” needed to be changed to support direct OpenGL rendering. I needed to change the Listen to none, and to tick the OpenGL checkbox. (When I did this in the opposite order, an “invalid settings” message appeared, saying that OpenGL is not supported unless Listen is none.) This revealed an option to select a graphics device, but I only have one, so I left it alone. I then applied the display changes.
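The same reconfiguration shows up in the domain XML (viewable with virsh edit); this is a sketch of roughly what the relevant elements look like afterward, with other attributes left at their defaults:

```
<video>
  <model type="virtio" heads="1" primary="yes">
    <acceleration accel3d="yes"/>
  </model>
</video>
<graphics type="spice">
  <listen type="none"/>
  <gl enable="yes"/>
</graphics>
```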

Finally, I booted the guest again.  Everything seems to have worked, and the warning did not appear.

Wednesday, July 28, 2021

Fun with Linux I/O Stats

A discussion with my wife about random I/O and how I’m not sure NVMe would make a measurable difference to my life got me to wondering, how much I/O does a modern Linux system perform during bootup?

I have Ubuntu Studio 21.04 installed on hardware, and it issues around 18,800 read requests for 1.5 million sectors to boot to the desktop (and to start Alacritty to cat /proc/diskstats.)
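For reference, these numbers come straight out of /proc/diskstats: counting whitespace-separated fields from the start of a device's line, field 4 is reads completed, field 6 is sectors read, and field 7 is milliseconds spent reading (per the kernel's iostats documentation).  A minimal sketch of pulling those out, using a sample line with made-up values rather than the live file:

```shell
# Parse one /proc/diskstats-style line; the values here are illustrative.
line="   8       0 sda 18800 1200 1500000 5500 9000 800 600000 4000 0 0 0 0"
set -- $line
# $1=major $2=minor $3=device, then the read counters:
echo "device=$3 reads=$4 sectors_read=$6 read_ms=$7"
```

Against the real file, the same field positions apply to every device line.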

I then installed Linux Mint 20.2 Cinnamon Edition and Xubuntu 21.04 in virtual machines, and ran similar tests, using their native terminal emulators.  Mint issues around 19,000 read requests for 1.2 million sectors.

Xubuntu issues about 20,500 read requests for 1.4 million sectors.

To be fair, Xubuntu does win the “least RAM used” contest.  After checking the diskstats file, free -m indicated 428 MB in use on Xubuntu, 547 MB on Mint Cinnamon, and 637 MB on Ubuntu Studio.

Xubuntu was a surprising disappointment; besides issuing the most read requests at boot, it also fell short on features.  The keyboard layout wouldn’t work.  I also discovered along the way that its live environment can’t control the backlight on a “late 2012” iMac, unlike mainline Ubuntu.

(Originally, I was hoping to get the info from the live environment, but the automatic media check dashed my hopes.)

Based on the information so far, I don’t have any new evidence that NVMe would be much of a performance boost.  Sure, the numbers are bigger, but I’m looking for a life-altering wow factor, like going from HDD to SSD was.

Added 2021-07-29: Synthetic benchmarks don’t seem to add up to real-world performance.  (As usual!) NVMe makes great gains in CrystalDiskMark, but in actual tasks it turns game load times from 36 seconds to 30, or loading a 1.8 GB Photoshop file from 18 seconds to 11.  To be fair, those tasks are on the same order of magnitude as Linux booting up, which is the most massive task I tend to put my hardware to.  It’s not like the SATA SSD is so slow that my mind wanders while waiting for Firefox to load.

For something that’s marketed as if it’s five to ten times faster, that’s a disappointingly weak effect in practice.  And worse, a PCIe 4.0 drive can cost more than double what the SATA variant does, from the same manufacturer at the same capacity.  For that kind of money, I want the performance to double across the board.  But I’m getting older, wiser, and grumpier.

Sunday, June 20, 2021

What is @system-service?

Because it's subject to change, systemd does not officially publish what makes up "@system-service".  However, as of June 20th, 2021, the git repository defines the system calls allowed in src/shared/seccomp-util.c.  In addition to some more calls, @system-service (SYSCALL_FILTER_SET_SYSTEM_SERVICE) currently includes the following groups:

@aio, @basic-io, @chown, @default, @file-system, @io-event, @ipc, @keyring, @memlock, @network-io, @process, @resources, @setuid, @signal, @sync, and @timer.

It may be more interesting to consider what it doesn't include:

@clock, @cpu-emulation, @debug, @module, @mount, @obsolete, @pkey, @privileged, @raw-io, @reboot, and @swap.

This is not a list of what it "excludes," because system calls may appear on multiple lists; "chown" is permitted by @chown, and the non-inclusion of @privileged doesn't revoke that permission.

Some further observations:

  • @aio includes io_uring.
  • @raw-io is port-based I/O, and configuration of PCI MMIO.
  • @basic-io is read/write and related calls; @file-system is a much larger group that covers access, stat, chdir, open, and so on.
  • @io-event is evented I/O, that is, select/poll/epoll.

Essentially, @system-service is meant to permit a fairly wide range of operations.  This makes it easier to start using (less likely to break things), at the cost of potentially leaving the gates open more than necessary.

As for my own systems, I may start revoking @privileged and @memlock from daemons that run as a non-root user from the start.  Other than that, this exercise didn't turn out to be very informative.
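A sketch of that revocation (the unit is hypothetical): repeated SystemCallFilter= settings merge, and a leading ~ makes a line subtractive, so the broad allow list and the carve-outs combine like this:

```
[Service]
# Start from the broad allow list...
SystemCallFilter=@system-service
# ...then carve out groups a non-root daemon shouldn't need.
SystemCallFilter=~@privileged @memlock
```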

Update on 2021-06-25: The command systemd-analyze syscall-filter will show all of the built-in sets; systemd-analyze syscall-filter @privileged will show the filters for a specific set (in this example, the "@privileged" list of calls.)

Tuesday, May 4, 2021

aws-vault on macOS Catalina

I recently installed aws-vault on my old iMac, for malware hardening.  If any finds me, I don’t want it to be able to pick up ~/.aws/credentials and start mining cryptocurrency.

One problem: there were a lot of password prompts, and the “aws-vault” keychain did not appear in Keychain Access as promised.

But it’s okay; there’s a security command that works with it, if we know the secret internal name of the keychain: aws-vault.keychain-db.  (I found this by looking in my ~/Library directory for the keychains; most of them have several files, but aws-vault has only that one.)

As in:

  1. security show-keychain-info aws-vault.keychain-db
  2. security set-keychain-settings -l -u -t 900 aws-vault.keychain-db
  3. security lock-keychain aws-vault.keychain-db
  4. security set-keychain-password aws-vault.keychain-db (this will prompt for current and new passwords)

Regarding the set-keychain-settings subcommand: each option specifies a setting.  -l locks the keychain whenever the screen is locked, -u locks the keychain after a time period, and -t {seconds} sets that time period. So, in the example, we are setting the keychain to lock again after 900 seconds.  There do not seem to be documented opposites of these options, so I would guess that running the command without a flag removes that setting.  I haven’t tested that, though.

I added the lock-keychain subcommand to my logout file [~/.zlogout for zsh, but formerly ~/.bash_logout].  Whenever I exit an iTerm2 tab, everything is locked again.  (I’ve been clearing SSH keys from the agent in that file for a long time.)
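A sketch of what such a ~/.zlogout might contain, assuming the key-clearing is done with ssh-add (exact contents will vary):

```
# ~/.zlogout: runs when a login shell exits (e.g., closing an iTerm2 tab)
ssh-add -D 2>/dev/null                        # drop loaded SSH keys from the agent
security lock-keychain aws-vault.keychain-db  # re-lock the aws-vault keychain
```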

I’ve chosen to “Always Allow” aws-vault to access the keychain, which means I need to put in the password only for the unlocking operation, not for every time the binary wants to access information within the keychain.