Wednesday, March 31, 2021

The sea change to vendoring and containers

Not so long ago, we would install code to the global runtime path, where it would be available machine-wide.  CPAN was perhaps the earliest to do this, but PEAR, rubygems, and probably others followed.  After all, CPAN was seen as a great strength of Perl.

Users of such a library would write something like use Image::EXIF; or include_once "Smarty/Smarty.class.php"; in their code.  Notice that both of these mechanisms follow an include path to resolve a relative filename.
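As a sketch of that mechanism, an include path is just an ordered list of directories to try (simplified here; real resolvers also handle absolute paths, default extensions, and caching):

```python
import os

def resolve(rel, include_path):
    # Return the first file matching a relative name along an include path.
    for d in include_path.split(os.pathsep):
        candidate = os.path.join(d, rel)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(rel)
```

Whichever directory appears first on the path wins, which is exactly why machine-wide installs get awkward once two applications want different versions of the same file.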

But as servers grew faster than applications did, co-installing multiple applications on one host became attractive.  The problem was that those applications could have different dependency requirements, which complicates the management of a shared include path.

This is about when Bundler happened for Ruby, and its ideas were brought to other ecosystems: Perl got Carton, and PHP got Composer.  These systems build a directory within the project to hold its dependencies, and basically ignore the global include path entirely.
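The effect of those per-project directories can be sketched in a few lines (the layout and package names here are made up for illustration, not any tool's actual structure):

```python
import os
import tempfile

# Two applications, each vendoring its own copy of "somelib" at a
# different version; neither ever consults a shared include path.
root = tempfile.mkdtemp()
for app, version in (("app-a", "1.0"), ("app-b", "2.0")):
    libdir = os.path.join(root, app, "vendor", "somelib")
    os.makedirs(libdir)
    with open(os.path.join(libdir, "VERSION"), "w") as f:
        f.write(version)

def lib_version(app):
    # Resolution happens entirely inside the app's own vendor/ tree.
    with open(os.path.join(root, app, "vendor", "somelib", "VERSION")) as f:
        return f.read()

print(lib_version("app-a"), lib_version("app-b"))
```

Because each app reads only its own tree, upgrading one app's copy never touches its neighbor, which is the whole point of the next two paragraphs.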

At the cost of disk space, this bypasses the problems of sharing dependencies between projects.  It bypasses all the problems of working with multiple versions of dependencies that may be installed in different hosts’ include paths.  It also makes the concept of the language’s include path obsolete, at least in PHP’s case: the Composer autoloader is a more-capable “include path” written in code.  Finally, it allows piecemeal upgrades—a small service can make a risky upgrade and gain experience, before a larger or more important service on the machine needs to upgrade.

Piecemeal upgrades also enable independent teams to make more decisions on their own.  Since their dependencies are not shared with other users, changes to them do not have to be coordinated with other users.

Containers are the next step in the evolution of this process.  Pulling even more dependencies into a container brings all the advantages of local management of packages into a broader scope.  It allows piecemeal upgrading of entire PHP versions.  And it makes an application cost an order of magnitude more space, once again.

In our non-containerized environment, I can switch the PHP version on a host pretty easily, using co-installable versions and either update-alternatives (for the CLI) or a2disconf/a2enconf (for FPM), but this means the services do not really get a choice about their PHP version.  That choice has been made before the application's entry point is ever reached.

Friday, February 5, 2021

What matters? Experimenting with Ubuntu Desktop VM boot time

It seemed slow to me that the "minimal" installation of Ubuntu Desktop 20.10 in a VirtualBox VM takes 37 seconds to boot up, so I ran some experiments.

The host system is a Late 2012 iMac running macOS Catalina.  It has 24 GB of RAM (2x4 GB + 2x8 GB) installed, and the CPU is an Intel Core i5-3470 (Ivy Bridge, 4 cores, 4 threads, 3.2 GHz base and 3.6 GHz max turbo).  In all cases, the guests run with VirtualBox's Ubuntu Linux 64-bit defaults, which is to say, one AHCI SATA drive that is dynamically sized.  The backing store is on an APFS-formatted Fusion Drive.

The basic "37 seconds" number was timed with a stopwatch, from the point where the Oracle VM screen is replaced with black, to the point where the wallpaper is displayed.  Auto-login is enabled for consistency in this test.  This number should be considered ±1 second, since I took few samples of each system.  I'm looking for huge improvements.

So what are the results?

  • Baseline: 1 core, ext4, linux-generic, elevator=mq-deadline.  37 seconds.
  • XFS: 1 core, xfs root, linux-generic, elevator=mq-deadline.  37 seconds.
  • linux-virtual: 1 core, xfs root, linux-virtual, elevator=mq-deadline. 37 seconds.
  • No-op scheduler: 1 core, xfs root, linux-virtual, elevator=noop. 37 seconds.
  • Quad core: 4 cores, xfs root, linux-virtual, elevator=noop. 27 seconds!

The GUI status icons showed mostly disk access, with some heavy CPU usage for the last part of the boot, but only adding CPU resources made an appreciable difference.  I also tried the linux-kvm kernel, but it doesn't display any graphics or even any text on the console, so that was a complete failure.

(I've tried optimizing boot times in the past with systemd-analyze critical-chain and systemd-analyze blame, but it mostly tells me what random thing was starved for resources that boot, instead of consistently pointing to one thing causing a lot of delay.  Also, it has a habit of saying "boot complete in seven seconds" or whatever as soon as it has decided to start GDM, leaving a lot of time unaccounted for.  So I didn't bother on these tests.)

Correction: it was Ubuntu 20.10, not 20.04 LTS.  The post has been updated accordingly.

Wednesday, February 3, 2021

Stability vs Churn Culture

I’m working on rewriting memcache-dynamo in Go.  Why?  What was wrong with Python?

The problem is that the development community has diverged from my goals.  I’m looking for a language that’s stable, since this isn’t the company’s primary language.  (memcache-dynamo is utility code.) I want to write the code and then forget about it, more or less.  Python has made that impossible.

A 1-year release cycle, with a policy of deprecating features for only one cycle before removal, means it's possible for users on Ubuntu LTS to end up in a situation where their previous LTS, two Python versions behind, doesn't yet provide the alternative to something that has already been removed in the next LTS's current Python.

But looking closer to home, it’s a trend that’s sweeping the industry as a whole, and fracturing communities in its wake.  Perl wants to do the same thing, if “Perl 7” ever lands.

Also, PHP 5.6 has been unsupported for two years, yet there are still code bases out there that support PHP 5, and people are (presumably) still running 5.6 in production.  With Enterprise Linux distributions, we will see this continue for years; RHEL 7 shipped PHP 5.4, with support through June 2024.

There’s a separate group of people that are moving ahead with the yearly updates.  PHPUnit for example; every year, a new major version comes out, dropping support for the now-unsupported PHP versions, and whatever PHPUnit functionality has been arbitrarily renamed in the meantime.  The people writing for 5.x are still using PHPUnit 4 or 5, which don’t support 8.0; it’s not until PHPUnit 8.5.12 that installation on PHP 8 is allowed, and that still doesn’t support PHP 7.0 or 7.1.
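The split shows up directly in dependency constraints.  A project that still supports PHP 5 ends up pinned to old tooling; a sketch of such a composer.json might look like this (the exact constraint values are illustrative):

```json
{
    "require": {
        "php": ">=5.6"
    },
    "require-dev": {
        "phpunit/phpunit": "^5.7"
    }
}
```

PHPUnit 5.x was the last series that ran on PHP 5.6, so supporting old PHP means freezing the test framework, too.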

This is creating two ecosystems, and it’s putting pressure on the projects that value stability to stop doing that.  People will make a pull request, and write, “but actually, can you drop php 5 support? i had to work extra because of that.”

The instability of Linux libraries, from glibc on up, made building for Linux excessively complex, and AFAICT few people bother.  Go decided to skip glibc by default when building binaries, apparently because that was the easier path?

Now everyone thinks we should repeat that mistake in most of our other languages.

Friday, January 1, 2021

Daisywriter

When I was growing up, my dad had an old daisy wheel printer hooked up to our computers.  (We had various Commodore hardware; we did not get a PC until 1996, and I'm not sure the Amiga 500 was really decommissioned until the early 2000s.)

The daisy wheel was like a typewriter on a circle.  There were cast metal heads with the letters, numbers, and punctuation on them.  The wheel was spun to position, and then what was essentially an electric hammer punched the letter against the ribbon and the paper.  Then the whole wheel/ribbon/hammer carriage moved to the next position, and the cycle repeated.

This was loud, like a typewriter, but amplified by the casing, with low-frequency vibrations carried through the furniture.  No printing could be done if anyone in the house had gone to bed, or was napping.

It was also huge. It could have printed on legal paper in landscape mode.

Because of the mechanical similarity to typewriters, the actual print output looked like it was really typewritten.  Teachers would compliment me for my effort on that, and I'd say, "Oh, my dad has a printer that does that."

Nowadays, people send a command language like PostScript, PCL, or even actual PDF to the printer, and it draws on the page.  Everything is graphics; text becomes curves.  But the Daisywriter simply took ASCII in from the parallel port, and printed it out.

Wednesday, December 16, 2020

Containers over systemd

“Systemd will solve all your problems,” they said.

Having used a number of systemd’s security features to configure a service, I am beginning to suspect everyone uses containers because container runtimes are trying to be secure already.

It's possible to improve the security of a service with systemd, of course.  I’ve worked hard at it.  But in the end, over half the *.service file is consumed with “trying to build my own container out of systemd directives.”  ProtectHome, ProtectSystem, ProtectKernelTunables, Protect This, Protect That, blah blah blah.  The process starts from insecure by default, and then asks me to layer on every individual protection.  This is exactly the sort of thing Linux zealots used to yell at Microsoft about.  ¯\_(ツ)_/¯

But I digress.  I ended up with an excessively long systemd service configuration file, and to apply that to any other service, there’s no option besides copying and pasting those directives.  With every release of systemd, I have to comb the man pages again to see what else is available now, and carefully apply that to every service file.  It’s not easy to tell whether the security posture is up-to-date when the policy is so verbose.
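For a sense of what "building my own container out of systemd directives" looks like, here is a representative excerpt; every directive is a real systemd option, but this selection is illustrative, not a complete or recommended policy:

```ini
[Service]
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
PrivateTmp=yes
PrivateDevices=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
CapabilityBoundingSet=
SystemCallFilter=@system-service
```

Each line disables one thing the service would otherwise be allowed to do, and omitting any one of them silently leaves that door open.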

[Updated 2022-03-19: systemd-analyze security foo.service is your friend. This is the best way to get a list of everything systemd thinks about security, and whether it is applied to the unit. It's a little less bad than I thought, but it is still fundamentally the Default Permit and Enumerating Badness approaches.]

A container, by contrast, already has an isolated filesystem (its image), so whole classes of configuration (ProtectHome, ProtectSystem, TemporaryFileSystem) become irrelevant.  On top of that, container runtimes start with a more limited set of privileges by default, instead of handing out CAP_SYS_ADMIN and leaving it up to the administrator to carefully disable it.  Escaping from the container runtime is considered a vulnerability; escaping from a poorly-secured systemd service is considered user error.

This is all orthogonal to “containers are interop”, but I think both forces are feeding the containerization craze.  I’m left with the feeling again that systemd should have been the “obvious correct choice,” except they decided usability didn’t matter.

Sunday, December 13, 2020

The world is turning upside-down (2006)

Editor's Note: this is a post from my old Serendipity weblog, "Pawprints of the Mind," which was originally posted nearly 14 years ago, on 2006-12-29. Since then, I can't help but notice that social media created itself on a Pull model—follow who you want—and then replaced it with algorithms to Push into that stream. The text below is reproduced verbatim from the original post.

In the beginning, Push dominated.

A company built a product, and it was Pushed to market. Long ago, newspapers pushed by paper boys on the street carried advertisements. Direct-mail catalogs, pushed through the postal service, were nothing but one long and comprehensive advertisement for the company who created it. The rise of radio and television, both one-way broadcast media, allowed advertisements to be pushed to millions at a time, quickly and easily. Movies and music are produced and pushed, and the producers hope they break even.

After a company pushed their product, they spent plenty of money watching what happened to the sales. Was it going to explode or tank? Was the initial 10,000 piece production run going to be liquidated, or was another run ten times the size waiting in the wings? There was no way to sense demand except to Push supply and watch what happened.

Inevitably, Push was brought to the new media: the Internet. Building on the ideas of direct mail, email lists formed. Spam happened. Spam filters happened. Better spam filters happened. But those may be only a temporary solution.

As Pushing got cheaper, more was pushed, until users fled the deluge. Too much yang leaves people wanting yin. Onto this stage stepped Pull.

Nobody knew it was Pull yet. It called itself RSS or reddit, and it was about users coming and getting it. No more HTML-heavy, graphic-packed email stuffed into their Inbox every other day to make up for the poor quality of the site's search capabilities. No more "Insert-Brand-Here Loyalty Updates". No more subscriptions, passwords, bounce processing, and unsubscribing. No more spam, because the provider no longer needs to know where to send anything; they just wait for users to come and get it.

Pull is about user control. Pull is about saying "I want that" and not having some gatekeeper in the way, trying to extract monopoly rents. This is what scares the recording industry; their value as gatekeepers is plunging as alternative ways of connecting bands and fans arise.

Sunday, December 6, 2020

Can Micropayments Even Work? (2007)

Editor's Note: this is a post from my old Serendipity weblog, "Pawprints of the Mind," which was originally posted over 13 years ago, on 2007-05-13. The text below is reproduced verbatim from the original post.

(This is not entirely academic, as a current goal of my day job essentially amounts to implementing a micropayment system.)

I am beginning to believe that the fundamental problem behind micropayments as a viable option for widespread payment is that credit cards are effectively already micropayments. We're just spoiled by cash. Physical currency is limited by reality. There's not a limitless supply to steal, nor can it be readily created. Undetectable counterfeits cannot be manufactured by poking a few bits inside a computer. Rather, it's difficult to produce high-enough quality counterfeits, which is why good counterfeits only come in $100 notes. The run-of-the-mill counterfeiters are stuck trying to figure out how to make a passable $20 out of card stock or ordinary paper, because anything bigger is subject to too much scrutiny for their materials. (Even then, a local supermarket tests all those, pushing the bar down to $10.)

In essence, I suspect the cost of doing business with a credit card company is mostly the cost of implementing imaginary money securely. The more credit processing costs, the fewer shops join in, and the less of the currency marketshare the creditor ends up with. On the other hand, the services can't be priced so low as to be unprofitable. Not to mention, the more that people use their cards, the more interest the creditor can collect at little extra cost, as all the billing and accounting framework was already in place for that cardholder anyway. Charging less makes economic sense for them, even if they were a monopoly.

A credit card company ends up shaving a few percent off transactions made through them. Micropayments want to be the same thing, only smaller: shave a few percent off penny-sized transactions and make up for it by volume. But the micropayment competes with the credit gateways, if one of the main ways of getting money into the system is to purchase microcurrency on a credit card. Inside the system, the shaving has to be high enough to make up for the transaction cost of the microcurrency being bought and sold, as well as the real costs of doing the transaction and turning a profit.
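(Editor's sketch, not part of the original post: the fee scaling can be made concrete.  The rate below is a commonly quoted card-processing rate today, 2.9% plus a $0.30 fixed fee, used only to illustrate the problem.)

```python
# Why per-transaction fixed fees crush penny-sized payments.
def fee(amount, pct=0.029, fixed=0.30):
    return amount * pct + fixed

for amount in (100.00, 1.00, 0.05):
    f = fee(amount)
    print(f"${amount:.2f} sale pays ${f:.2f} in fees ({f / amount:.0%})")
```

At $100 the fixed fee disappears into the percentage; at five cents the fee is several times the payment itself, which is the gap any micropayment scheme has to close.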

And if micropayments are essentially equivalent to real currency, then they're also equivalently desirable for fraud, stealing, and counterfeiting: something the large creditors are spending plenty of money on for the best and brightest to counteract. This brings up another point: micropayments probably won't have the same amount of consumer trust as credit cards, because personal liability is legally limited to $50 on the cards. This is not the case for micropayments, which is going to make people not want to have too many of them at one time. That in turn limits the total amount of microcurrency that can be circulated, and restricts the market for higher-priced microsales.

Is it possible to best Visa and MasterCard at their own game?