Thursday, November 27, 2014

Versioning

Dan Tao asks, “In all seriousness, what’s the point of the MINOR version in semver? If the goal is dependency stability it should just be MAJOR and PATCH.”

I think I finally remember the answer. It’s set up to flag no compatibility, source compatibility, and binary compatibility. The “thing” that dictates bumping the minor shouldn’t be “features” so much as binary-incompatible changes. For example, GTK 2: if code was written against 2.4.0 using the shiny new file chooser, it would still compile against 2.12. But, once it had been compiled against 2.12, the resulting binary wouldn’t run if 2.4 was in the library path. A binary compiled against 2.4.2 would run on 2.4.0, though, because every 2.4.x release was strictly ABI compatible.
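
That reading also matches how such a library is consumed at build time: you ask for a minimum version and trust later releases in the same major series to keep the old interfaces. A tiny illustration of that kind of check (gtk+-2.0 is the real pkg-config module name; the echo is just for show):
pkg-config --atleast-version=2.4 gtk+-2.0 && echo "GTK 2.4 or newer found"
A system with 2.12 installed passes this check, which is exactly the source-compatibility promise the minor level is tracking.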

IIRC, they had a policy of forward compatibility for their ABI, so that a binary compiled against 2.4 would run on 2.12, but I don’t remember if that’s actually necessary for SemVer. Another way to look at this is, “If we bump the minor, you’ll need to recompile your own software,” where “your software” could be programs using a library, or plugins for a program.

I believe that’s the motivation for SemVer to include minor, but upon reflection, it doesn’t really make sense in a non-binary world. If there is no binary with baked-in assumptions about its environment, then that layer isn’t meaningful.

Also upon reflection, most of my currently-tracked projects don’t use SemVer. The AWS SDK for PHP is (in 1.x/2.x) organized as “paradigm.major.minor”, where 2.7 indicates a breaking change vs. 2.6 (DynamoDB got a new data model) but e.g. 2.6.2 added support for loading credentials from ~/.aws/credentials. PHP itself has done things like add charset to the PDO MySQL DSN in 5.3.6. When PHP added the DateTime class, it wasn’t a compatible change, but it didn’t kick the version to 6.0.0. (They were going to add it as Date, but many, many people had classes named that in the wild. They changed to DateTime so there would be less, but not zero, breakage.)

So I’ve actually come to like the AWS 2.x line, where the paradigm represents a major update to fundamental dependencies (like moving from straight cURL and calling new on global classes, to Guzzle 3, namespaces, and factory methods) and the major/minor conveys actual, useful levels of information. It makes me a bit disappointed to know they’re switching to SemVer for 3.x, now that I’ve come to understand their existing versioning scheme. If they follow on exactly as before, we’ll have SDK 4 before we know it, and the patch level is probably going to be useless.

I think for systems-level code, SemVer is a useful goal to strive for. But the meta point is that a project’s version should always be useful; if minor doesn’t make sense in a language where the engine parses all the source every time it starts, then maybe that level should be dropped.

At the same time, the people that SemVer might be most helpful for don’t really use it. It doesn’t matter that libcool 1.3.18 is binary compatible with libcool 1.3.12 that shipped with the distro, because the average distro (as defined by popular usage) won’t ship the newer libcool; they’ll backport security patches that affect their active platform/configuration. Even if that means they have effectively published 1.3.18, it’ll still be named something like 1.3.12-4ubuntu3.3 in the package manager. Even a high-impact bug fix like “makes pressure sensitivity work again in all KDE/Qt apps” won’t get backported.

Distros don’t roll new updates or releases based on versions, they snapshot the ecosystem as a whole and then smash the bugs out of whatever they got. They don’t seem to use versions to fast-track “minor” updates, nor to schedule in major merges.

One last bit of versioning awkwardness, and then I’m done: versions tend to be kind of fuzzy as it is. Although Net::Amazon::DynamoDB’s git repo has some heavy updates (notably, blob type support) since its last CPAN release, the repo and the CPAN release have the same version number stored in them. When considering development packages in an open-source world, “a version” becomes a whole list of possible builds, all carrying that version number, and each potentially subject to local changes.

Given that, there’s little hope for a One True Versioning scheme, even if everyone would follow it when it was done. I suspect there’s some popularity around SemVer simply because it’s there, and it’s not so obviously broken/inappropriate that developers reject it at first glance. It probably helps that three-part versions are quite common, from the Linux kernel all the way up to packages for interpreted languages (gems or equivalents).

Wednesday, November 26, 2014

AppArmor and the problems of LSM

AppArmor is a pretty excellent framework.  It clearly works according to its design.  There are only two major problems.  The first is that apps can't give out useful error messages when an AppArmor check fails.

Since LSMs insert themselves inside the kernel, the only thing they can really do if an access check fails is force the system call to return EPERM, "permission denied."  An app can't actually tell whether it has been denied access because of filesystem permissions, LSM interference, or the phase of the moon, because the return code can't carry any metadata like that.  Besides which, it's considered bad practice to give an attacker detail about a failing security check, thus the ubiquitous and uninformative "We emailed you a password reset link if you gave us a valid email" message in password reset flows.
Thus, the hapless administrator does something perfectly reasonable, AppArmor denies it, and it becomes an adventure to find the real error message.  AppArmor tends to hide its messages away in a special log file, so that the normal channels and the app's log (if different) only show a useless EPERM message for a file that the app would seem to have access to upon inspection of the filesystem.
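
When I go hunting for that real error message, something like this usually turns it up (which file the denials land in varies with the distro and whether auditd is running):
sudo dmesg | grep DENIED
sudo grep DENIED /var/log/syslog /var/log/audit.log 2>/dev/null
Both are just searching for the apparmor="DENIED" records that the kernel emits.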

Adding more trickiness, AppArmor itself doesn't always apply to an application, so testing a command from a root shell will generally work without issue.  Those root shells are typically "unconfined" and don't apply the profiles to commands run from them.
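
Before trusting a shell test, it's cheap to check confinement directly; both of these are standard AppArmor interfaces:
cat /proc/$$/attr/current    # a typical root shell prints "unconfined"
sudo aa-status               # lists loaded profiles and the processes they confine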

The other main problem is that it requires profile developers to be near-omniscient.  It's nice that tcpdump has a generic mechanism to run a command on log rotation, but the default AppArmor profile only sets up access for gzip or bzip2... and even then, if it's a *.pcap file outside of $HOME, it can't be compressed because the AppArmor profile doesn't support creating a file with the compressed extension.  (That can be fixed.)

It's nice that charon (part of strongswan) has a mechanism to include /etc/ipsec.*.secrets so that I could store everyone's password in /etc/ipsec.$USER.secrets ... but the profile doesn't let charon list what those files are even though it grants access to the files in question.  So using the include command straight out of the example/documentation will, by default, allow the strongswan service to start... but it won't be able to handle any connections.
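
For the record, the moving parts look roughly like this.  The include line is straight out of the ipsec.secrets documentation; the override filename and the rule are my best guess at a fix, on the theory that expanding the glob requires read access on the directory itself:
# /etc/ipsec.secrets
include /etc/ipsec.*.secrets
# /etc/apparmor.d/local/usr.lib.ipsec.charon (profile name is a guess)
/etc/ r,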

I had SELinux issues in the past (which share all the drawbacks of AppArmor since it's just another LSM) when I put an Apache DocumentRoot outside of /var/www.  In that case, though, I disabled SELinux entirely instead of trying to learn about it.

tl;dr: it's pretty much a recipe for Guide Dang It.

Friday, November 21, 2014

Add permissions to an AppArmor profile

Background: I tested something in the shell as root, then deployed it as an upstart script, and it failed.  Checking the logs told me that AppArmor had denied something perfectly reasonable, because the developers of the profile didn't cover my use case.

Fortunately, AppArmor comes with a local override facility.  Let's use it to fix this error (which may be in /var/log/syslog or /var/log/audit.log depending on how the system is set up):
kernel: [ 7226.358630] type=1400 audit(1416403573.247:17): apparmor="DENIED" operation="mknod" profile="/usr/sbin/tcpdump" name="/var/log/tcpdump/memcache-20141119-072608.pcap.gz" pid=2438 comm="gzip" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
What happened?  requested_mask="c" means the command (comm="gzip") tried to create a file (given by the name value) and AppArmor denied it.  The profile tells us, indirectly, which AppArmor file caused the denial; substitute the first '/' with the AppArmor config dir (/etc/apparmor.d/) and the remaining slashes with dots.  Thus, if we look in /etc/apparmor.d/usr.sbin.tcpdump, we find the system policy.  It has granted access to *.pcap case-insensitively anywhere in the system, with the rule /**.[pP][cC][aA][pP] rw.  I used that rule to choose my filename outside of $HOME, but now the -z /bin/gzip parameter I used isn't able to do its work because it can't create the *.gz version there.
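
For context, the capture that trips this uses tcpdump's built-in rotation plus post-rotate compression.  This is a sketch rather than my exact service command; -G rotates the file on an interval in seconds, the -w name goes through strftime, and -z runs the given program on each finished file:
sudo tcpdump -i eth0 -G 3600 -w /var/log/tcpdump/memcache-%Y%m%d-%H%M%S.pcap -z /bin/gzip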

Incidentally, the command differs from the profile in this case, because tcpdump executed gzip.  It's allowed to do that by the system profile, which uses ix permissions—that stands for inherit-execute, and means that the gzip command is run under the current AppArmor profile.  All the permissions defined for tcpdump continue to affect the gzip command.
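
In the system profile, that permission shows up as a rule shaped like this (I'm quoting the form, not the exact shipped line):
/bin/gzip ixr,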

Anyway, back to that fix I promised.  AppArmor provides /etc/apparmor.d/local/ for rules to add to the main ones.  (Although this can't be used to override an explicit deny like tcpdump's ban on using files in $HOME/bin.)  We just need to add a rule for the *.gz, and while we're there, why not the *.bz2 version as well?
/**.[pP][cC][aA][pP].[gG][zZ] rw,
/**.[pP][cC][aA][pP].[bB][zZ]2 rw,
The trailing comma is part of AppArmor's rule syntax (every rule ends with one), so it's not a problem here.  Note also that we don't need to specify the binary and braces, since the #include line in the system profile is already inside the braces.

Ubuntu ships some files in the local directory already; we should be able to run sudo -e /etc/apparmor.d/local/usr.sbin.tcpdump and add the lines above to the existing file.  Once the file is ready, we need to reload that profile to the kernel.  Note that we use the system profile here, not the one we just edited:
sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.tcpdump
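If the edited file has a syntax error, apparmor_parser will refuse it and print a complaint; otherwise, a quick way to confirm the profile is loaded:
sudo aa-status | grep tcpdump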
I'm not clear enough on how AppArmor works to know whether the service needs a restart after a profile reload (I'm purposely leaving my tcpdump service file/discussion out of this post), so I restarted mine just to be safe.

Wednesday, November 19, 2014

Unison fatal error: lost connection to server

My unison sync started dying this week with an abrupt loss of connection.  It turns out, when I scrolled back up past the status spam, that there was an additional error message: Uncaught exception Failure("input_value: bad bigarray kind")

This turns out to be caused by a change of serializer between ocaml 4.01 and 4.02.  Unison compiled with 4.02 can't talk to pre-4.02 unisons successfully, and the reason mine failed this week is that I did some reinstalls following an upgrade to OS X Yosemite.  So I had homebrew's unison compiled with 4.02+, and Ubuntu's unison on my server, compiled with 4.01.
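
Confirming the mismatch is quick, since unison reports its version with a standard flag ("myserver" here is a placeholder):
unison -version
ssh myserver unison -version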

Even if I updated the server to Ubuntu utopic unicorn, it wouldn't solve the problem.  ocaml 4.02 didn't make it into utopic because it wasn't released soon enough.

So, time to build unison from source against ocaml built from source!

One problem: ocaml 4.01 doesn't build against clang... and on Yosemite, that's all there is.  I ended up hacking the ./configure script to delete all occurrences of the troublesome -fno-defer-pop flag, then redoing the install.  Per the linked bug, the flag was added in the 1990s to avoid tickling a code generation bug in gcc-1.x.  Newer clangs won't complain about this particular flag, and newer ocamls won't use it, but I chose to stick with the old (semi-current) versions of both.
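
I did the surgery by hand, but a one-liner over the configure script amounts to the same thing; the targets afterward are the stock ocaml build steps:
sed -i.bak 's/-fno-defer-pop//g' configure
./configure
make world.opt
sudo make install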

Happily, that seemed to be the only problem, and the rest of the unison build (namely, the text UI; I didn't want a heap of dependencies) went fine.

(And this, friends, is why you should define a real file format, instead of just dumping your internal state out through your native serializer.  For things you control, you can have much broader cross-compatibility.)

Wednesday, November 12, 2014

Web project repository layout

I used to stuff all my web files under the root of the project repository when I was developing "a web site", and include lots of tricky protections (is a random constant defined and equal to a known value?) to 'protect' my includes from being directly accessed over the web.  I was inspired at the time by a lot of open-source projects that seemed to work this way, including my blog software at the time (serendipity); when you downloaded the project, all of its files were meant to be FTP'd straight into your server's document root.

However, these days I'm mostly developing for servers my company or I administer, so I don't need to make the assumption that there is "no place outside the DocumentRoot."  Instead, I designate the top level of the repo as a free-for-all, and specify one folder within as the DocumentRoot.

This means that all sorts of housekeeping can go on up in the repo root, like having the Composer files and vendor tree.  Like having the .git folder and files safely tucked away.  Like having a build script that pre-compresses a bunch of JavaScript/CSS... or a Makefile.  Or an extra document root with a static "Closed for maintenance" page.  The possibilities are almost endless.

Some of our sites rely on RewriteRules, and that configuration also lands outside the DocumentRoot, to be included from the system Apache configuration.  This lets us perform rewrites at the more-efficient server/URL level instead of filesystem/directory level, while allowing all the code to be updated in the repo.  When we change rewriting rules, that goes right into the history with everything else.
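
A minimal sketch of that arrangement, with made-up paths and a made-up rule:
# In the system VirtualHost config:
#   Include /srv/example-site/conf/rewrite.conf
# conf/rewrite.conf, versioned alongside the code:
RewriteEngine On
RewriteRule ^/reports/([0-9]+)$ /pdf-loader.php?id=$1 [PT]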

To give a concrete example, a website could look like this on the server:
  • public/
    • index.php
    • login.php
    • pdf-builder.php
    • pdf-loader.php
    • css/
    • js/
    • img/
  • tcpdf/
    • (the TCPDF library code)
  • built-pdf/
    • 20141011/
      • 1f0acb7623a40cfa.pdf
  • cron/
    • expire-built-pdfs.php
  • conf/
    • rewrite.conf
  • composer.lock
  • composer.json
  • vendor/
    • aws/
      • aws-sdk-php/
    • guzzle/
    • ...
In this case, the DocumentRoot is the public folder.  The site uses TCPDF to generate private PDFs into built-pdf and lets the user download them, doing access control in pdf-loader.php.  It divides the built PDFs up by day so they can be expired conveniently by the cron script, it has some rewrite rules, and it uses Composer.  (Disclaimer: I pretty much made up all the details here, but the theory is broadly applicable.)
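
The expiry job then boils down to deleting day-directories past some age.  The real script here is PHP, but a shell equivalent would be something like this (the path and the seven-day window are assumptions):
find /srv/example-site/built-pdf -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -r {} +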

Again, this doesn't really work for letting other people publish on a shared host (rewrite.conf could conceivably do anything allowable in VirtualHost context, not just rewrite), but we own the whole host.