Tuesday, February 23, 2016

PGP

For the first time in over fifteen years of awareness of PGP, I met someone who actually wanted to use it.  I got to set trust on a key and see this awesome menu:
Please decide how far you trust this user to correctly verify other users' keys (by looking at passports, checking fingerprints from different sources, etc.)

1 = I don't know or won't say
2 = I do NOT trust
3 = I trust marginally
4 = I trust fully
5 = I trust ultimately
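
(For anyone who wants to reproduce it: that prompt comes from gpg's interactive key editor, reached with something like the following; the address here is made up.)

$ gpg --edit-key friend@example.org
gpg> trust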

This reveals a lot about the assumptions of PGP and the problem it was trying to solve...

The menu clearly focuses on real-world identities, trying to get users to establish ‘trust’ that real people correspond to their cyberspace identities. (Those digital identities are the keys: anyone holding a key is indistinguishable from anyone else who holds it.) Why else would the focus be on “verifying” by looking at passports and checking fingerprints?

In short, PGP was the first Google+: built by nerds as an identity service for the masses… that failed to become mainstream.

Thursday, February 18, 2016

Unikernels

I learned a new term recently: unikernel.  Wikipedia defines it as a “single address space machine image,” also known as a “library operating system,” but the gist of it is:

Unikernel: A system where a single application runs in unprotected mode.

Instead of a “normal” operating system managing multiple processes, a unikernel arranges the OS services as a library, and runs one process, linked with that library, in kernel mode on the underlying hardware.

This used to be fairly impractical, as you’d have to dedicate physical hardware to the unikernel, and it would still need drivers for that hardware. It was a lot of work and not very portable.  But with virtualization, a unikernel can be built against the hypervisor’s interfaces and then run in any environment that supports that hypervisor.

Here’s the difference from other approaches once virtualization gets in the mix: a unikernel uses the hypervisor only to provide isolation and virtio devices, then runs one process with no further hardware protections inside the guest.  Networking, filesystems, and anything else normally provided by the OS through the syscall interface are instead built as function calls living in the same address space as the main application.
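
As a rough, runnable illustration (this isn’t any particular unikernel’s API; libos_console_write is a made-up stand-in for library-OS code), the point is that “write some output” stops being a trap into a kernel and becomes just another function linked into the image:

#include <stdio.h>
#include <string.h>

/* In a conventional OS this would end in a write() syscall: the process
 * traps into the kernel, which owns the console or network driver.
 * In a unikernel the "driver" is library code linked into the same address
 * space, so it's an ordinary call.  Here stdio fakes the device so the
 * sketch runs anywhere; a real library OS would talk to a virtio back end
 * instead. */
static void libos_console_write(const char *buf, size_t len)
{
    fwrite(buf, 1, len, stdout);
}

int main(void)
{
    const char *msg = "hello from the (pretend) unikernel\n";
    libos_console_write(msg, strlen(msg));  /* no user/kernel boundary implied */
    return 0;
}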

A container uses the host OS kernel in much the same way, running each container as a host process.  While this ties the container to a specific syscall interface, it also means the container environment can take advantage of the host OS’s drivers, filesystems, and networking.  Containers and hypervisors are fully unaware of each other, which also means containers can run on a host OS that’s running directly on bare metal.

A traditional guest system, of course, runs a full OS inside the guest and ordinary processes inside that OS.  Like the unikernel, it boots anywhere with the requisite CPU/virtio support, regardless of the underlying host OS, but like a container, it continues to provide OS services like networking, filesystems, and process management through the syscall interface.

(Plenty of other ink has been spilled about the security of all this, but the tl;dr is that hypervisors are what more-or-less secures multi-tenant hardware.  Running in the cloud means being virtualized, even with containers.)

Here’s a rough drawing of two virtual-machine architectures (KVM and Xen), and how they differ from containers and unikernels:

[Figure: KVM and Xen guest stacks drawn alongside containers and unikernels]

Obviously, unikernels are also a bigger change in application architecture than containers (a single package and its dependencies, minus a kernel) or a full OS, but they’re well suited to single-language microservices.  Once a language has the “OS library” to provide networking and filesystems (if needed), any app written to that library can be compiled to a unikernel.

The unikernel, essentially, gives up composability in the pursuit of speed. Additional processes can’t be added in, because the unikernel doesn’t support processes or process isolation; it couldn’t, without losing its essential difference from an ordinary OS or container.  (Threads are possible, but not processes.)
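
In practice, that means a design that would normally fork() worker processes has to be recast as threads (or an event loop) inside the one image.  Here is a tiny sketch of that shape as an ordinary pthreads program (compile with -pthread); nothing unikernel-specific, just the structure a unikernel forces on you:

#include <pthread.h>
#include <stdio.h>

/* With no fork() or exec() and no second process, concurrency inside a
 * unikernel comes from threads (or an event loop) sharing the single
 * address space.  This plain pthreads program just shows the shape. */
static void *worker(void *arg)
{
    long id = (long)arg;
    printf("worker %ld handling requests in-process\n", id);
    return NULL;
}

int main(void)
{
    pthread_t tids[4];

    for (long i = 0; i < 4; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (int i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);
    return 0;
}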

Perhaps, again, the difference is best described with a picture:

[Figure: syscalls and shared-library calls in an ordinary OS versus plain function calls inside a unikernel’s single address space]

In the unikernel, what are ordinarily “operating system services” and shared-library calls are compiled into a single address space as ordinary function calls.  Getting rid of the user/system split is, of course, the whole point, and shared libraries also disappear because, with only one process, there is no “other” process to share with.  All the mechanics involved in that sharing become unnecessary.

As noted above, hypervisors have made unikernels much more interesting and feasible lately.  There’s a full list at unikernel.org, particularly their projects page, but the two that look most interesting to me are:

I don’t know when—or even if—I’ll get around to doing anything with them, but those are likely to be the easiest-to-use unikernels, unless someone makes node.js into one.

Sunday, February 14, 2016

API Gateway as an HTTP Service Proxy: Lessons Learned

At work, we’re finishing the implementation of our first API built on the Amazon API Gateway, using it to decouple API key management, logging, and throttling from the actual backend service. This also marks our first OAuth 2.0 Resource Server, and it makes heavier use of Swagger 2.0 across the entire pipeline.  With all these “firsts,” I’d like to share a few notes on our setup.
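
To give a flavor of the wiring (this is a trimmed, hypothetical sketch rather than our actual definition; the service name, resource path, and backend URI are placeholders), the Swagger 2.0 document describes each method and hands the proxying details to API Gateway through the x-amazon-apigateway-integration extension:

swagger: "2.0"
info:
  title: example-service        # placeholder name
  version: "1.0"
paths:
  /widgets:
    get:
      summary: List widgets via the backend service
      responses:
        "200":
          description: OK
      # API Gateway reads this vendor extension to proxy the call to the
      # real backend; key checks, throttling, and logging happen in the
      # gateway before the request is forwarded.
      x-amazon-apigateway-integration:
        type: http
        httpMethod: GET
        uri: https://backend.example.com/widgets   # placeholder backend
        responses:
          default:
            statusCode: "200"

API Gateway imports a definition like this, applies key validation, throttling, and logging in front, and forwards matching requests to the uri named in the integration block.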