Decoded Node: What If: Weak Memory Pages

Raymond Chen wrote about the "what if everybody did this?" problem of applications written to consume memory up to some threshold and free some of it under pressure: if multiple applications are each trying to maintain a different threshold, then the one with the smallest free-memory threshold wins. Of course, the extreme of this is a normal application that doesn't try to do anything fancy, which acts like it has a negative-infinity threshold. If it never adjusts its allocations in response to free memory, then it always wins.

Some of the solutions batted around in the comment thread involve using mmap() or other tricks to try to get the OS to manage the cache, but this brings up its own problems.

The obvious problem with mmap is that, under pressure, the OS will still treat the mapped data as precious, and write it back to the file. This is almost indistinguishable from swapping normal memory, but since we're using mmap(), this is a separate file from the actual swap partition. That means it may live on top of all sorts of other layers: the filesystem itself, a volume manager, software RAID, a transparently encrypted and/or compressed block device, and/or a block device pointing to network-attached storage. The final one is also a roundabout way of having a networked file system, while hiding the traditional signs of NFS from applications. In any case, the network may be invoked by accident. On traditional hard disks, separating the application cache's swap space from the actual swap partition may cost more in disk seeks when swapping actually happens. On a solid-state drive, the user may not actually want the cache swapped to it at all, to prolong the drive's longevity.
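To make the write-back point concrete, here's a minimal sketch in C, assuming a POSIX system. The function name `cache_reaches_backing_file` and the `/tmp/weak-cache-XXXXXX` template are made up for illustration. It stores a message through a `MAP_SHARED` mapping, forces the write-back with `msync()` (under real memory pressure the kernel would pick its own moment), and then reads the file with `pread()` to confirm that the "cache" contents landed in the file.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes `msg` through a MAP_SHARED mapping of a temporary file, then reads
 * the file back with pread(). Returns 0 if the file holds the message,
 * -1 on any error or mismatch. */
int cache_reaches_backing_file(const char *msg)
{
    assert(msg != NULL);

    char path[] = "/tmp/weak-cache-XXXXXX";   /* hypothetical cache file */
    size_t len = strlen(msg) + 1;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* the file lives on until fd is closed */

    int result = -1;
    if (ftruncate(fd, page) == 0) {
        char *cache = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (cache != MAP_FAILED) {
            memcpy(cache, msg, len);          /* dirties the mapped page */

            /* Ask for the write-back explicitly instead of waiting for
             * the kernel to decide the page should go to the file. */
            if (msync(cache, (size_t)page, MS_SYNC) == 0) {
                char *copy = malloc(len);
                if (copy != NULL &&
                    pread(fd, copy, len, 0) == (ssize_t)len &&
                    memcmp(copy, msg, len) == 0)
                    result = 0;
                free(copy);
            }
            munmap(cache, (size_t)page);
        }
    }
    close(fd);
    return result;
}
```

Calling `cache_reaches_backing_file("hello")` should return 0 wherever `/tmp` is writable. Nothing in the interface lets the caller say "don't bother writing this out," which is the complaint.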

So that's a lot of points against mmap, which are all consequences of its swapping, since the application doesn't have a way to tell the OS that "this data isn't that valuable—you can discard dirty pages." What would happen if we added a system call for that, and extended the malloc family to request it?

Well, the first question is, "How do you know when the OS discarded the page?" If you just try to read the memory, you'll most likely generate a segfault, which you'll have to distinguish somehow as a weak-page access rather than an ordinary bad pointer, and then do something about it, because the machine can't just keep executing. That's why it generated the fault in the first place! Java can get away with an exception, because the language intrinsically provides them, along with the necessary stack-unwinding control. C, on the other hand, tries to avoid having a stack visible in the language.
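For a sense of what "do something about it" looks like in C today, here's a rough sketch, assuming POSIX signals. There is no real weak-page API here: `mprotect(PROT_NONE)` is my stand-in for "the OS discarded the page," and `weak_read_byte` and `demo_weak_read` are hypothetical names. The only way out of the fault is to jump, via `sigsetjmp()`/`siglongjmp()`, which is a hand-rolled exception.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf weak_fault_escape;   /* resume point after a fault */

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(weak_fault_escape, 1);  /* returning would re-run the load */
}

/* Reads one byte from addr. Returns 0 and fills *out on success, or -1 if
 * the access faulted. Uses process-global state, so it is not thread-safe. */
int weak_read_byte(const volatile unsigned char *addr, unsigned char *out)
{
    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);  /* some systems raise SIGBUS instead */

    int status;
    if (sigsetjmp(weak_fault_escape, 1) == 0) {
        *out = *addr;                  /* may fault */
        status = 0;
    } else {
        status = -1;                   /* arrived here from on_fault() */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return status;
}

/* Maps one anonymous page, stores 42 in it, optionally "discards" it by
 * revoking access, then reads it back. Returns 42, -1 on a fault, or -2 if
 * the setup itself failed. */
int demo_weak_read(int discard)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -2;
    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -2;
    p[0] = 42;
    if (discard && mprotect(p, (size_t)page, PROT_NONE) != 0) {
        munmap(p, (size_t)page);
        return -2;
    }
    unsigned char value = 0;
    int status = weak_read_byte(p, &value);
    munmap(p, (size_t)page);
    return status == 0 ? value : -1;
}
```

`demo_weak_read(0)` should hand back 42 and `demo_weak_read(1)` should hand back -1. Note what it took for a single byte: two signal handlers installed and removed, a global jump buffer, and a wrapper around every access. The handler also can't tell a discarded weak page from a plain wild pointer.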

Alternatively, if reading a discarded page always returns 0, then you can't store data there that's legitimately zero. Otherwise, you can't distinguish a discarded page from your legitimate data. Even a struct: if you read individual members into the CPU, then those individual members cannot be zero, or you're opening yourself to weird time-of-check/time-of-use scenarios where the struct validated, but you got invalid data back by the time you tried to read your "real" target.
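The time-of-check/time-of-use hazard is easy to simulate in plain C. This is a toy model, not a real kernel interface: `struct cache_entry`, `simulate_discard`, and `read_entry` are names I invented, and the "discard" is just a `memset()` injected between the two loads.

```c
#include <assert.h>
#include <string.h>

struct cache_entry {
    int valid;   /* nonzero means "populated" */
    int value;   /* payload; 0 is a legitimate value here */
};

/* Stand-in for the kernel discarding the page: every byte now reads 0. */
void simulate_discard(struct cache_entry *e)
{
    memset(e, 0, sizeof *e);
}

/* Check-then-use read. If discard_between is nonzero, a discard is injected
 * between the two loads. Returns 0 if the entry looked valid at check time
 * (and stores the payload in *out), -1 otherwise. */
int read_entry(struct cache_entry *e, int discard_between, int *out)
{
    if (!e->valid)
        return -1;             /* time of check */
    if (discard_between)
        simulate_discard(e);   /* the page vanishes right here */
    *out = e->value;           /* time of use: may now read a bogus 0 */
    return 0;
}
```

With an entry of `{1, 7}`, a normal read reports success with 7. A read with the injected discard also reports success, but hands back 0, and only the next read notices the entry is gone. The caller can't tell that 0 apart from a stored zero.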

Not to mention, I'm unsure exactly how that zero read would be implemented: either "mapped-but-zero" has to be supported in the hardware's memory management, or the OS's fault handler has to go disassemble the instruction at the user IP and write a 0 into the appropriate place. Trying to do anything else, like delivering a signal, seems to run into the exact same problems as the segfault: how does the application set itself up to abort when the signal handler returns? Does it have to add checks before every write using data that was read from a weak page? How does anyone keep track of all that?

This also adds concerns to the system's memory manager. It has a new type of page to handle everywhere, along with a discarding policy. If pages can be discarded out of the middle of the discardable space, then discard-fragmentation can happen. Depending on how the application re-allocates weak space, the system may never be able to fill it again.

I feel like I'm writing Have You Ever Legalized Marijuana? but shorter and focused on only one example.

If all of the above are solved, there's still a pathological case where an application discovers it lost some weak pages, re-populates them (as weak pages), and then tries to use them, only to discover they're gone already because of continuing memory pressure. This same problem occurs with ordinary weak data structures in Java, when memory is low. The application needs to know what data is precious enough to keep in ordinary memory, and what can be moved to a weak area. Moving data is easier in Java, because there aren't unrestricted pointers, so moving an object need not invoke copy constructors or invalidate any pointers.
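Here's a toy model of that re-populate loop and one possible way out: give up on the weak area after a few tries and keep the result in ordinary memory. Everything in it is hypothetical (`struct weak_slot`, `struct pressure`, `fetch_with_fallback`, `expensive_answer`), and the pressure model is just a countdown of discards that will strike between filling the slot and using it.

```c
#include <assert.h>
#include <stdbool.h>

struct weak_slot { bool present; int value; };

/* Hypothetical pressure model: how many more discards are coming. */
struct pressure { int discards_left; };

static void maybe_discard(struct weak_slot *s, struct pressure *p)
{
    if (p->discards_left > 0) {
        p->discards_left--;
        s->present = false;
    }
}

/* Stand-in for whatever is expensive to rebuild. */
int expensive_answer(void)
{
    return 42;
}

/* Refills the weak slot up to max_attempts times. If every refill is
 * discarded before it can be used, computes once more and returns that
 * value directly, so the caller holds it in ordinary memory. The number of
 * compute() calls is stored in *recomputes. */
int fetch_with_fallback(struct weak_slot *s, struct pressure *p,
                        int max_attempts, int (*compute)(void),
                        int *recomputes)
{
    *recomputes = 0;
    for (int i = 0; i < max_attempts; i++) {
        if (!s->present) {
            s->value = compute();
            s->present = true;
            (*recomputes)++;
        }
        maybe_discard(s, p);   /* pressure strikes between fill and use */
        if (s->present)
            return s->value;
    }
    (*recomputes)++;
    return compute();          /* strong copy: it cannot vanish */
}
```

With two pending discards and three attempts, the slot survives on the third fill, after three computations. With five pending discards, all three fills are lost and the fallback costs a fourth. The retry limit is precisely the "how precious is this data?" decision, and the application has to make it.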

All this, to drag down the entire performance of every application on the system, in order to provide support for a few apps to not-swap? It's probably not worth the effort.

I briefly considered "what if the OS could send low-memory notifications to applications?" but that still divides apps into two categories: ones that react, and ones that ignore the information. The latter ones will then win all the "excess" memory from the ones who play nicely.

Wednesday, January 25, 2012
