Wednesday, August 22, 2012

Think of the Olden Days!

My first Linux machine had 128 MB of RAM.  The bzip2 warnings that you needed at least 4 MB of RAM to decompress any archive seemed obsolete at the time (even our then-4.5-year-old budget computer had shipped with twice that for Windows 95 RTM) and downright comical now that I have 4,096 MB at my disposal.

I was compressing something the other day with xz, which was taking forever, so I opened up top: only one core was under heavy use.  Naturally.  The man page lists a -T<threads> option... that isn't implemented, because won't someone think of the memory!

OK, sure.  Judging by the resident 93 MB, it appears to be running at xz -6; with four cores, that's still under 10% of my RAM.  The only ways it could come close to hurting are to run at xz -9, which consumes 8 times the memory and would seriously undermine the "reasonable speed" goal even with four threads; to run on 44 cores without more RAM; or to run on a dual-thread system with 256 MB.  The concern seems nearly obsolete already... will we be reading the man page in 2024 and finding that there are still no threads because threads use memory?
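The back-of-envelope math above can be checked in a few lines of shell. All figures are the post's own (93 MB resident for xz -6, 8x that for -9, 4,096 MB of RAM); this is a sketch of the arithmetic, not a measurement.

```shell
PER_THREAD_MB=93          # observed resident size of a single xz -6 process
THREADS=4
RAM_MB=4096

# Four threads at -6:
TOTAL_MB=$((PER_THREAD_MB * THREADS))
PCT=$((100 * TOTAL_MB / RAM_MB))
echo "xz -6 x $THREADS threads: ${TOTAL_MB} MB (~${PCT}% of RAM)"

# The scary case: -9 costs roughly 8x the memory of -6.
TOTAL9_MB=$((PER_THREAD_MB * 8 * THREADS))
PCT9=$((100 * TOTAL9_MB / RAM_MB))
echo "xz -9 x $THREADS threads: ${TOTAL9_MB} MB (~${PCT9}% of RAM)"
```

Even the scary case leaves a quarter of the machine free; only the -9-with-many-cores combination actually exhausts 4 GB.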

The point of this little rant is this: someone has a bigger, better system than you.  Either one they paid a lot of money for and would like to see a return on, or one they bought further in the future than yours.  If you tune everything to work on your system today, shifted left or right by one, you have a small window of adaptability that will soon be obsolete.  Especially pertinent here: parallelizing compression adds no requirements on the decompressor.  A single-thread system unpacks the stream just as well, only more slowly; unlike the choice of per-thread memory, which forces the decompressor to allocate enough to handle the compression settings.
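The asymmetry in that last sentence is worth making concrete. What the decompressor must allocate is the dictionary the compressor chose; thread count never enters into it. The dictionary sizes below are from the xz(1) preset table, and the "+1 MiB" overhead is an approximation for illustration, not an exact figure.

```shell
# Decompression memory tracks the dictionary size picked at compression
# time (xz -6 uses an 8 MiB dictionary, xz -9 a 64 MiB one), plus a
# small overhead, approximated here as 1 MiB.
for preset in 6 9; do
  case $preset in
    6) DICT_MIB=8 ;;
    9) DICT_MIB=64 ;;
  esac
  echo "xz -$preset: ${DICT_MIB} MiB dictionary -> decompressor needs ~$((DICT_MIB + 1)) MiB"
done
# Threads change none of this: a single-thread machine decompresses the
# same stream with the same memory, just more slowly.
```

So a multi-threaded compressor costs the *compressing* machine more memory, while a -9 archive costs every *decompressing* machine more memory forever; the man page worries about the cheaper of the two.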

(As with gzip and bzip2, parallel xz utilities exist, but only pbzip2 has made it into the repository.)
