Sunday, May 12, 2024

Using tarlz with GNU tar

I have an old trick that looks something like:

$ ssh HOST tar cf - DIR | lzip -9c >dir.tar.lz

The goal here is to pull a tar archive from the server and compress it locally, trading bandwidth and client CPU for reduced server CPU usage.  I keep this handy for when I don’t want to disturb a small AWS instance too much.

Since then, I learned about tarlz, which can compress an existing tar archive with lzip.  That seemed like what I wanted, but naïve usage resulted in an error:

$ ssh HOST tar cf - DIR | tarlz -z -o dir.tar.lz
tarlz: (stdin): Corrupt or invalid tar header.

It turned out that tarlz only works on archives in POSIX format, and (modern?) GNU tar produces them in GNU format by default.  Pass GNU tar the --posix option (equivalent to --format=posix) on the server side to make it all work together:

$ ssh HOST tar cf - --posix DIR | \
    tarlz -z -o dir.tar.lz

(Line broken on my blog for readability.)
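
If you want to see which format a given tar produces before piping it into tarlz, here is a rough sketch that inspects the magic and version bytes of the first header.  It assumes GNU tar is the local tar; the tar_format helper and the file names are made up for illustration.  Note that "posix" here only means the ustar-family magic, which both the ustar and pax formats share.

```shell
#!/bin/sh
# Hypothetical helper: classify a tar file by the 8 bytes at offset 257
# of its first header (the magic and version fields).
tar_format() {
    magic=$(dd if="$1" bs=1 skip=257 count=8 2>/dev/null | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        7573746172003030) echo posix ;;    # "ustar\0" followed by "00"
        7573746172202000) echo gnu ;;      # "ustar  \0" (GNU and old GNU)
        *) echo unknown ;;
    esac
}

# Build a tiny directory and archive it in both formats (GNU tar assumed).
tmp=$(mktemp -d)
mkdir "$tmp/DIR"
echo hello > "$tmp/DIR/file.txt"
tar -C "$tmp" --format=gnu -cf "$tmp/gnu.tar" DIR
tar -C "$tmp" --format=posix -cf "$tmp/posix.tar" DIR

gnu_result=$(tar_format "$tmp/gnu.tar")
posix_result=$(tar_format "$tmp/posix.tar")
echo "gnu.tar: $gnu_result"
echo "posix.tar: $posix_result"
rm -rf "$tmp"
```

With GNU tar, the first archive should be reported as gnu and the second as posix.  To check a remote tar, save the first 512 bytes of its output to a file and pass that file to the helper.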

Bonus tip: it turns out that GNU tar will auto-detect the compression format on read operations these days.  Running tar xf foo.tar.lz will transparently decompress the archive with lzip, as long as the lzip program is installed.
