Thursday, June 21, 2018

Version Locking

We have a few lovely workarounds, arrived at through a small measure of blood, sweat, tears, and research, embedded in our cpanfile these days.  Here are my two favorites:

requires 'Email::MIME', '<= 1.928';
    # 1.929+ screws up Content-Type sometimes
requires 'DateTime::Format::Strptime', '<= 1.57';
    # DynamoDB v0.1.16 (current) buggy with rewritten Strptime

Nobody wants to come back and check these, to see if later releases of the packages work again.  (It’s possible I’m the only one who remembers these hacks are here.)

Email::MIME was especially nasty because it only failed sometimes.  Now, how do I prove that the fixes (which may have occurred in 1.931 or 1.933) actually solve the problem?  I can’t prove a negative in testing, and I can’t trust the package in production.

As for the other fun bug, “DynamoDB v0.1.16” refers to the package whose full name is Net::Amazon::DynamoDB, released on November 6th, 2012.  I think this one was detectable at install/test time, but it was still No Fun.  A lot of work was expended in finding out that its dependency had changed, and I’m not excited to redo it all to find out whether Strptime 1.67 (or 1.63) fixed the issues.

Especially since use of Perl was deprecated company-wide, and we want to get it all ported to a new language.

Editor’s Notes: this was another old draft.

Since it was written, we accidentally introduced an updated version of Email::MIME into production, and it still failed.  We fixed the bug that allowed the update to occur; clearly, upstream’s process is broken.  I don’t think we could be the only people hit, across multiple years and multiple versions of the entire software stack.

I’m not entirely sure what happened with Net::Amazon::DynamoDB, but we may have been able to use a newer version of Strptime with it.

Thursday, June 14, 2018

Python, virtualenv, pipenv

I heard (via LWN) about a discussion of Python and virtualenvs.  I'm bad at compressing thoughts enough to both fit Twitter and make sense at the same time, so I want to cover a bit about my recent experiences here.

I'm writing some in-house software (a new version of memcache-dynamo, targeting Python 3.6+ instead of Perl) and I would like to deploy this as, essentially, a tarball.  I want to build in advance and publish an artifact with a minimum amount of surrounding scripts at deploy time.

The thing is, the Python community seems to have drifted away from being able to run software without a complex installation system that involves running arbitrary Python code.  I can see the value in tools like tox and pipenv—for people who want to distribute code to others.  But that's not what I want to do; I want to distribute pre-built code to myself, and as such, "execute from source" has always been my approach.
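To make "build in advance, publish a tarball" a little more concrete, here is a minimal sketch using only the standard library's tarfile module.  The function name build_artifact, the "app" prefix, and the choice to skip bytecode caches are all assumptions for illustration; this is not the actual memcache-dynamo build.

```python
import tarfile
from pathlib import Path


def build_artifact(source_dir, artifact_path, prefix="app"):
    """Pack source_dir into a gzipped tarball, skipping bytecode caches.

    Hypothetical helper: the name, the prefix, and the exclusion rules
    are illustrative assumptions, not taken from any real deploy script.
    """
    source_dir = Path(source_dir)

    def skip_caches(member):
        # Returning None from a tarfile filter drops that member
        # (and, for a directory, everything beneath it).
        parts = Path(member.name).parts
        if "__pycache__" in parts or member.name.endswith(".pyc"):
            return None
        return member

    with tarfile.open(str(artifact_path), "w:gz") as tar:
        # arcname puts everything under one top-level directory,
        # so unpacking never scatters files into the current directory.
        tar.add(str(source_dir), arcname=prefix, filter=skip_caches)
    return artifact_path
```

At deploy time, the surrounding script would then only need to unpack the tarball and start the interpreter against the extracted directory, with no install step running arbitrary code.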

[Update 2018-09-06: I published another post with further thoughts on this problem.]

Thursday, June 7, 2018

Stream Programming, Without Streams

Editor’s note: Following is a brief draft from two years ago.  I’m cleaning out the backlog and faking HUGE POST COUNTS!! for 2018, I guess.

I wrote before that I don't “use” stream programming, but I've come to realize that it's still how I think of algorithms and for loops.  Input enters at the top, gets transformed in the middle, and yields output at the bottom.

It's like I look at a foreach loop as an in-line lambda function.  The concept may not be explicitly named, and the composition of steps may be built out of lower-level control flow… but inside my head, it's still a “sequence of operations” on a “stream of data.”
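Here is a small sketch of that mental model, written in Python rather than PHP only to keep it short and runnable.  The function names and sample data are hypothetical.  Both versions do the same work; the first is the plain loop I would actually write, and the second spells out the stages that the loop body already implies.

```python
def total_lengths_loop(lines):
    """Plain loop: input at the top, transformation in the middle, output at the bottom."""
    results = []
    running_total = 0  # state visible to every step in the loop body
    for line in lines:
        text = line.strip()          # step 1: transform
        if not text:                 # step 2: filter
            continue
        running_total += len(text)   # step 3: accumulate
        results.append((text, running_total))
    return results


def total_lengths_stream(lines):
    """The same steps, with the first two pulled out as explicit generator stages."""
    stripped = (line.strip() for line in lines)
    non_empty = (text for text in stripped if text)
    running_total = 0
    for text in non_empty:
        running_total += len(text)
        yield text, running_total


sample = [" ab ", "", "cde\n"]
assert total_lengths_loop(sample) == list(total_lengths_stream(sample))
```

The loop version keeps every step in one scope, so the running total is simply a local variable; the staged version has to decide which stage owns it.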

There doesn't seem to be much benefit to building up “real” streams in a language that doesn't have them either built in or in the standard library.  It creates another layer where the things that have been built can only interoperate with themselves, and a series of transformations can no longer share state.  And a PHP array can be passed to any array_* function, where (last I checked) our handmade streams or Iterators cannot.