Friday, March 30, 2018

Commit Logs vs. External Systems

There are a couple of schools of thought on commit logs: should you make them detailed, or should they be little more than a pointer to some sort of external pull request, ticketing, or ALM system?

When people have external systems, the basic problem they face is that those systems seem to be redundant with the information in the commit log.  Why write the same thing twice, when the commit log can just point to the ticket, which undoubtedly has a lot of rationale and reasoning already written into it?

But in my experience, it’s far more useful to have good commit log messages. When hunting in the code for a problem, the commit log is right there and available to tools like git log --grep, which can also print the diff alongside the log messages.
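To make that concrete, here is a sketch of the kind of search I mean. The ticket number comes from the example later in this post; the file path is purely hypothetical:

```shell
# Find every commit whose message mentions a ticket number,
# printing the full diff underneath each log message:
git log --grep='18444' -p

# Narrow the search to commits that touched a particular path
# (this path is made up for illustration):
git log --grep='cancel' -p -- billing/settings.py
```

Because the messages travel with the repository, this works offline and keeps working no matter which ticketing system is fashionable that year.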

And while some tools like GitHub pull requests offer line-by-line commentary on diffs, the other interesting thing about commit logs is that they’ve proven much more resilient over time.  Our ticket tracking has migrated from my boss’s inbox, to Lighthouse, to a bespoke ticketing system that we later integrated into our support team’s workflow, and which has since become something we want to split back out.  And then we might replace the “remaining half” of the bespoke system with some off-the-shelf solution.

Meanwhile, our commit logs have been preserved, even across a move from Subversion to git, so they go back to the point when the founders first realized they should set up version control.  But the references to “Lighthouse” have been broken for years, and if we succeed in killing a huge pile of custom code nobody wants to maintain, all those “ticket #16384” references are also going to be useless.

But the commit messages will live on, in a way that ticketing systems have not.  And so, I don’t really trust that we’d stick to GitHub and have their issue and pull request systems available for every change, forever.

Aside from that, I think a succinct summary of the ticket makes a good commit message.  I try to avoid repeating all the ticket babble, status updates, and dead ends in the commit:

    Contoso requested that their cancellations be calculated on a
    90-day cliff instead of days remaining in term.  Add
    cancel_type=cliff and cancel_cliff_days=90 to the client
    settings.  Ticket 18444.

This gives the big-picture outlook on what happened and why, and lets the diff speak for itself on whether the change suits the intention.  If there are questions about whether the true intention was understood, then the ticket is still linked, so it can be examined in further detail.

Tuesday, March 20, 2018

Supposedly Readable Code

There are two hard problems in Computer Science: cache invalidation, and naming things. —Phil Karlton

The problem with long-term evolution of a codebase is that the compatibility requirements end up creating constraints on the design.  Constraints that may be felt for a decade, depending on how history unfolds.

What my company now refers to as “Reserve,” which seems to be fairly universally understood by our clients and b2b partner soup, was initially called “Cost.” That was replaced by “Escrow” because the “Fee” is also a cost, just a different kind.  But escrow didn’t sit right among people who make deals and sign contracts all day, because it wasn’t necessarily being held by a third party.  (Depending on what kind of hash the salesmen made of it, it was held by either the first or second party.)

The point is, before coming up with a universally acceptable term, we needed some term, so Cost and Escrow got baked into the code and database structure to a certain extent.  Along with Reserve.

When someone new comes along, their first instinct is to complain about how “confusing” it is.  And I can see that.  It’s a single concept going by three names.

You get used to it, though.  As you work with it repeatedly, the concept gets compressed in your brain.  Here it’s Cost, there it’s Reserve, it’s the same thing in both places.

But, getting used to it is a symptom of the “ignoring weak signals” problem.  (Is there a better name for that?  “Normalization of deviance” is heavy, too.) If we hired enough people, it would be a clear source of suckage that we’d really want to fix.

On the other hand, I’d love to do a cost-benefit analysis and find out just how important it really is to fix.  Unfortunately, that depends on measuring the “loss of productivity” caused by the multiple names, and measuring productivity at all is difficult.  I think the experimental design would also require fixing the problem in order to get a decent measurement of single-name productivity.

Therefore, it ends up easy to ignore weak signals because they’re weak, and we don’t know what we’re missing by doing so.

Another justification for ignoring them is that we can’t act on them all.  We have to prioritize.  After all, developers tend to be disagreeable.  I know—whenever I’m making a “quick bugfix” in some code I don’t own, I have to suppress strong urges to “fix” all the things, from naming to consistency to style conventions to getting rid of the variable in $x = thing(); return $x;.  I’m pretty sure the rest of the team does the same for my code.

The funny thing is, I bet each one of us on the team thinks we write the most readable code.  I’ve been doing this longer than anyone, and I put a lot of effort into it.  I standardized my table alias names, and I wish everyone else followed that, because the code was a lot easier for me to read when “clients” was just “C” and not a mix of “C”, “c”, “cl”, or “cli” depending on which SQL statement one happens to be reading.

Between synonyms and the irritating slipperiness of natural language, then—is there such a thing as “readable code?”  There’s certainly code that’s been deliberately obfuscated, but barring that: can we measure code readability? Or is it just doomed to be, “my latest code is best code,” forever?

Saturday, March 10, 2018

Linux’s Hazing Ritual

What happens the instant a user wants to install some non-graphical software on their system?  What if they want to tweak something that happens to be considered an “advanced” configuration setting?

All the advice for these situations begins, “Go to a terminal, and type sudo ...”

Whatever is being solved is a common enough problem that there’s advice for it all over the internet, but it’s not common enough that anyone can build a graphical interface for it?

I think about this every time I have to install the VirtualBox Guest Additions.  The GUI package managers have shifted from showing all software on the system to only showing “graphical” software, insisting there are no results for packages like build-essential.  The package is there, it’s just being “helpfully” hidden from me, even when asking by its exact name.

So instead, I am expected to go off to the command line, and either paste a string from a web page, or carefully type it out in its entirety, with exact spelling.

The idea of exploiting web pages comes up from time to time—styling allows additional text to copy but not display—but the general response from the community is “Don’t do that, then.”  Paste it into vim if you’re not sure? How does a user even know to do that in the first place? Can anyone, including expert users, ever be “sure” about it?

I understand why directions are given as terminal commands.  They’re easy to write, and if the copy/paste mechanism is working and trustworthy, they’re efficient to execute that way.

But the underlying cultural problem—that everyone must be able to use the command line as a basic prerequisite to managing their system—is an obstacle to creating a more user-friendly design.  That in turn becomes an obstacle to getting widespread usage.

People tell themselves it doesn’t matter, but consider how hard Microsoft worked at getting “a tablet PC” going for a decade before Apple showed up with the iPad.  Tech circles predicted a quick death for it, just as they had for the iPod and perhaps the iPhone, but it defined the segment instead.

What was the difference?  The iPad didn’t require stabbing at mouse-sized bits with a delicate stylus.  It was designed around the finger, with no fallback, producing internal pressure to make sure the finger interface worked, worked well, and worked for everyone.

In Linux, the terminal exists by default and remains the only way to do some things, making it inescapable.