Sunday, November 29, 2020

Discontinuous Complexity (2007)

Editor's Note: this is a post from my old Serendipity weblog, "Pawprints of the Mind," which was originally posted over 13 years ago, on 2007-10-24. Notably, this was written before I knew about jQuery, which improved on Prototype, eclipsed it entirely, and has since fallen out of fashion. The text below is reproduced verbatim from the original post.

When a system gets sufficiently large, changes become surprisingly difficult. It's as if a gremlin gets into the system, and then perfectly reasonable estimates end up being only halfway to the mark. Or less. This annoys managers, who made a bad decision because the estimate turned out bad, and it annoys engineers because they know their estimate was sensible. Why does this happen?

Slowing Down

I think the key factor is that the system exceeds the size of an individual developer's working memory. In the same way that virtual memory is a lot slower than real memory, development slows down considerably when it exceeds the mind. Tracking work on paper is not instantaneous, and the average developers' choice is to just forget stuff instead. Not that anyone can know when they've forgotten something, or else it wouldn't be forgotten.

The problem with the just-forget method is that it makes coding a lot more time-consuming. You end up writing the same thing, several times, each time accounting for a new layer of information which was forgotten, but later rediscovered. After so much work, you think the code must be beautiful and perfect, until you run it. Another layer of forgotten information becomes apparent, but this time, it has to be painstakingly rediscovered through debugging. There could be several more debug cycles, and if you're unlucky, they can trigger another redesign.

Paper is no panacea either; besides its slowness, it seems to be impossible to track all your thoughts, or sort the relevant from irrelevant. There's nothing like getting halfway through a paper design and then realizing one key detail was missing, and a fresh design cycle must begin. If you're unlucky, there's still a key detail missing.

This overflow is what makes the change so abrupt. There's a sudden, discontinuous jump downward in speed because the system passes a critical point where it's too big to track. Normal development activity has to be rerouted to deal with all the work that has to be done to make "small" changes to the code, and it becomes a huge drain on speed, productivity, and morale. It's no fun to work obviously beyond our capabilities, and the loss of productivity means the speed of accomplishments (and their associated rewards) diminishes as well.

Anticipation

If development must slow when an application reaches a certain size, is there something we can do to stop it from becoming so large in the first place? Could we see this complexity barrier coming, and try to avoid it?

I'm not sure such a thing is possible. Stopping development to go through a shrink phase when the critical point is approaching would require us to be able to see that point before it arrives. The problem is that complexity is easier to manage as it builds up slowly. It's not until some amount of forgetting has happened that we are confronted with the full complexity.

Also, the tendency to break a system down into components or subsystems, and assign separate people to those systems, allows the complexity of the system as a whole to run far ahead of the individual subsystems. By the time we realize the subsystems are out of hand, the whole is practically tangled beyond repair. Besides, your manager probably doesn't want to spend time repairing it even if it wasn't that big.

Familiar Conclusions

No matter what angle I try to approach improving my programming skill from, I keep arriving at the same basic conclusion: that the best systems involve some sort of core, controlled by a separate scripting or extension language. The oldest success of this approach that I know of in common use today is Emacs, which embeds a Lisp dialect for scripting. Having actually made a script for Vim, I have to say that using a ready-made language beats hacking together your own across a handful of major revisions of your program.

I've really begun to see the wisdom in Steve Yegge's viewpoint that HTML is the assembly language of the Web. With SGML languages, document and markup are mixed together, and most of the HTML template you coded up is basically structural support for the actual template data. Even in a template-oriented language like PHP or Smarty™ built on top of HTML, you're forever writing looping code and individual table cells. With a higher-level markup language, you could conceivably just ask for a table, and have it worry about all the exact row/cell markup.

The other major option to reduce the complexity of Web applications, which has apparently been enthusiastically received already, is to put libraries into the languages we have to make them more concise for the actual programming we do. One obvious effort on that front is Prototype, which smooths over (most) browser incompatibilities and adds a number of convenient ways to interact with the Javascript environment. Prototype-based scripts are barely recognizable to the average JS programmer. At what point does a library become an embedded language, if that's even a useful distinction?

In the end, understandable programs come from small, isolated parts. Pushing all variables in the known universe into a template one-by-one is not as simple as providing an interface that lets the template find out whatever it needs to know. Laying out HTML by hand is not as concise as sending a cleaner language through an HTML generator. (And no, XML is not the answer.) Libraries can help, but sometimes, nothing but a language will do.

No comments: