Thursday, May 23, 2013

A Look at Failing Polymorphism

First, a brief recap of Steve Yegge's "When Polymorphism Fails" in case it moves again. Let's skip to the part where he builds an example in which polymorphism doesn't help the programmer, using a user-extensible online game as the premise.  A user wants to add an OpinionatedElf monster; how does she do it?
Let's say the OpinionatedElf's sole purpose in life is to proclaim whether it likes other monsters or not. It sits on your shoulder, and whenever you run into, say, an Orc, it screams bloodthirstily: "I hate Orcs!!! Aaaaaargh!!!" (This, incidentally, is how I feel about C++.)

The polymorphic approach to this problem is simple: go through every one of your 150 monsters and add a doesMrOpinionatedElfHateYou() method.

Gosh. That sounds insanely stupid.
Yegge then compares that to an alternative, implemented within the OpinionatedElf (who is the one doing the judgement in the first place):
public boolean doesElfLikeIt ( Monster mon )
{
    if ( mon instanceof Orc ) { return false; }
    if ( mon instanceof Elf ) { return true; }
    ... <repeat 150 times>
}
There's also the viable-in-Ruby approach of opening all the classes and adding the doesElfLikeIt method to each of the 150 monsters. But still, there's a lingering problem with all of the approaches so far:
If someone adds a Gremlin, your elf will be stuck screaming something like "Gosh, what's THAT?" until you can update his code to include a case for gremlins.
I ran into my own instance of this problem recently.

I'm converting among file formats (think something like midicomp) and I want to allow users to add new formats, potentially with features not supported by the library as shipped, and have them seamlessly convertible between the extended and built-in formats.  In other words, users should be able to add features, formats, or both, without crashing anything that was included with the library.

In a conventional OO design, it would seem that either the formats need to know about all possible feature-types, or that the features know how to represent themselves in all their representable formats.  Or the real nightmare, there's a tree with features × formats classes that are manually double-dispatched on the combination of feature and format.

The real sticking point of the problem is that the class hierarchy is hard-coded at ship time.  There's little opportunity for a downstream user to extend the structure "internally" without either editing the library herself, or resorting to some serious (and potentially very broken) changes to the class properties.  Example of the latter, php's runkit: "For all those things you... probably shouldn't have been doing anyway..."

The cleanest way I can think of to solve this combinatorics problem is through the use of a table.  Let the OpinionatedElf include a Hash<String,Boolean>, and then the code for doesElfLikeIt can simply check the hash table for the result of mon.class.name.  Offer other classes the ability to add themselves to the table, and the defaults problem is also reduced: the author of Gremlin can tell OpinionatedElf what to think of it.

In more duck-typed languages, this also gives the user freedom from the library's type hierarchy.  There's no real reason that Portamento must derive from ControlOpArg3, technically, so there's no reason to force the library user to do so.  It also makes the "how does the format render something" question independent of the type hierarchy—there's no need to teach the library about ControlOpArg4 because the concrete operation can be added to the table directly.

No comments: