Decoded Node

Monday, October 29, 2012

When Layering Goes Bad

A lot of systems are built in layers. Games are often split into engines and scripts. Another classic is the "three tier" architecture with data storage, application model, and view-controllers split out onto individual machines.

But more often, I run into systems where code is taking a core sample instead of building on the layers.


Thursday, October 25, 2012

War Story: Apache, SSL, and name-based vhosts

Note: this post was written about a year ago, before we completed some major upgrades to our infrastructure. I meant to post it as soon as we were done, but it got buried under too many other posts and drafts. The original post follows, without edits for temporal accuracy.

You can do it The Right Way and use SNI if:

You don't care about Internet Explorer (7 and 8) on Windows XP.
You have Apache 2.2.12 or newer.
You have OpenSSL 0.9.8f or newer with TLS extensions; extensions are included by default in 1.0.

If all of these are okay for you, then setup is a matter of enabling NameVirtualHost *:443 and setting up TLS vhosts in much the same way as regular vhosts.

Otherwise, you have to try a bit harder.


Tuesday, October 23, 2012

Labels

I figured out my underlying problem with Yegge's liberal/conservative (libertarian/authoritarian) division of programming cultures.

People like looking down on those considered inferior. "Conservative" adds another way to do just that.

Tuesday, October 2, 2012

Compile Time

You might have heard that in Lisp, the whole language is there all the time. You can read while compiling, eval while reading, and so on. This isn't necessarily exclusive to Lisp—Perl offers BEGIN/CHECK/UNITCHECK—but it isn't exactly common in mainstream languages.

At first, it sounds brilliant. "I can use my whole language to {read the configuration | filter some code on-the-fly | whatever} for super fast run-time performance!" But there's a consequence that nobody seems to realize until they've gone far down that path: once you write code that runs during compilation, you can no longer guarantee that a compile-test switch like perl -c is safe to use.

This is almost a trivial statement: compile testing has to compile the code; you're running code at compile time; ergo, your compile-time code will run. But beware of the details:

If you read your configuration files and exit if something's wrong, then you must now have a valid configuration to run a compile test.
Generalizing the previous: if you pull anything from an external service, your compile test depends on that service being up. It may also depend on having your credentials for that service available.
If you do a ton of work to prepare a cache for runtime, you have to wait for that—then the compile test finishes and throws it all away.
If you have an infinite loop in compile-time code, the compile test never completes. That's not a problem for a human at the keyboard, but it can be hard to deal with in a script (e.g. VCS commit hooks).
If the language allows you to define reader macros or source filters at compile time, then you can't even syntax-check the source without running the compile-time code; the lex phase now depends on the execution state that accumulates during compilation.
If your code assumes the underlying platform is Unix because that's what the server is, you can't compile test on Windows. Or, you have to write your whole compile phase cross-platform.

If you want to execute expensive code or do sanity checks before run time, consider carefully where they would best be placed. Perl's INIT can give you the same "run once, before runtime" behavior without affecting a compile test. Separate, automated tests can be configured to interrupt neither your compile checks, nor the production system on failure. (Sometimes, a 90%-working production system is desirable, compared to 0%.)
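To make the BEGIN-versus-INIT distinction concrete, here is a toy model in Python rather than Perl. It is only a sketch: the Program class, its method names, and the log messages are hypothetical, and it models nothing beyond the ordering described above, where a compile test runs the compile-time blocks and then stops.

```python
# Toy model of Perl-style phases. All names here are hypothetical.
class Program:
    def __init__(self):
        self.begin_blocks = []    # run while compiling (like BEGIN)
        self.init_blocks = []     # run once, after compiling, before main (like INIT)
        self.main = lambda: None  # the run-time body

    def compile(self):
        # Compiling means running every compile-time block.
        for block in self.begin_blocks:
            block()

    def compile_check(self):
        # Model of a compile test such as perl -c: compile, then stop.
        self.compile()

    def run(self):
        self.compile()
        for block in self.init_blocks:
            block()
        self.main()


log = []
program = Program()
program.begin_blocks.append(lambda: log.append("begin: load filters"))
program.init_blocks.append(lambda: log.append("init: warm cache"))
program.main = lambda: log.append("main")

program.compile_check()
check_log = list(log)  # only the compile-time block ran

log.clear()
program.run()
run_log = list(log)    # compile-time block, then init block, then main
```

In this model the expensive "warm cache" step never runs during the compile check, which is the property you get from moving work out of compile time.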

Friday, September 28, 2012

An Exercise in Optimizing PHP

Last winter, I was optimizing a PHP reporting script for no real reason besides practicing optimization.


Tuesday, September 25, 2012

Radical Clojure

Apparently, last month I missed Yegge's post and followup regarding software liberals and conservatives.

One of the things that caught my eye, that's clearly a Nerd Trap but so powerful that I need to answer anyway, is this little quote:

But the reality is that Clojure is quite conservative. As far as I'm concerned, if you can't set the value of a memory variable without writing code to ask for a transaction, it's... it's... all I can think of is Morgan Freeman in the Shawshank Redemption, working at the grocery store, complaining "Forty years I been asking permission to piss. I can't squeeze a drop without say-so."

Emphasis and vivid FUD in original.

There's just one missing word, though, that makes all the difference:

You can't set the value of a shared memory variable outside of a transaction.

Shared. Global. Possibly still in use by someone else.

Clojure is utterly pointless without understanding that time is explicit and everything is an immutable value (unless you have a Java native thing.) Values last, unmodified, until a reader asks for an update, and so a writer must be forbidden from modifying (destroying) that memory. There's a whole paradigm hiding there, which you can see again if you look at Datomic. Clojure without immutable values might as well be JRuby.

The other trick up Clojure's sleeve is that 'transaction' implies something rather more heavyweight outside of Clojure than in it. Again, because immutability is pervasive, Clojure's transactions don't have to do read tracking. When anyone reads an old value, it's going to stay what they read. It's like RCU, conceptually, except that every read is protected and nobody needs to copy because the messy details are taken care of on write.
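A rough illustration of why immutable values make reads cheap, written in Python rather than Clojure. This is a sketch under loud assumptions: the Ref class and its deref/alter methods are hypothetical stand-ins, a plain lock takes the place of a transaction, and none of this is how Clojure's STM is actually implemented (it coordinates multiple refs and retries, which this does not).

```python
import threading


class Ref:
    """Hypothetical holder for a shared, immutable value (not Clojure's STM)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def deref(self):
        # Readers take no lock and register nothing: they just get the
        # current value, which nobody will ever modify in place.
        return self._value

    def alter(self, fn):
        # Writers are serialized. Each one builds a new value from the old
        # one and swaps the reference; the old value is left untouched.
        with self._lock:
            self._value = fn(self._value)
            return self._value


accounts = Ref(("alice", "bob"))        # tuples are immutable
snapshot = accounts.deref()             # a reader grabs the current value
accounts.alter(lambda names: names + ("carol",))

# The reader's snapshot is unchanged; new readers see the new value.
old_view = snapshot
new_view = accounts.deref()
```

The point of the sketch is the last two lines: the writer never disturbed what the earlier reader holds, so the reader needed no protection of its own.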

And if it's not actually shared? Then you should use a var with a thread-local binding, which only one thread can see, and therefore doesn't need to be updated in a transaction!

Clojure is an interesting language. I'd still recommend it.

Added 26 Sept: I think that liberal/conservative, as applied by Yegge, divides languages according to how much each one insists on its own philosophy. Things like Erlang and Clojure get ranked "conservative" right in with Pascal because they have Ways to Do Things, even if they are progressive ideas (functional and parallel/async are strongly encouraged). Then Perl is more "liberal" because it's equally well-suited to OO, procedural, functional, and concatenative programming, i.e. just barely, and if it didn't have regexes as syntax, it would be long dead already. Python is more conservative than Ruby despite sharing characteristics because "There should be one-- and preferably only one --obvious way to do it." That's a clear Conservative value, right in the middle of the Zen of Python.

In any case, the insight that everyone thinks they're liberal is accurate. Upon some thinking, which is hard and time-consuming and therefore not generally applied on the Internet, "conservative" is a fair enough label for my tendencies at this point in time—because I got burned throwing around too much fire, then came off that job a few months later to take positions in more conservative cultures. But I've never thought of myself as a stodgy dinosaur programmer.

Monday, September 24, 2012

AWS: as-* command parameters

It turns out that the auto scaling commands like as-create-auto-scaling-group are thin wrappers around the AWS SDK for Java. You can read all the commands placed in /opt/aws/apitools/as/bin, but they eventually just set a classpath and run Java on com.amazon.webservices.Cli or so.

Thus, the CLI commands are essentially documented by the Java API Reference. In particular, the as-* family reflects operations on com.amazonaws.services.autoscaling which consume data types in com.amazonaws.services.autoscaling.model.

What I was specifically looking for was documentation on the --health-check and --grace-period options, and ....model.CreateAutoScalingGroupRequest finally has me covered. Health check can be either 'EC2' or 'ELB'. 'EC2' selects the standard (and presumably default) EC2 "is the host hardware alive? can we ping the vm?" instance checks; 'ELB' selects the "can we access this URL on the instance's http server?" health check configured when setting up an elastic load balancer.

Likewise, --grace-period is the delay between starting up a new instance and the point at which auto scaling is allowed to start checking its health.
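As a small model of those two settings, here is a Python sketch. It does not call AWS at all: the function names are hypothetical, the grace period is assumed to be in seconds, and treating the exact end of the grace period as "allowed" is my assumption rather than documented behavior.

```python
from datetime import datetime, timedelta

# The two health check values named above.
VALID_HEALTH_CHECKS = {"EC2", "ELB"}


def validate_health_check(kind):
    """Return kind unchanged if it is a known health check type."""
    if kind not in VALID_HEALTH_CHECKS:
        raise ValueError(
            f"health check must be one of {sorted(VALID_HEALTH_CHECKS)}"
        )
    return kind


def health_checks_allowed(launched_at, now, grace_period_seconds):
    """Return True once the grace period since launch has elapsed.

    Assumption: the boundary itself counts as elapsed.
    """
    return now - launched_at >= timedelta(seconds=grace_period_seconds)


launched = datetime(2012, 9, 24, 12, 0, 0)

# One minute after launch with a 300-second grace period: too early.
early = health_checks_allowed(launched, launched + timedelta(seconds=60), 300)

# Five minutes after launch: health checks may begin.
later = health_checks_allowed(launched, launched + timedelta(seconds=300), 300)
```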