Wednesday, September 4, 2013

More FastCGI: Apache 2.4, PHP-FPM, PSGI, and hot deployment

Driven by mod_fcgid's failings at graceful restart and preforking, I've been looking hard for alternatives.

tl;dr

External FCGI is best done via mod_fastcgi on Apache prior to 2.4, or mod_proxy_fcgi on 2.4.  mod_proxy in general is super awesome on 2.4.

php-fpm ships with an init script and reloads gracefully via SIGUSR2, so that's all you really need there.

For Perl, gracefully restarting an FCGI app is difficult, so the better approach for PSGI apps is to run them in an HTTP server with Server::Starter support (e.g. Starman or others).


mod_proxy_fcgi

Although mod_fastcgi no longer compiles out-of-the-box with Apache 2.4, there are two options: use Byte Internet's patch for it, or use the new mod_proxy_fcgi.  I'd obviously recommend the latter, since I prefer first-party code where available.

mod_proxy_fcgi allows for sending requests to external FastCGI servers, exactly as FastCgiExternalServer does for mod_fastcgi, except that it's also integrated into Apache's proxy pool/balancer ecosystem.  The one complication, as compared to mod_fcgid (which is still my current deployment strategy in spite of its warts), is that the mod_proxy way doesn't have any support for spawning the external application from the webserver.
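
For illustration, a minimal Apache 2.4 fragment for this might look like the following. It is a sketch rather than a tested deployment: the addresses 127.0.0.1:9000 and 127.0.0.1:9001, the /app/ and /pool/ prefixes, and the balancer name fcgipool are placeholders of mine, and the module paths depend on how your httpd was built.

```apache
# Modules needed to proxy to an external FastCGI server.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_fcgi_module modules/mod_proxy_fcgi.so

# Simplest form: one external FastCGI app that is already listening
# on its own (Apache will not spawn it for you).
ProxyPass /app/ fcgi://127.0.0.1:9000/

# Balancer form: the same idea, spread over two backends.
# This also needs mod_proxy_balancer and an lbmethod module loaded.
<Proxy balancer://fcgipool>
    BalancerMember fcgi://127.0.0.1:9000
    BalancerMember fcgi://127.0.0.1:9001
</Proxy>
ProxyPass /pool/ balancer://fcgipool/
```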

In the case of PHP, this is not a difficult problem, because php-fpm ships with an init script already.  If you're building from source, I hear it's under sapi/fpm/init.d.php-fpm after building.  It also supports graceful reloads natively: deliver SIGUSR2 to the master process.
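
The reload step can be sketched as a small shell helper. The function name reload_php_fpm and the default pid file path are my own placeholders; the real pid file location is whatever your php-fpm.conf sets.

```shell
#!/bin/sh
# Hypothetical default; the real location comes from php-fpm.conf.
PHP_FPM_PID_FILE=${PHP_FPM_PID_FILE:-/var/run/php-fpm.pid}

# Send SIGUSR2 to the php-fpm master process named in a pid file.
# Usage: reload_php_fpm [pid_file]
reload_php_fpm() {
    pid_file=${1:-$PHP_FPM_PID_FILE}
    if [ ! -r "$pid_file" ]; then
        echo "pid file not readable: $pid_file" >&2
        return 1
    fi
    master_pid=$(cat "$pid_file")
    kill -USR2 "$master_pid"
}
```

Calling reload_php_fpm with no argument uses the default path; pass a path to override it. A packaged init script's reload action normally covers this, so the helper is mostly useful for source builds.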

And while init scripts (or upstart jobs) for Perl are something I can handle writing, the question of reloading requires careful consideration.

Graceful Reloads in Perl/FCGI

Diving into the source, it seems that Plack's -s FCGI option corresponds to loading Plack::Handler::FCGI which connects FCGI to the PSGI world, and manages a worker pool (sized by the --nproc N option to plackup, default of 1) using FCGI::ProcManager.  A --pid filename option is likewise passed to the manager, which writes its pid to the given filename.

For its part, when the manager (parent) process receives SIGHUP, it issues SIGTERM to all children, then respawns them.  This means that the children need to gracefully exit on receipt of SIGTERM.  It does not look like Plack gives this to you on its own.  I'm not sure if there's a reliable (race-free) way to implement the signal handling in the app either, as eventually there's a gap where the app passes responsibility back to the server, but the response may not be finished.  That is, imagine the order of events:
  1. local signal handler installed;
  2. return $psgi_response;
  3. scope exits, un-installing local signal handler;
  4. SIGTERM delivered;
  5. process dies before the PSGI server could send the response it just received from the app.
The same issue (the app doesn't know when the server has finished with the response) plagues the alternative busy/shutdown flag system and permanent global signal handler.  There's no way for $busy to accurately reflect whether there's a client on the other end of a connection waiting for a response, so a signal arriving as in step 4 still causes the disruptive shutdown of step 5.

Incidentally, Plack's integration with FCGI::ProcManager means an extra wasted process if you run plackup via some other FastCGI process manager such as mod_fcgid: there's no point to a manager under the manager to manage one child.
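
For reference, the FCGI setup discussed in this section could be started and reloaded roughly as follows. This is a sketch: the function names and all four arguments are placeholders of mine, and only the plackup options mentioned above (-s FCGI, --nproc, --pid) plus --listen are used.

```shell
#!/bin/sh
# Start a PSGI app as an external FastCGI server via plackup.
# Usage: start_fcgi_app socket_path workers pid_file app.psgi
start_fcgi_app() {
    socket_path=$1   # where the web server will connect
    workers=$2       # worker pool size handed to FCGI::ProcManager
    pid_file=$3      # file the manager process writes its pid to
    app=$4           # the .psgi file to serve
    plackup -s FCGI --listen "$socket_path" --nproc "$workers" \
        --pid "$pid_file" "$app"
}

# Ask the manager to replace its workers: it receives SIGHUP and
# sends SIGTERM to the children. In-flight requests are not
# protected, which is the problem described above.
restart_fcgi_workers() {
    kill -HUP "$(cat "$1")"
}
```

Note that start_fcgi_app stays in the foreground as written, so an init script or upstart job would be the thing calling it.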

Graceful Reloads in Perl/HTTP

All of the above ended up being hours of research for naught, however.  There's a better way, and it's called Server::Starter.  There are actually a number of PSGI servers which play nicely with Server::Starter and plackup already, and which serve the app as HTTP.  (Related: Miyagawa's cheat sheet on PSGI server implementations.)

HTTP isn't actually a big deal.  If we're using mod_proxy to do all the work of forwarding to our external application in the first place, there's no reason not to use HTTP.
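
On the Apache side, switching from FastCGI to HTTP is a small change. A minimal sketch, assuming the app listens on 127.0.0.1:5000 and is mounted under /app/ (both placeholders of mine):

```apache
# Modules needed to proxy plain HTTP to a backend.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Forward requests under /app/ to the backend HTTP server, and
# rewrite Location headers in its redirects back to the public URL.
ProxyPass        /app/ http://127.0.0.1:5000/
ProxyPassReverse /app/ http://127.0.0.1:5000/
```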

Server::Starter manages the listening socket/port, then starts the server, whose pre-forked workers handle incoming connections.  Workers inherit the listening socket via file descriptor, and the descriptor is referenced in the environment (the SERVER_STARTER_PORT variable) so that the worker can find it (and know it's running under Server::Starter).  Thus, a graceful reload involves spawning new workers, then requesting a graceful exit of the old ones.  That happens via SIGQUIT (selected with start_server's --signal-on-hup option, whose default is SIGTERM), which Starman handles itself.
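
A sketch of what this looks like from the shell follows. The function names, port, pid file, and worker count are placeholders of mine; the start_server and starman options are the documented ones as far as I know, but check them against your installed versions.

```shell
#!/bin/sh
# Launch Starman under Server::Starter.
# Usage: start_app_under_starter port pid_file workers app.psgi
start_app_under_starter() {
    port=$1
    pid_file=$2
    workers=$3
    app=$4
    # --signal-on-hup=QUIT: on reload, old servers get SIGQUIT
    # (graceful for Starman) instead of the default SIGTERM.
    start_server --port="$port" --pid-file="$pid_file" \
        --signal-on-hup=QUIT \
        -- starman --workers "$workers" "$app"
}

# Graceful reload: SIGHUP goes to start_server, not to Starman.
reload_app() {
    kill -HUP "$(cat "$1")"
}

# Show what a server sees in SERVER_STARTER_PORT, a ';'-separated
# list of "port=fd" or "host:port=fd" pairs.
list_inherited_sockets() {
    old_ifs=$IFS
    IFS=';'
    for pair in ${SERVER_STARTER_PORT:-}; do
        printf 'listen=%s fd=%s\n' "${pair%=*}" "${pair##*=}"
    done
    IFS=$old_ifs
}
```

The last helper is only there to make the descriptor hand-off visible; Starman does this parsing itself.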

It's all very nice and tidy, and far more reliable than FCGI: if the new workers fail to start, the old ones are still running, and disaster has not struck.  Likewise, the server does the signal handling, where it can do it better than the app.
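
To make the last point concrete, here is a toy shell analogue of signal handling done at the server level. It is not how Starman is implemented; serve_until_term and handle_request are hypothetical names. The handler only sets a flag, and the loop checks that flag between requests, so a SIGTERM that arrives mid-request cannot cut off the response being written.

```shell
#!/bin/sh
# Flag set by the signal handler; checked only between requests.
shutdown_requested=0
trap 'shutdown_requested=1' TERM

# Hypothetical stand-in for the app: turns a request into a response.
handle_request() {
    printf 'response to %s\n' "$1"
}

# Server loop: one request per line on stdin, one response per line
# on stdout. A SIGTERM during handle_request only sets the flag, so
# the current response is completed before the loop exits.
serve_until_term() {
    while [ "$shutdown_requested" -eq 0 ] && IFS= read -r request; do
        handle_request "$request"
    done
}
```

Because the loop owns both the request handling and the response writing, it knows exactly when a request is finished, which is the knowledge the app lacks in the FCGI case.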
