Sunday, May 11, 2025

Thoughts from Trying Generators in PHP

I am late to the party, but I have been playing with Generators in PHP more, and running into their limitations at module boundaries.

Some module might produce a Generator so that iteration can be performed in chunks, reducing peak RAM.  For example, producing results one store at a time, instead of loading up all stores into a giant array.  Code that processes an entire database table, but wants to lower lock contention and memory use, can also benefit; it can use a Generator to isolate the fetch-in-pages logic from processing the individual records.  The consumer sees one stream of results, while the Generator fetches more as needed.
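To make the fetch-in-pages idea concrete, here is a minimal sketch.  The names streamRecords and $fetchPage are made up for illustration, and an in-memory array stands in for the database table so that the example runs on its own; a real version would issue a paged query instead.

```php
<?php

/**
 * Yield records one at a time while fetching them in pages.
 *
 * $fetchPage is a hypothetical callable: given an offset and a limit,
 * it returns an array of at most $limit records (empty when done).
 */
function streamRecords(callable $fetchPage, int $pageSize): Generator
{
    $offset = 0;
    do {
        $page = $fetchPage($offset, $pageSize);
        foreach ($page as $record) {
            yield $record;
        }
        $offset += count($page);
        // A short (or empty) page means there is nothing left to fetch.
    } while (count($page) === $pageSize);
}

// Stand-in for a database table (an assumption, not a real query).
$table = range(1, 7);
$fetchPage = function (int $offset, int $limit) use ($table): array {
    return array_slice($table, $offset, $limit);
};

// The consumer sees a single stream and never learns about pages.
$seen = [];
foreach (streamRecords($fetchPage, 3) as $record) {
    $seen[] = $record;
}
```

Only one page is held in memory at a time, and the paging logic can change without touching the loop that processes the records.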

In short, there are plenty of use cases.

The problem comes when a caller wants to pass “the data” produced by the Generator to another function or method that specifically takes an array.  Once that happens, either the destination needs to be reworked to accept the broader iterable type, or the efforts toward efficiency are erased by an iterator_to_array() call.
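A small sketch of that boundary, using hypothetical functions (storeNames, countArray, countIterable) that exist only for this example.  It shows the array declaration rejecting a Generator, the iterator_to_array() escape hatch, and the widened iterable declaration.

```php
<?php

function storeNames(): Generator
{
    yield 'north';
    yield 'south';
}

// Declared for arrays only: passing a Generator raises a TypeError.
function countArray(array $items): int
{
    return count($items);
}

// Widened to iterable: accepts arrays and any Traversable, Generators included.
function countIterable(iterable $items): int
{
    $n = 0;
    foreach ($items as $_) {
        $n++;
    }
    return $n;
}

$rejected = false;
try {
    countArray(storeNames());
} catch (TypeError $e) {
    $rejected = true;
}

// The escape hatch: materialise everything first, giving up the memory
// savings.  Passing false discards the generator's keys and reindexes.
$viaArray = countArray(iterator_to_array(storeNames(), false));

// The reworked destination consumes the stream directly.
$viaIterable = countIterable(storeNames());
```

Widening to iterable has a cost of its own: the destination can no longer use array-only functions such as count() or array_map() on its argument.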

(Of course, back when generators were introduced to PHP, I didn’t use type declarations, so I could have gotten away with throwing a generator at something that assumed it would receive an array or PDOStatement. Dealing with larger teams and beginning to use an IDE were both great reasons to add the type information, and the array type forbids passing a Generator in its place.)

A separate issue is that anything consuming a Generator (thus, anything that accepts iterable) needs to be aware of its once-only nature: unlike an array, a Generator cannot be iterated a second time.  This only sometimes becomes a problem.  For instance, a template may want to display aggregate statistics over the data set ahead of the main output, which means walking the data twice.
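Here is a minimal sketch of the two-pass situation, with a made-up readings() generator standing in for the data set.  The first loop computes the aggregate; the second loop, which would be the template's main output, fails because the Generator is already finished.

```php
<?php

function readings(): Generator
{
    yield 3;
    yield 5;
    yield 8;
}

$data = readings();

// First pass: compute the aggregate.
$total = 0;
foreach ($data as $value) {
    $total += $value;
}

// Second pass: the generator has already finished, so PHP throws an
// Exception rather than silently starting over.
$secondPassFailed = false;
try {
    foreach ($data as $value) {
        echo $value, "\n";
    }
} catch (Exception $e) {
    $secondPassFailed = true;
}
```

A consumer that truly needs two passes has to buffer the values into an array, or ask the producer for a fresh Generator.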

Generators can also produce “return” values, which code that knows it is dealing with a Generator can fetch with getReturn() once the regular values have been produced.  (I might change my mind later, with more experience, but it doesn’t pass the vibe check.  It feels a lot like requiring methods of a class to be called in a specific order, which is usually best to avoid.)  Relying on return values implies that the entire system should lean into handling Generators in particular, and not allow them to mix with other iterable types.
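The ordering concern can be sketched as follows.  importRows is a hypothetical generator that yields the usable rows and returns a count of the ones it skipped; asking for that count before the iteration has finished throws.

```php
<?php

function importRows(array $rows): Generator
{
    $skipped = 0;
    foreach ($rows as $row) {
        if ($row === null) {
            $skipped++;
            continue;
        }
        yield $row;
    }
    // The return value is not part of the iteration; it only becomes
    // available after the last value has been consumed.
    return $skipped;
}

// Asking too early: the generator has not returned yet.
$early = importRows(['a', null, 'b']);
$tooEarly = false;
try {
    $early->getReturn();
} catch (Exception $e) {
    $tooEarly = true;
}

// The required order: drain the generator first, then call getReturn().
$gen = importRows(['a', null, 'b']);
$rows = [];
foreach ($gen as $row) {
    $rows[] = $row;
}
$skipped = $gen->getReturn();
```

Note that getReturn() is a method of Generator itself, so a function declared to accept iterable cannot call it without first checking what it was given.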

These are (mostly) things I was vaguely aware of from reading about Python generators, but they weren’t on my mind while writing PHP.
