This is another blog-like post like my last one, but this one is more relevant (despite being a bit historical), and it involves a problem I still think we have a chance to solve.
Once upon a time, the
afw.image.Exposure class was just a
MaskedImage with a
PropertySet header and a
Wcs. That made it pretty simple - everything about it had a nice, FITS-standard representation (we only had simple,
FITS-standard WCSs in those days) - but a bit inconvenient. If you wanted to use some other object that represented some other aspect of our knowledge about the image, like a
Calib, you had to pass it around separately. So it was a bit inevitable that we ended up attaching
Exposure, but that still left things pretty simple - it’s fundamentally just a zero-point, and we could put that in the headers, too.
When we added
Detector classes to the codebase, the practice of attaching everything we know about an
Exposure to that
Exposure was pretty well established (I think; this was a long time ago), but those presented a new problem: there wasn’t a FITS-standard way to save them in the same file. We didn’t save the Detectors at all - instead we had the Butler create a full
Camera object and attach a
Detector to every
Exposure on read. For a long time we just wrote the PSFs into separate files using Boost.Serialization, which wrote non-standard, vaguely-human-readable text files than none of us really liked (it could also do XML, which I suppose we liked even less?). That meant it was up to whoever was working with the
Exposure to load the
Psf separately and attach it.
That all changed in 2013, when we decided to write
Psf class that was implemented by lazily evaluating all of the
Psfs of all of the input images every time a PSF image was requested at a point on the coadd. That made it a much more complex and much larger data structure than any previous
Psf. That made persistence very tricky, and I didn’t want to try to solve that tricky problem with Boost.Serialization.
That was, in retrospect, probably a mistake: Boost.Serialization is actually excellent at saving complex data structures, and while there were some real problems with the code we used to interoperate with it and its output format was non-ideal, I didn’t really consider the possibility of trying to understand and fix those problems. Writing a new persistence framework from scratch sounded much more promising, and having just written
afw::table the year before, it made for very shiny hammer that made this problem seem very nail-like.
But even that wasn’t quite enough for me to decide that inventing a new persistence framework was a good idea. What really sold me on the idea was the thought of addressing three long-standing problems at the same time; in addition to implementing persistence for
CoaddPsf, we could:
- Start saving coadd provenance (what became the
CoaddInputsclass) - if you think about it, that’s essentially exactly the same information that
- Start saving
Psfs and other complex objects to the same FITS file as the Exposure. This is something @RHL did with his PSF models in SDSS, and we were all looking forward to not having to pass those separate PSF files around with the corresponding
afw::table::io::Persistable was born, a home-brew persistence framework that writes almost-arbitrary data structures (including polymorphic objects and shared pointer relationships) to a sequence of FITS binary table HDUs. And while the home-brew-newss of it was a mistake, it does have some very nice features. Writing
Exposure components to same FITS file is a big win for convenience, and the binary tables themselves are at least a little bit self-describing and language-agnostic; I have already written pure-Python, non-DM code to read some things from them, and I’m sure sufficiently-dedicated other people could do so as well (especially with some slightly better high-level docs from me). What I should have done was to adopt a mature serialization library (like Boost.Serialization), and try to write a FITS binary table archive back-end for it. I didn’t realize that until I was pretty far along, however, and by then it felt too late to back out.
Anyhow, along with with that new persistence framework came some big improvements to
Exposure: we split off managing the non-image components into a separate object,
ExposureInfo (which also had the new code for reading and writing them with
afw::table::io). We also added the
ExposureCatalog classes to make it possible to pass around
Exposure components and metadata in table form.
All of that work seemed to have left us in a much better state, and on the surface that was true. But at this point, I think we’d made a enormous mistake in structuring the code (in addition to writing our own persistence framework), and we’ve been paying for it ever since. Consider:
The new persistence framework (
afw.table.io) was part of
afw. It requires persistable objects to inherit from a base class (in C++) and use data structures defined in
afw::table. That means persistable objects need to be defined in or above
afwin the package dependency tree, and they need to be written in C++. That’s already pretty bad, but it gets worse.
Exposureclass is also part of
afw, and it holds its components as regular, strongly-typed data members. That means anything held by
Exposureneeds to be defined in or below
afwin the package dependency tree (or at least have a polymorphic base class that is - but most of these components are concrete, nonpolymorphic classes). You can probably guess where I’m going here.
For a class to be convenient (and, at some level, be non-surprising) as an
Exposurecomponent, it also needs to be persistable with
Exposure, and that means it needs to use
afw::table::io. And hence any class that represents the properties of an image must be be in afw, and written in C++. It’s no wonder that package has gotten so big.
But wait, that’s not all:
afw.table.SourceRecordplays a similar (albeit milder) role for individual astronomical objects; any data structure beyond tabular data that we want to associate with those objects needs to be known to it and persisted with it (right now that’s “just”
PeakRecord). So anything that plays that role needs to go in
afw(and be written in C++), too.
afw.table.io(a package lower bound) and
afw.table.SourceRecord(package upper bounds) are all not just part of
afw, but part of
afw.table. So any class that describes an image or an astronomical object in a persistent way actually needs to go in afw.table (or have a circular dependency with
afw.table, which is what we’ve actually done).
And here’s the kicker:
- Any time we add a new class that describes an image or an astronomical object in a persistent way we need to modify at least one of our core low-level primitives,
SourceRecord, and that includes updating their persistence code (and unless we have a very good reason not to, maintaining backwards compatibility with files we’ve already written).
We got to this point by doing what seemed convenient and trusting our own intuition and judgement as to what made for good interfaces, design, and code organization. The lesson I’ve learned since 2013 is that our judgement and especially our intuition in those areas are often terrible, especially given that most of us (in Science Pipelines, anyway) think like scientists, not software engineers.
Good design principles are often counter-intuitive, at least at first, and that’s especially true of the classic OO design principles we’ve very explicitly violated four (of five!) of here (which and how are left as an exercise to the reader). This is made more difficult by the fact that those design principles often encourage splitting software up into smaller units, and that exacerbates very real problems with packaging tooling and ecosystems that often (and legitimately!) feel much more important in the real world than adherence to abstract design principles (I think the discussion on RFC-539 is a perfect example of this).
Anyhow, I think there are a lot of ways out of this mess, and to first order I think they’re pretty straightforwardly derived from actually applying some dependency-inversion-principle in particular, but the devil is of course in the details…and the disruption. Given where we’re at now, any changes to these core classes is going to be quite disruptive. But if we ever want to get our package organization in order, or write more stuff in Python, or avoid churn at the bottom of the dependency tree, this is the hurdle we’ve got to overcome.