After some discussion with @rowen, @erykoff, @KSK, and @jbosch, I think it’s time to start sketching out a new reference catalog interface. This conversation was spawned in part by RFC-535, when we realized that the *_camFlux fields in refcats are currently being used as aliases, and in part by my annoyance at how we manage colorterm corrections to refcat fluxes.
Present state
Currently, the filter map (“use PS1 z for LSST y”) and colorterm (“take this combination of PS1 i and z to get a more correct LSST z”) corrections have to be configured and applied inside the tasks that use them (currently photoCalTask and jointcal, soon to be fgcmcal). This results in duplication of task configurations. Also, if a non-Task user loads a reference catalog, there are non-trivial steps required to get the “correct” fluxes for their desired filters, although DM-13054 somewhat improves the situation. As of DM-13054, getting the most correct available fluxes requires finding and loading a ColortermLibrary for the camera of interest, creating a Colorterm object from it, calling colorterm.getCorrectedMagnitudes(), and passing those magnitudes around with the loaded refcat.
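To make concrete what a colorterm correction computes, here is a minimal standalone sketch. It assumes the correction is a quadratic polynomial in the (primary - secondary) color, which is my understanding of the usual form. SimpleColorterm, its method, and all the numbers are hypothetical stand-ins for illustration, not the actual Colorterm or ColortermLibrary API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleColorterm:
    """Hypothetical stand-in for a colorterm; not the LSST Colorterm class."""

    primary: str    # refcat filter closest to the camera filter
    secondary: str  # refcat filter used to form the color
    c0: float = 0.0
    c1: float = 0.0
    c2: float = 0.0

    def corrected_magnitude(self, primary_mag: float, secondary_mag: float) -> float:
        # Assumed form: quadratic polynomial in the (primary - secondary) color.
        color = primary_mag - secondary_mag
        return primary_mag + self.c0 + color * (self.c1 + color * self.c2)


# Made-up magnitudes and coefficients, for illustration only.
refcat_mags = {"i": 18.50, "z": 18.20}
z_term = SimpleColorterm(primary="z", secondary="i", c0=0.01, c1=-0.05)

corrected_z = z_term.corrected_magnitude(
    refcat_mags[z_term.primary], refcat_mags[z_term.secondary]
)
print(f"corrected z magnitude: {corrected_z:.3f}")
```

Even in this toy form, the caller has to know which refcat filters feed which camera filter and then carry the corrected value alongside the catalog, which is the bookkeeping the rest of this post is trying to remove.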
Sketch of a new system
My vision is that when a user loads a reference catalog (via either loadSkyCircle() or loadPixelBox()), the resulting in-memory catalog has all relevant corrections applied to it. That way, the user can get at the appropriate fluxes for their filters of interest without performing further transformations on the reference catalog, or having to know anything about the reference catalog fluxes themselves. This would not change the on-disk reference catalog representation.
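A rough sketch of what "corrections applied on load" could look like, working directly in flux space. Everything here is hypothetical: the function names, the dict-based catalog rows, the _corrected_flux field names, and the quadratic colorterm form are assumptions for illustration, not an existing loader API. The input rows are copied rather than modified, mirroring the point that the on-disk representation stays unchanged.

```python
import math


def apply_colorterm_to_flux(primary_flux, secondary_flux, c0, c1, c2=0.0):
    """Correct a flux with an assumed quadratic colorterm, in flux space."""
    # color = m_primary - m_secondary; the flux zero point cancels in the ratio.
    color = -2.5 * math.log10(primary_flux / secondary_flux)
    delta_mag = c0 + color * (c1 + color * c2)
    return primary_flux * 10.0 ** (-0.4 * delta_mag)


def load_corrected(raw_rows, colorterms):
    """Return new rows with one corrected flux per camera filter.

    raw_rows: list of dicts such as {"i_flux": ..., "z_flux": ...}
    colorterms: {camera_filter: (primary, secondary, c0, c1, c2)}
    """
    corrected = []
    for row in raw_rows:
        new_row = dict(row)  # copy, so the input values are left untouched
        for camera_filter, (primary, secondary, c0, c1, c2) in colorterms.items():
            new_row[f"{camera_filter}_corrected_flux"] = apply_colorterm_to_flux(
                row[f"{primary}_flux"], row[f"{secondary}_flux"], c0, c1, c2
            )
        corrected.append(new_row)
    return corrected


# Made-up fluxes and coefficients, for illustration only.
raw = [{"i_flux": 100.0, "z_flux": 120.0}]
terms = {"z": ("z", "i", 0.0, 0.0, 0.0), "y": ("z", "i", 0.02, 0.1, 0.0)}
loaded = load_corrected(raw, terms)
```

After this, a user interested in the camera's y band would simply read loaded[0]["y_corrected_flux"], without needing to know that it was built from the refcat's z and i fluxes.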
This would require either that 1) the LoadReferenceObjectsTask be instantiated with the camera configuration required to correct the refcat, or 2) an external method/Task be called with the loaded refcat and the camera configuration to correct the refcat to that camera. We currently set *_camFlux aliases as part of the filterMap; we could re-purpose those (new name suggestions welcome, though) to be actual fields that contain the corrected fluxes. I don’t know how compatible option 1) is with the new gen3 reference objects task, but it is certainly the most straightforward option from a refcat user’s perspective.
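To compare the two options side by side, here is a toy sketch in which both share the same correction code and differ only in who calls it. CorrectingLoader, correct_refcat, and fake_raw_loader are hypothetical names, the catalog is a plain list of dicts, and only the filter map is applied (colorterms would slot into the same loop). The argument-free load() stands in for loadSkyCircle() / loadPixelBox().

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Catalog = List[Row]


def correct_refcat(refcat: Catalog, filter_map: Dict[str, str]) -> Catalog:
    """Option 2: a free function applied to an already-loaded refcat."""
    corrected = []
    for row in refcat:
        new_row = dict(row)
        for camera_filter, ref_filter in filter_map.items():
            # A real field holding a value, rather than an alias to another field.
            new_row[f"{camera_filter}_camFlux"] = row[f"{ref_filter}_flux"]
        corrected.append(new_row)
    return corrected


class CorrectingLoader:
    """Option 1: a loader constructed with the camera configuration."""

    def __init__(self, raw_loader: Callable[[], Catalog], filter_map: Dict[str, str]):
        self._raw_loader = raw_loader
        self._filter_map = filter_map

    def load(self) -> Catalog:
        # The user never sees the uncorrected catalog.
        return correct_refcat(self._raw_loader(), self._filter_map)


def fake_raw_loader() -> Catalog:
    """Stand-in for reading the on-disk refcat; values are made up."""
    return [{"i_flux": 100.0, "z_flux": 120.0}]


filter_map = {"z": "z", "y": "z"}  # "use PS1 z for LSST y"

via_option_1 = CorrectingLoader(fake_raw_loader, filter_map).load()
via_option_2 = correct_refcat(fake_raw_loader(), filter_map)
assert via_option_1 == via_option_2
```

The results are identical; the difference is that with option 1 the camera configuration is supplied once at construction, while with option 2 every caller has to remember to apply the correction step.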
I know that @jbosch has plans for further SED and Transmission corrections, and I am curious how those ideas line up with this. Certainly, ensuring that the loaded refcat has all relevant corrections applied would make it easier for code to immediately make use of future advanced corrections.
Performance questions
Applying all necessary colorterm and other corrections on reference catalog load does increase the compute requirements when loading a refcat. However, refcat loading is a small fraction of our compute time, and these calculations are an even smaller part of that: my recent changes to colorterms in DM-13054 sped up the colorterm calculation by ~20%, but that change was basically immeasurable in tests of the reference catalog handling part of PhotoCalTask. For ap_pipe, we can pre-load the reference catalogs, so it should not matter to the 60-second budget.
Timeline?
The DRP team has already produced a new ReferenceObjectLoader as part of their gen3 work. It would be good to nail down a new API before that code goes into full production, even if not all of the above features are in place yet. We could then add transmission and SED corrections to it as desired in the future.