Flags in Object vs Source Catalog Schema

Onoddil · September 5, 2023, 4:19pm

A question about DP0, which I am using alongside HSC data as an LSST-like precursor (with DP0, being actually LSST-like, as an additional reference), but with a focus on future LSST data previews and eventual data releases. Ultimately my question boils down to this: do the DP0 schemas have fewer columns than future releases will, or are the available columns something to begin building expected workflows from?

More specifically, the Source schema provides numerous flags for various fail-states within the pipeline processing (e.g. centroid_flag_almostNoSecondDerivative or psfFlux_flag_edge) that are not available in the Object catalog; there you only get access to the “General Failure Flags”, e.g. g_centroid_flag.
It would be beneficial to have access to the more nuanced flags in the coadded image outputs for quality cuts; were they left out by necessity (either in DP0 or in all future DRs)?
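
To make the comparison concrete, here is a minimal sketch of how one might list the Source flag columns that have no per-band counterpart in Object. The column lists are short illustrative stand-ins, not the real DP0 schema, and both the helper name `missing_sub_flags` and the assumption that Object columns are simply the Source name with a band prefix (e.g. `g_`) are mine.

```python
def missing_sub_flags(source_columns, object_columns, bands=("g",)):
    """Return Source flag columns with no band-prefixed match in Object.

    Assumes (hypothetically) that an Object column is named
    '<band>_<source column name>'.
    """
    object_set = set(object_columns)
    missing = []
    for col in source_columns:
        if "_flag" not in col:
            continue  # only flag columns are of interest here
        if not any(f"{band}_{col}" in object_set for band in bands):
            missing.append(col)
    return missing


# Illustrative column names only; a real check would read them from the schema.
source_columns = [
    "centroid_flag",
    "centroid_flag_almostNoSecondDerivative",
    "psfFlux_flag",
    "psfFlux_flag_edge",
    "psfFlux",
]
object_columns = ["g_centroid_flag", "g_psfFlux_flag", "g_psfFlux"]

missing = missing_sub_flags(source_columns, object_columns)
print(missing)
```

With these stand-in lists, only the two sub-flags are reported as absent from Object, which is the pattern described above.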

If it is not possible to include extra columns, where can I find more information on which sub-flags are combined into the generic failure flag? I have been looking at HSC as precursor data as well and have not found much to clarify which errors may have triggered the “generic” flag. Further, even within the Source catalog the DP0 schema includes fewer flags than HSC PDR3; are any of these extra HSC-only flags still rolled into the generic states (e.g. centroid_flag, psfflux_flag, cmodel_flag) and just not included in the output DP0 schema?

Finally, can I double-check whether the centroid_flag columns from Source are equivalent to the *_centroid_flag flags for each filter in Object in terms of sub-flag combinations? I.e., are they both defined as something like flag_A OR flag_B OR flag_C?

Thanks, Tom

ctslater · September 8, 2023, 3:44pm

I think you should interpret the flags available in DP0.2 as a first draft of what the set of flag columns might look like, but there are certainly a lot of improvements and refinements (and documentation!) that need to happen before the LSST data releases. So in general none of the flags were omitted out of strict necessity, and we can certainly reevaluate which flags are the most important ones to include.

It’s hard to give specifics without digging into code (a known deficiency!), but usually the general failure flags refer to the algorithm itself failing; it’s typically not a boolean combination of other flags. And on the last point, the centroid algorithm that runs on Sources and on Objects is the same, so the mechanics of the flags are the same.
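
One way to check this empirically, rather than by reading the code, is to tabulate the general flag against the OR of its sub-flags on a sample of Source rows. The sketch below uses made-up rows in plain Python. `compare_generic_to_or` is a hypothetical helper, and `centroid_flag_placeholder` is an invented column name standing in for a second sub-flag.

```python
def compare_generic_to_or(rows, generic, sub_flags):
    """Count rows by (generic flag value, OR of sub-flag values)."""
    counts = {(g, s): 0 for g in (False, True) for s in (False, True)}
    for row in rows:
        any_sub = any(row[name] for name in sub_flags)
        counts[(bool(row[generic]), any_sub)] += 1
    return counts


sub_flags = [
    "centroid_flag_almostNoSecondDerivative",
    "centroid_flag_placeholder",  # invented name, for illustration only
]

# Made-up rows; a real test would use rows queried from the Source table.
rows = [
    {"centroid_flag": False, sub_flags[0]: False, sub_flags[1]: False},
    {"centroid_flag": True, sub_flags[0]: True, sub_flags[1]: False},
    {"centroid_flag": True, sub_flags[0]: False, sub_flags[1]: True},
    # General flag set with no sub-flag set: the algorithm itself failed.
    {"centroid_flag": True, sub_flags[0]: False, sub_flags[1]: False},
]

counts = compare_generic_to_or(rows, "centroid_flag", sub_flags)
for (generic_set, any_sub_set), n in sorted(counts.items()):
    print(f"generic={generic_set!s:5} any_sub={any_sub_set!s:5} rows={n}")
```

If the general flag were a pure OR of the sub-flags, both off-diagonal cells (general set with no sub-flag, or a sub-flag set with general clear) would be empty. Any rows landing there on real data would indicate the general flag carries information of its own.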