Hi, I’m working on real/bogus classification, and I’m testing different quantities in the LSST pipeline source catalog. I find that the performance of a quantity called “base_SdssShape_instFlux_yy_Cov” seems to be quite different from that of “base_SdssShape_instFlux_xx_Cov” and “base_SdssShape_instFlux_xy_Cov”. The quantity appears in both the src catalog of a direct image and the diaSrc catalog of a difference image.
The information in the FITS table header is like this.
TTYPE54 = 'base_SdssShape_instFlux_yy_Cov' / uncertainty covariance between base
TFORM54 = '1E ' / format of field
TDOC54 = 'uncertainty covariance between base_SdssShape_instFlux and base_Sds&'
CONTINUE 'sShape_yy'
TUNIT54 = 'count*pixel^2'
TCCLS54 = 'Scalar ' / Field template used by lsst.afw.table
I notice that bright transients usually have large values of “base_SdssShape_instFlux_xx_Cov” and “base_SdssShape_instFlux_xy_Cov” but small values of “base_SdssShape_instFlux_yy_Cov”. However, in theory the covariances should not depend strongly on direction.
How does the pipeline compute the SdssShape covariances between the flux and the second moments? I was not able to find much information about that.
Any suggestions will be appreciated.
This is the C++ code for the algorithm, most of which goes back to ~2014.
Note that we are moving away from SdssShape to HsmShape. I’ve asked about the documentation for that (it was just rewritten in Python). That package is here:
Thinking about this some more: what do you mean by “the performance of”, in your question? Because I could believe that the xx and yy shape covariances do encode information about which sources are bogus. y is the column direction, so bad columns and bleed trails would probably have quite different yy_Cov than less pathological sources.
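A rough way to check this idea is to compare the yy/xx covariance ratio for sources with and without a pixel flag set. The sketch below is only an illustration: it assumes the catalog has already been loaded into a list of dict-like rows, and the helper name `median_yy_over_xx` and the choice of flag column are assumptions, not pipeline API.

```python
import math
import statistics

XX = "base_SdssShape_instFlux_xx_Cov"
YY = "base_SdssShape_instFlux_yy_Cov"


def median_yy_over_xx(rows, flag_name):
    """Median |yy_Cov / xx_Cov|, split by whether `flag_name` is set.

    `rows` is any iterable of dict-like records keyed by column name
    (an assumption about how the catalog was loaded).
    Returns {True: median for flagged rows, False: median for the rest};
    a group with no usable rows maps to None.
    """
    groups = {True: [], False: []}
    for row in rows:
        xx, yy = row[XX], row[YY]
        # Skip rows where the ratio is undefined.
        if math.isnan(xx) or math.isnan(yy) or xx == 0:
            continue
        groups[bool(row[flag_name])].append(abs(yy / xx))
    return {
        flagged: (statistics.median(values) if values else None)
        for flagged, values in groups.items()
    }
```

Calling it with a saturation-type flag column (for example `base_PixelFlags_flag_saturated`, if your catalog has it) would show whether flagged sources have a systematically different yy/xx ratio than unflagged ones.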
Thank you for pointing me to those webpages. The SdssShape.cc page shows those flux - moments covariances are derived from a 4D Fisher matrix. In my question, “behavior” could be a better word than “performance”. What I found is that bright sources usually had large Flux_xx_Cov and Flux_xy_Cov values, but small/random Flux_yy_Cov values, and I’m trying to understand why. Thanks for pointing out the issues of bad columns and bleed trails – I think that’s probably the reason.
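To make the Fisher-matrix picture concrete, here is a minimal sketch of how flux-moment covariances can fall out of a 4D Fisher matrix. It is not the SdssShape.cc code: it assumes an elliptical Gaussian model with parameters (flux, xx, yy, xy), uniform pixel noise, and finite-difference derivatives, and every function name is made up for illustration.

```python
import math


def gaussian_model(x, y, flux, ixx, iyy, ixy):
    """Elliptical Gaussian with total flux `flux` and second moments ixx, iyy, ixy."""
    det = ixx * iyy - ixy * ixy
    q = (iyy * x * x - 2.0 * ixy * x * y + ixx * y * y) / det
    return flux / (2.0 * math.pi * math.sqrt(det)) * math.exp(-0.5 * q)


def fisher_matrix(params, half_size=15, noise_var=1.0, rel_step=1e-4):
    """Fisher matrix sum_pix (dM/dp_i)(dM/dp_j) / noise_var on a square stamp."""
    n = len(params)
    steps = [rel_step * max(abs(p), 1.0) for p in params]
    fisher = [[0.0] * n for _ in range(n)]
    for iy in range(-half_size, half_size + 1):
        for ix in range(-half_size, half_size + 1):
            grad = []
            for k in range(n):
                up = list(params)
                down = list(params)
                up[k] += steps[k]
                down[k] -= steps[k]
                # Central finite difference of the model w.r.t. parameter k.
                diff = gaussian_model(ix, iy, *up) - gaussian_model(ix, iy, *down)
                grad.append(diff / (2.0 * steps[k]))
            for i in range(n):
                for j in range(n):
                    fisher[i][j] += grad[i] * grad[j] / noise_var
    return fisher


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fine for a 4x4 matrix)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def flux_moment_covariances(flux, ixx, iyy, ixy):
    """Covariances between flux and each second moment, from the inverse Fisher matrix."""
    cov = invert(fisher_matrix([flux, ixx, iyy, ixy]))
    return {"flux_xx": cov[0][1], "flux_yy": cov[0][2], "flux_xy": cov[0][3]}
```

In this toy model a round source (ixx equal to iyy, ixy zero) gives equal flux_xx and flux_yy terms by symmetry, while a source stretched along y gives a larger flux_yy term. So a strong xx/yy asymmetry in the catalog would point at the measured shape or the pixels around the source rather than at the covariance formula itself.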