Requirements for overhauled calibration task?

jbosch · January 22, 2016, 9:11pm

Thanks for the clarification. I had misunderstood this statement in your earlier post:

Now that (I think) I understand you better, I do still think it would be better to make CalibrateTask solely responsible for writing calexp, so calexp is not written when CalibrateTask is not run, and for ProcessCcdTask to warn in its config validation when CalibrateTask is not being run and icExp is not being written either. Does that make sense? I’m worried about us writing something that claims to be a calexp, but is actually an icExp, and actually isn’t useful for any of the downstream processing that needs a true calexp.
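A minimal sketch of that validation check, using a plain dataclass as a stand-in for the real config classes. The class name and the flag names doCalibrate and doWriteIcExp are assumptions made for illustration only:

```python
import warnings
from dataclasses import dataclass


@dataclass
class ProcessCcdConfigSketch:
    """Stand-in for a ProcessCcdTask config; all field names are assumptions."""
    doCalibrate: bool = True    # run CalibrateTask, the only writer of calexp
    doWriteIcExp: bool = False  # hypothetical flag: persist icExp

    def validate(self):
        # With CalibrateTask skipped no calexp is written, so warn when
        # icExp is not being persisted either.
        if not self.doCalibrate and not self.doWriteIcExp:
            warnings.warn(
                "CalibrateTask is disabled and icExp is not being written; "
                "no exposure will be persisted"
            )
```

Calling validate() with both flags False emits one warning; any other combination stays silent.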

None of the options here jump out to me as obviously better than the others.

If we put the deep DetectAndMeasureTask run in CharacterizeImageTask, I’m worried that CharacterizeImageTask gets too complex in all the schemas and DetectAndMeasureTasks it needs to juggle, especially because that last DetectAndMeasureTask run is really only needed to feed CalibrateTask. But that may all be outweighed by the convenience it provides in allowing users who don’t want to run CalibrateTask to easily get a deep measurement catalog.

So if we think about how things fit together in a full pipeline, the deep DetectAndMeasureTask run clearly goes in CalibrateTask, since we’re doing it simply to feed those calibration routines (and, eventually, a bigger calibration routine like the one in meas_simastrom). I think I have a slight preference for this approach, but I’m willing to be overruled by @nidever if he’s concerned about not having an easy way to get to deep detection and measurement without going all the way to astrometric matching and photometric calibration.

rowen · January 22, 2016, 9:23pm

I like this argument. Also, having one schema per task is much nicer.

As to running calibrate or not, and writing calexp, I know of these use cases:

The image has excellent astrometry; don’t do astrometric calibration
We don’t have a reference catalog so we can’t do astrometric or photometric calibration

These suggest the following solution:

CalibrateTask contains two flags: doAstrometricCalibration and doPhotometricCalibration
CalibrateTask always writes calexp even if those two calibration steps have not been performed

I am open to other ideas.

jbosch · January 22, 2016, 9:33pm

The only case I’m aware of where the astrometry is already good is SDSS, for which the photometric calibration is also already good. If that is indeed the only such case, I’d be tempted to continue to have an SDSS-specific CalibrateTask to load and attach those to the exposure (it might now be more appropriate to do it here rather than earlier). Since CalibrateTask doesn’t have to do nearly as much as it used to, I think the downsides of having a camera-specific override would then be considerably lessened.

I had assumed that if we don’t have a reference catalog, we would just not run CalibrateTask itself at all (perhaps by running IsrTask and CharacterizeImageTask separately, perhaps by having a doCalibrate flag in ProcessCcdTask), but I’m not opposed to your proposal with two flags within CalibrateTask, and it does perhaps provide a better solution for users who do want deep measurements but don’t have a reference catalog.

Do all combinations of those two flags (e.g. doAstrometricCalibration=False, doPhotometricCalibration=True) make sense?

rowen · January 22, 2016, 9:36pm

photoCalTask is not yet independent enough to do its own matching, though in the long run it surely will, since we want to be able to use separate astrometric and photometric catalogs. We can still have two flags if we want, but they won’t be fully independent yet.

Note that the current CalibrateTask has 4 flags: doAstrometry, doPhotoCal, requireAstrometry, requirePhotoCal. The latter two cause the task to raise an exception if the step fails. This seems rather complex, and I don’t see any code or config overrides that set the require flags to true. Really these are tri-state: don’t do it, do it on a best-effort basis, or do it and give up if it fails.
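To make the tri-state reading concrete, here is a small sketch that collapses a (do, require) pair into one of three modes. StepMode and step_mode are hypothetical names, and treating do=False with require=True as "skip" is an assumption of the sketch, not a statement about what the current task does:

```python
from enum import Enum


class StepMode(Enum):
    SKIP = "skip"                # don't do it
    BEST_EFFORT = "best_effort"  # do it; warn and continue on failure
    REQUIRED = "required"        # do it; give up if it fails


def step_mode(do, require):
    """Collapse a (do, require) flag pair into one of three modes."""
    if not do:
        return StepMode.SKIP
    return StepMode.REQUIRED if require else StepMode.BEST_EFFORT
```

Four flag combinations map onto three modes, which is why the two booleans per step feel more complex than they need to be.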

jbosch · January 22, 2016, 9:40pm

How about we do matching if either flag is set, but only solve for and attach a new WCS if doAstrometricCalibration=True?
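Roughly, in sketch form (plan_calibration and the step names are made up for illustration; the real task would call its subtasks rather than return a dict):

```python
def plan_calibration(do_astrometric, do_photometric):
    """Return which steps would run for a given pair of flags."""
    return {
        # Matching feeds both steps, so run it if either flag is set.
        "match": do_astrometric or do_photometric,
        # Only solve for and attach a new WCS when astrometry is requested.
        "fit_wcs": do_astrometric,
        # Photometric calibration uses the matches.
        "photo_cal": do_photometric,
    }
```

With do_astrometric=False and do_photometric=True, matching still runs but no new WCS is fit, so that combination remains meaningful.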

rowen · January 22, 2016, 9:55pm

I’d be happy to do that if desired. It clutters the task a bit, but not horribly so. The clutter could go away when we support separate astrometry and photometry catalogs. Do you have an opinion about the “required” flags?

jbosch · January 22, 2016, 10:00pm

I’m actually surprised to hear that the default behavior is to continue (with a warning, I’d guess) if any of the calibration steps fail. In a steady-state, full-scale processing mode, I think we want to fail by default and catch that at a higher level, because it doesn’t make sense to produce a calexp when you haven’t actually managed to determine its Wcs or zeropoint. But I can also imagine that being frustrating for everyone trying to get the stack working on new cameras.

I think that means we should probably leave those “require” configuration options, and have the debate about what the default should be some other day.

rowen · January 22, 2016, 10:20pm

I propose to change the meaning of the “require” flags slightly. They will default to True and will be ignored if the corresponding “do” flag is False. That makes it trivial to enable or disable astrometry and photometry, but if they are enabled then they must run, by default.
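A rough sketch of those semantics, assuming a generic helper (run_step is hypothetical, not part of the stack) and assuming that a failed step surfaces as a RuntimeError:

```python
import logging

log = logging.getLogger("calibrate_sketch")


def run_step(name, func, do=True, require=True):
    """Run one calibration step under the proposed do/require semantics."""
    if not do:
        return None  # require is ignored when the step is disabled
    try:
        return func()
    except RuntimeError as exc:
        if require:
            raise  # default: a failed step fails the task
        log.warning("Unable to perform %s (%s): attempting to proceed", name, exc)
        return None
```

Disabling a step is then a single flag (do=False), and best-effort behavior has to be requested explicitly with require=False.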

nidever · January 22, 2016, 11:02pm

Interesting. Why are you designing it in that way? And will there then be a second stage that does PSF determination with a catalog of “PSF stars”?

Thanks for all the other responses. Those answered my questions.

jbosch · January 23, 2016, 12:15am

Yes, there will be a second stage that does PSF determination, and the reason we’re doing things this way is because I think we’ll need to generate that catalog of PSF stars at least to some degree from our own processing. And I think that’ll need to happen after or as part of our big internal relative astrometry fit (i.e. meas_simastrom), because at that point we’ll be able to:

get more secure star/galaxy classifications by combining information from multiple epochs;
get some estimate of the SED from the colors (needed for wavelength-dependent PSF modeling);
improve our estimates of the true positions of the stars, including proper motion.

So once we’ve done the internal relative astrometry fit, we’ll be able to go back and make better PSF models. I’m hoping that we don’t actually have to do any further processing (e.g. coadds) to build the full PSF star catalog, but I think we’ll have to go at least through the relative astrometry fit.

nidever · January 23, 2016, 12:35am

Ah, okay. I was thinking we’d use Gaia for all of this since we get good star/galaxy separation from morphology (and, of course, great astrometry) for stars that should be bright and faint enough to serve as our “PSF stars”. Is there a reason that might not be the case?

Either way it’ll be good to have the capability to make our own list of PSF stars.

jbosch · January 23, 2016, 12:57am

We’ll certainly use Gaia stars, but I think we’ll probably want to go fainter.

rowen · January 23, 2016, 7:19pm

I updated CalibrateTask and DetectAndMeasureTask to handle catalog-based star selectors. It was a small change, though it requires a small RFC for adding a usesMatches method to StarSelector.

afausti · January 26, 2016, 6:05pm

@jbosch, @rowen I noticed this behaviour changed recently in the current ProcessCcdDecam.

In w_2015_40, when the astrometry.matcher failed, I got a RuntimeError:

processCcdDecam.calibrate.astrometry.refObjLoader: Loaded 1420 reference objects
processCcdDecam.calibrate.astrometry.matcher: filterStars purged 0 reference stars, leaving 1420 stars
processCcdDecam.calibrate.astrometry.matcher: Purged 5 unusable sources, leaving 649 usable sources
processCcdDecam.calibrate.astrometry.matcher: Found 0 usable matches, of which 0 had good sources
processCcdDecam FATAL: Failed on dataId={'visit': 205484, 'ccdnum': 1}: Unable to match sources

File "/home/afausti/lsst/Linux64/meas_astrom/11.0-1-g60db491+6/python/lsst/meas/astrom/matchOptimisticB.py", line 310, in matchObjectsToSources

raise RuntimeError("Unable to match sources")

RuntimeError: Unable to match sources

Now, using the latest tag w_2016_03, it continues with a WARNING:

processCcdDecam.calibrate.astrometry.refObjLoader: Loaded 1420 reference objects
processCcdDecam.calibrate.astrometry.matcher: filterStars purged 0 reference stars, leaving 1420 stars
processCcdDecam.calibrate.astrometry.matcher: Purged 17 unusable sources, leaving 637 usable sources
processCcdDecam.calibrate WARNING: Unable to perform astrometry (Unable to match sources): attempting to proceed
processCcdDecam.calibrate WARNING: Failed to determine photometric zero-point: No matches available
processCcdDecam.detection: Detected 2645 positive sources to 5 sigma.
processCcdDecam.detection: Resubtracting the background after object detection
processCcdDecam.measurement: Measuring 2645 sources (2645 parents, 0 children)

For my tests it is more useful if it raises a RuntimeError and exits.

Is there a way to control this behaviour? What are the “require” configuration options?

afausti · January 26, 2016, 6:49pm

Hi,
@nidever helped me identify the configuration options. They are:

config.calibrate.requirePhotoCal=True
config.calibrate.requireAstrometry=True

which I set to True. Thanks!

rowen · January 27, 2016, 6:41pm

I am now trying to identify all measurement algorithms needed. CharacterizeImageTask should make all measurements required to determine a PSF. What measurements are those? Beyond centroid, this seems to depend on which star selector and PSF determiner are used by MeasurePsfTask.

Star selector options include:

CatalogStarSelector: hard-coded to use PSF flux
ObjectSizeStarSelector: configurable; defaults to base_GaussianFlux_flux
PsfStarSelector: configurable, defaults to base_PsfFlux
SecondMomentStarSelector (the default, despite using PSF flux): hard-coded to use PSF flux
(SizeMagnitudeStarSelector: I’m not sure, but we doubt it works in any case)

PSF determiner options include:

PcaPsfDeterminer (the default): no additional measurements used
PsfexPsfDeterminer: uses prefs.getPhotfluxRkey(), but I have not found any preference files, so I’m not sure what values this usually takes

In any case, it looks to me as if CharacterizeImageTask must measure base_PsfFlux (despite the incongruity) and base_GaussianFlux_flux (in addition to a centroid, of course). Does anything else come to mind?

Another consideration is what users expect to find in the icSrc data product. We can add more measurement algorithms if there is a need for them.

jbosch · January 27, 2016, 6:54pm

I think we want to run at least the following (there’s nothing else I can think of now, but that doesn’t mean this list is complete):

SdssCentroid: requires a Psf attached to the exposure, but we’ll have an initial guess one at least.
SdssShape: I expect this to be the primary input for most morphological star selectors.
PixelFlags: always need to run this.
GaussianFlux: we probably want something like a galaxy model flux, but we don’t want to run CModel until we have a pretty good Psf, as it’s slow. So someday, when we can configure later stages differently and actually use the Psf from the previous iteration, we’ll want to switch from GaussianFlux to CModel.
PsfFlux: like CModel, will only do a good job once we have a good Psf, but unlike CModel is fast enough to run regardless. And it seems to be bizarrely required by one of our StarSelectors (maybe file an issue on that?).
CircularApertureFlux: I can certainly imagine star selectors wanting a sequence of aperture fluxes, so even if the current set doesn’t need them, I think these are worth running (and they’re fast enough to not be a performance concern).

Also, the default StarSelector should be switched to ObjectSizeStarSelector, which is what we use on the HSC side. It’s received all the recent updates, as we’ve not really used SecondMomentStarSelector since it was introduced.

rowen · January 27, 2016, 9:29pm

@jbosch thank you very much. For the record, here are the default algorithms run by SingleFrameMeasurementTask:

base_PixelFlags
base_SdssCentroid
base_GaussianCentroid
base_NaiveCentroid
base_SdssShape
base_GaussianFlux
base_PsfFlux
base_CircularApertureFlux
base_ClassificationExtendedness
base_SkyCoord
base_Variance

Comparing this to your suggested list I see the following that are not on your list:

base_GaussianCentroid
base_NaiveCentroid
base_ClassificationExtendedness
base_SkyCoord
base_Variance
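For anyone who wants to reproduce the comparison, a short sketch in plain Python. The plugin names are taken from the two lists in this thread, with the base_ prefix added to the suggested ones; nothing here touches the stack itself:

```python
# Default algorithms run by SingleFrameMeasurementTask, as listed above.
default_plugins = [
    "base_PixelFlags",
    "base_SdssCentroid",
    "base_GaussianCentroid",
    "base_NaiveCentroid",
    "base_SdssShape",
    "base_GaussianFlux",
    "base_PsfFlux",
    "base_CircularApertureFlux",
    "base_ClassificationExtendedness",
    "base_SkyCoord",
    "base_Variance",
]

# The suggested list for CharacterizeImageTask.
suggested_plugins = {
    "base_SdssCentroid",
    "base_SdssShape",
    "base_PixelFlags",
    "base_GaussianFlux",
    "base_PsfFlux",
    "base_CircularApertureFlux",
}

# Defaults that are not in the suggested list, keeping the default order.
extra_plugins = [name for name in default_plugins if name not in suggested_plugins]
print(extra_plugins)
```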

Of these, I would think base_ClassificationExtendedness might be of interest to star selectors (though I don’t think it is presently used). I’m not sure about the last two, but they look innocuous, as do the two extra measures of centroid. Unless any of these is particularly slow I wonder if I should not simply use the default set?

jbosch · January 27, 2016, 10:02pm

ClassificationExtendedness is only as good as your Psf model, and even then it’s a fairly trivial combination of slot_ModelFlux and slot_PsfFlux that I’d guess any StarSelector that used it would want to re-do more carefully.

SkyCoord cannot be run unless we have a Wcs (and all it does is set the Coord fields from the slot centroid). I don’t think we need to run Variance at this stage, or the two other centroiders.

rowen · January 27, 2016, 11:22pm

@jbosch thank you. Do you happen to know the syntax for changing the values for a RegistryField that accepts multiple values? I’ve tried the following in CharacterizeImageConfig.setDefaults(self) and neither works:

self.detectAndMeasure.measurement.plugins[:] = ["base_PixelFlags",...]

self.detectAndMeasure.measurement.plugins = ["base_PixelFlags",...]