Coaddition Tasks Redesign RFD Request

yusra · July 21, 2017, 4:56pm

Proposed date: Tuesday August 1 12:30 pm to 2:00 pm Pacific.
Connection: https://bluejeans.com/426716450
Suggested audience: DM Science Pipelines

Requirements for Coaddition Tasks Redesign

I’m gathering a list of stakeholders and requirements for a coaddition redesign. A number of recent use cases demonstrate that the current hardcoded assumption, that we’ll run assembleCoadd once with one config, is insufficient, and the time to update it is now. You may have noticed that the coadd data products for each configuration of coadd we plan to generate (e.g. deepCoadd, goodSeeingCoadd, deepCoaddPsfMatched, deepCoaddLikelihood, etc.) have been proliferating in our obs packages. The main feature of this redesign is to move these coadd descriptors (which signify the configs with which assembleCoadd was run) from the data product name into the data ID.

Stakeholders and Use Cases

The following tasks are awkward given the current framework and could benefit from a redesign. I’d like to make sure that the design meets the needs of these use cases:

Make direct/psfMatched, deep/bestSeeing etc… coadds in one DRP (without --clobber-config!). In LDM-151 these are called DeepCoadd, BestSeeingCoadd and ConstantPsfCoadd. We may want different modelPsfs for different purposes (e.g. deblending, artifact rejection) too.
Implementing artifact rejection in deepCoadds. This requires a PSF-Matched Median coadd as an intermediate data product.
Generating the series of ShortPeriodCoadds. @jbosch indicated that @boutigny was working toward creating these now.
TemplateCoadds for Alert Production binned by parallactic angle or wavelength to account for DCR. @isullivan
Your use case here.

Sketch of Coaddition Interface Changes

Because it is easier to find gaps in a prototype design than to recall your requirements from scratch, here’s a strawman schematic of what I’m envisioning:

1. Designate coadd types in data IDs:

Proposed Keys (analogous to “patch” or “filter”):

"warpType": ('direct', 'psfMatched', 'likelihood')
"seeingCutoff": ('deep', 'goodSeeing')

# For ShortPeriodCoadds:
"baseline": ("y1", "y2", "y3", ... , "y10")

# For TemplateCoadds:
"wavelength": (?)
"PA": (?)

In obs_base, the coadd data products could collapse into one for every parent task (see below):

coadd:
    persistable: ExposureF
    storage: FitsStorage
    python: lsst.afw.image.ExposureF
    template: coadd/%(filter)s/%(tract)d/%(patch)s/%(seeingCutoff)sCoadd%(warpType)s.fits
    level: Skytile

to be retrieved like:

butler.get("coadd", filter='g', tract=0, patch="0,0", warpType="direct", seeingCutoff="deep")
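As a rough illustration of how the extra keys would select a file, the template above can be expanded with ordinary Python %-formatting. This is a standalone sketch, not Butler or mapper code: `expand_template` is a hypothetical helper, and the real mapper does considerably more (key validation, registry lookups).

```python
# Standalone sketch: plain %-formatting, NOT the actual Butler/mapper code.
# The template string is copied from the proposed obs_base entry above.
COADD_TEMPLATE = (
    "coadd/%(filter)s/%(tract)d/%(patch)s/"
    "%(seeingCutoff)sCoadd%(warpType)s.fits"
)


def expand_template(template, **data_id):
    """Fill a mapper-style path template from data ID keys (hypothetical helper)."""
    return template % data_id


path = expand_template(
    COADD_TEMPLATE,
    filter="g", tract=0, patch="0,0",
    warpType="direct", seeingCutoff="deep",
)
print(path)  # coadd/g/0/0,0/deepCoadddirect.fits
```

With the template exactly as written, the file name runs the two keys together ("deepCoadddirect"), which may be worth tidying in the final naming scheme.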

2. Add parent coaddition task

There is still a problem of how to generate the full suite of coadds without using --clobber-config.
Currently, a single DRP only allows you to run assembleCoadd.py once with one config. To make multiple coadd types with different configs, they must be output to different repos. In general, Pipetasks take an input data ID and a config to produce an output data product (with data IDs constructed from the input --id). The input and output data products are implicit in the Task. The input (and sometimes output, e.g. makeCoaddTempExp) data ID is specified by the user.

Proposal: Add a parent coaddition task “CoadditionTask” (to be replaced by a supertask I’m sure) that takes the dataIds you want to produce. It’ll make the appropriate calls to the subtask assembleCoadd. The call signature would look like:

coadditionTask.py input/repo \
--id filter=g tract=0 patch=0,0 warpType=psfMatched^direct seeingCutoff=deep

To make this backwards compatible, warpType=direct and seeingCutoff=deep can become implied defaults so that users in the habit of leaving those off still can.
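One way to picture what the parent task would do with such an `--id` argument is sketched below: fill in the implied defaults, split each `^`-separated value, and produce one data ID per combination. This is only an assumption about the behaviour; `expand_data_ids` and `DEFAULTS` are hypothetical names, not part of the existing command-line parser, and values are left as strings for simplicity.

```python
import itertools

# Assumed implied defaults, per the backwards-compatibility suggestion above.
DEFAULTS = {"warpType": "direct", "seeingCutoff": "deep"}


def expand_data_ids(id_args, defaults=DEFAULTS):
    """Expand 'key=v1^v2' strings into one data ID dict per combination.

    Hypothetical helper: keys missing from id_args fall back to `defaults`.
    """
    choices = {key: [value] for key, value in defaults.items()}
    for arg in id_args:
        key, _, values = arg.partition("=")
        choices[key] = values.split("^")
    keys = list(choices)
    return [
        dict(zip(keys, combo))
        for combo in itertools.product(*(choices[k] for k in keys))
    ]


data_ids = expand_data_ids(
    ["filter=g", "tract=0", "patch=0,0", "warpType=psfMatched^direct"]
)
for data_id in data_ids:
    # The parent task would call assembleCoadd once per data ID here.
    print(data_id)
```

Here seeingCutoff was left off the command line, so both resulting data IDs pick up the default "deep".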

This will not preclude the existence of other parent coaddition tasks, such as a “makeTemplateCoadd.py”

Questions for Discussion

I’ve left out a number of details to be discussed. For example, where and how are these Coadd Keys defined? How does a developer add a new Coadd Key? How flexible should the scheme be? How does CoadditionTask convert these keys into configurations and calls to assembleCoadd?

TL;DR The coaddition task interface is undergoing a redesign. Please comment if you’d like to attend the RFD or have requirements, use cases or ideas you’d like to share.

hsinfang · July 21, 2017, 5:45pm

@yusra I won’t be able to attend on Aug 1 but I will be interested to understand the conclusions of this RFD. In particular I would like to understand what this means in the production workflow (before they become supertasks). Your proposals in coadd keys and the parent coaddition task sound fine to me. Do I understand correctly that (1) this CoadditionTask will be a CmdLineTask in the short term (2) each warpType + seeingCutoff combination is done independently underneath the parent task?

yusra · July 21, 2017, 10:43pm

@hsinfang I’ll post a summary (and a formal RFC) after the discussion.

And you understood the intent correctly: this parent coaddition task will be responsible for looping over the new coadd dataIds IF we want changes to assembleCoadd to be backwards compatible.

yusra · August 1, 2017, 7:10pm

Coordinates:
https://bluejeans.com/426716450

Connecting directly from a room system?

Dial: 199.48.152.152 or bjn.vc
Enter Meeting ID: 426716450

Just want to dial in on your phone?

Direct-dial with my iPhone: +1.408.740.7256,,#426716450# or
+1.408.740.7256 (US)
+1.888.240.2560 (US Toll Free)
+1.408.317.9253 (Alternate number)
(all numbers: http://bluejeans.com/numbers)
Enter Meeting ID: 426716450
Press #

yusra · August 29, 2017, 5:54pm

Thank you @jbosch, @ktl, @rowen, @isullivan for the valuable input.

Additional requirements were collected in response to the ideas presented in https://github.com/yalsayyad/dm_notebooks/blob/master/coaddition/CoadditionRefactor.ipynb

To summarize the additional concerns, requirements, and ideas raised by the participants:

A preference emerged for one key rather than many keys. I propose we use the existing (but unused) “coaddName” as the key. This change will be backwards compatible with existing repositories if we give the default the name “deep”.
For a given rerun, we want to be able to add new mappings between coadd names and configs. (i.e. add new “coaddNames”)
In addition to mapping coaddNames to configs, we want to be able to map the coaddName to a coadd class, so that coadds can be constructed using a variety of algorithms.
It would be nice to be able to query this mapping between keys and configs.
Separate processes of registering new units of data vs. creating new data outputs.
Valid uses of data ID keys included grouping and now defining a config
Requiring people to know what the available coaddNames are is scary. Need a nice way to look them up.
Ideas for advanced queries: good to be able to query for all coaddNames that include a particular epoch. This is hard to do before the coadd is made. This functionality is in SelectImagesTask, which is separate from configs.
Where should this mapping live?
The same coaddNames will map to different configs in different repos. We need to make everyone aware of that, the same way that everyone is aware that the configs can be found in config/.py files and that different repos have different skyMaps.
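To make the requirements above a little more concrete, here is a minimal sketch of a per-repository mapping from coaddName to a coadd class and a config, with a way to list the registered names. Everything in it is hypothetical (`CoaddRegistry`, `MeanCoadd`, `MedianCoadd`, and the config dictionaries are placeholders, not stack classes); it is meant only as something to poke holes in before the prototype.

```python
# Hypothetical per-repository registry; none of these names exist in the stack.
class CoaddRegistry:
    """Map a coaddName to a (coadd class, config) pair."""

    def __init__(self):
        self._entries = {}

    def register(self, coadd_name, coadd_class, config):
        """Register a new unit of data; this does not create any outputs."""
        if coadd_name in self._entries:
            raise KeyError(f"coaddName {coadd_name!r} is already registered")
        self._entries[coadd_name] = (coadd_class, dict(config))

    def names(self):
        """Let users look up the available coaddNames."""
        return sorted(self._entries)

    def lookup(self, coadd_name="deep"):
        """Return (coadd class, config); 'deep' is the backwards-compatible default."""
        return self._entries[coadd_name]


class MeanCoadd:
    """Placeholder for one coaddition algorithm."""


class MedianCoadd:
    """Placeholder for another coaddition algorithm."""


registry = CoaddRegistry()
registry.register("deep", MeanCoadd, {"warpType": "direct"})
registry.register("psfMatchedMedian", MedianCoadd, {"warpType": "psfMatched"})

print(registry.names())
coadd_class, config = registry.lookup()  # falls back to "deep"
```

Keeping registration separate from lookup mirrors the point about separating the registering of new units of data from the creation of new outputs; where such a registry would actually live is still the open question.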

Next step is a prototype. Look out for a future RFC.