How will users access Year One AP Templates

mwv · June 1, 2020, 2:39pm

How will users access the templates used in Year-One AP, both (1) immediately and (2) in the long term?

This isn’t yet clearly answered, but I assume it would be something relatively straightforward.
Will the templates be persisted explicitly?
a. If the templates are not persisted explicitly, how will the full set of software versions and calibration data be tracked to allow regeneration?
b. Will there be a requirement that all images used to generate a coadd be processed with the same software version? This might require re-processing a large set of images every time one wishes to update the template.
c. If the answer to #4 is no, then both the software version(s) and the calibration data version(s) need to be tracked, and it should be possible to regenerate the calibration data with one pass through the data. Calibration is often iterative, and it would be easy to get into a state where the full provenance of a calibration data product actually involved 5 passes through the data and calibration products, each with a different version of the science pipelines.
d. One might be tempted to allow minor point updates that don’t require a full re-processing just to update a template, but this might make full provenance tracking hard. Even if the tracking were done, it might be unrealistic to think that the processing could practically be regenerated.

MelissaGraham · June 3, 2020, 10:26pm

Thanks so much, @mwv, for posting these questions about year one templates. Below is a summary of what can currently be said with respect to template generation and provenance during the first year of operations, with a bit of extra context for readers new to this topic. Additional discussion and follow-up questions are welcome.

Templates During Year One, Compared to Later Years: In later years of the LSST the templates used for Alert Production (AP) will be part of the most recent data release. However, until Data Release 2 (DR2) – composed of the first full year of observations and thus the first full-sky template (DR1 will comprise the first half-year only) – AP must use template images that were not part of a data release. The details of how template generation will proceed during year one remain an open question (e.g., ls.st/dmtn-107). While it is possible that some template images for a given location might evolve prior to DR1/2 (i.e., be regenerated as the survey progresses), this will not be the default process. As described in S.1.4.6 of the Data Management System Requirements (ls.st/LSE-61): “Incremental template building enables Alert Production when no Data Release template is yet available. It is anticipated that … once a template is produced for a sky position and filter it will not be replaced until the next Data Release to avoid repeated baseline changes.”
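As a rough illustration of the "build once, do not replace until the next Data Release" behavior quoted above, here is a minimal Python sketch. It is not Rubin code: `TemplateKey`, `TemplateStore`, the patch identifier, and the template IDs are all hypothetical names made up for this example, and the real system may track templates quite differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TemplateKey:
    """Hypothetical key: one template per sky position and filter."""
    sky_patch: str  # assumed identifier for a sky position
    band: str       # filter name, e.g. "r"


@dataclass
class TemplateStore:
    """Hypothetical store that refuses to replace a template within a release."""
    release: str = "pre-DR1"
    _templates: dict = field(default_factory=dict)

    def add_template(self, key: TemplateKey, template_id: str) -> bool:
        """Register a template only if none exists yet for this key."""
        if key in self._templates:
            return False  # baseline already set; keep it stable
        self._templates[key] = template_id
        return True

    def get(self, key: TemplateKey):
        """Return the template ID for this key, or None if not built yet."""
        return self._templates.get(key)

    def start_release(self, release: str, templates: dict) -> None:
        """A new Data Release swaps in a whole new set of templates."""
        self.release = release
        self._templates = dict(templates)
```

In this sketch a second `add_template` call for the same position and filter is simply rejected, so the baseline only changes when `start_release` is called.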

Access to Templates: In all years, data products including template images will be accessible to data rights holders via the Rubin Science Platform (RSP). In the scenario where some year-one templates evolve with time (discussed above), it might assist the community in achieving their science goals to persist all versions of year-one templates in the RSP (in addition to enabling their regeneration with provenance metadata, as discussed below). If so, please discuss further in this thread and @MelissaGraham will follow up with an internal ticket to investigate such an implementation.

Requirements on Images Contributing to a Coadd: There are no requirements on the Data Management System (DMS) regarding how coadds are generated, so there is no requirement that all of a coadd's contributing images be processed with the same pipeline version. However, for all coadds (including templates) generated as a part of a DR, all contributing images will have been (re)processed with the most recent software version. Given that year-one templates will be generated once and not replaced, it is likely that all contributing images will have been processed with the same pipeline version. Furthermore, there are DMS requirements to store provenance metadata for processed images (e.g., including the identity of the input exposures and related provenance information for difference images; S.1.3.3.1 of ls.st/LSE-61), and to provide users the capability to regenerate data products (e.g., S.4.1.7 of ls.st/LSE-61).
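To make the provenance point concrete, here is a small Python sketch of the kind of record that would let a user check whether a template's inputs share one pipeline version. It is illustrative only: `InputImage`, `CoaddProvenance`, the field names, and the version strings are assumptions for this example, not the actual DMS provenance schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputImage:
    """Hypothetical provenance entry for one image contributing to a coadd."""
    exposure_id: int
    pipeline_version: str
    calibration_version: str


@dataclass(frozen=True)
class CoaddProvenance:
    """Hypothetical provenance record for one coadd (e.g., a template)."""
    coadd_id: str
    inputs: tuple  # tuple of InputImage

    def pipeline_versions(self) -> set:
        """Distinct pipeline versions among the contributing images."""
        return {img.pipeline_version for img in self.inputs}

    def calibration_versions(self) -> set:
        """Distinct calibration versions among the contributing images."""
        return {img.calibration_version for img in self.inputs}

    def has_single_pipeline_version(self) -> bool:
        """True only if every input was processed with the same version."""
        return len(self.pipeline_versions()) == 1
```

A record with mixed versions would still be regenerable in principle, as long as every version listed in it remains available; the check above only reports whether the simpler single-version case holds.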

A Side Note on Alert Packet Contents: In all years the alert packets will contain postage stamps: small cutouts of the template and the difference images with metadata that will uniquely identify the template and science images and the relevant pipeline versions (S.3.5.1 of ls.st/dpdd). Alert users will always be able to regenerate the alert contents at any later date using the included timestamps and metadata, along with the image data, which will be accessible via the Rubin Science Platform.