Meeting held at 11:30am Pacific on 2/16/2016.
Present: @davidciardi, @xiuqin, @gpdf, @ktl, @mjuric, @jbosch, @hsinfang, @rhl, @ksk, @jdswinbank (and maybe others?)
Summary (from @jbosch’s memory; please augment if anything important is missing):
- DPDD states that we’ll do “Level 1 reprocessing” during each data release, and use this to replace the Level 1 database with results generated with updated algorithms, templates, calibrations, etc. This reprocessing should include all observations up to the actual data release (more specifically, up to the Level 1 database replacement point), and it should use the same algorithms and pre-built inputs that will be used in the Level 1 nightly processing that happens after this point.
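
  A minimal sketch of the cutoff semantics, assuming a hypothetical visit list with observation times (none of these names are real pipeline or schema identifiers):

  ```python
  from datetime import datetime

  # Hypothetical visit records; in reality these would come from the image
  # archive / registry, and the field names are invented for this sketch.
  visits = [
      {"visit": 1001, "obs_time": datetime(2025, 1, 10)},
      {"visit": 1002, "obs_time": datetime(2025, 6, 1)},
      {"visit": 1003, "obs_time": datetime(2025, 9, 15)},
  ]

  # The Level 1 database replacement point: everything observed up to this
  # instant is reprocessed with the release's algorithms and pre-built inputs;
  # later observations are handled by nightly Level 1 processing, which must
  # use the same algorithms and inputs so the two halves stay comparable.
  replacement_point = datetime(2025, 7, 1)

  reprocess_set = [v for v in visits if v["obs_time"] <= replacement_point]
  nightly_set = [v for v in visits if v["obs_time"] > replacement_point]

  print([v["visit"] for v in reprocess_set])  # [1001, 1002]
  print([v["visit"] for v in nightly_set])    # [1003]
  ```
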
- The moving object database also needs to be refreshed at each data release. Given that this is also updated daily during normal Level 1 operations, it’s not clear whether there’s a meaningful distinction between this database and the rest of the Level 1 database aside from the fact that the moving objects can’t be localized to a particular part of the sky.
- We also need to do image subtraction, DiaSource detection, DiaObject generation, and MOPS during the main Level 2 processing, in slightly different ways. We’ll want to leverage the same calibration and low-level processing done for the rest of Level 2, probably make more measurements on DiaObjects and DiaSources than we would nightly, and do a better job of associating DiaObjects with other kinds of Objects. We’ll also need to use this image subtraction to find artifacts to mask when building coadds. The inputs to this processing must be just those identified at the start of DRP processing, because it needs to mostly happen in an order determined spatially, not temporally.
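
  A toy sketch of the spatial ordering this implies; every function and variable name below is a hypothetical placeholder, not an actual DM API:

  ```python
  # Toy sketch: Level 2 difference imaging iterates over sky patches, and
  # within each patch over every input epoch, rather than marching through
  # the survey night by night.  All functions here are invented placeholders.

  def build_template(patch):
      return f"template({patch})"

  def subtract(visit, template):
      return f"diffim({visit}|{template})"

  def detect_dia_sources(diffim):
      return [f"DiaSource<{diffim}>"]

  def associate(sources, dia_objects, deep_objects):
      # In Level 2 we would also associate against the deep (static-sky) Objects.
      return dia_objects + [(src, deep_objects) for src in sources]

  def level2_dia(patches, visits_by_patch, deep_objects_by_patch):
      dia_objects = {p: [] for p in patches}
      for patch in patches:                       # spatial outer loop
          template = build_template(patch)
          for visit in visits_by_patch[patch]:    # input set fixed at DRP start
              diffim = subtract(visit, template)
              sources = detect_dia_sources(diffim)
              dia_objects[patch] = associate(
                  sources, dia_objects[patch], deep_objects_by_patch[patch])
      return dia_objects

  print(level2_dia(["p1"], {"p1": ["v1", "v2"]}, {"p1": ["Obj1"]}))
  ```
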
- We have quite a bit of flexibility in choosing input cutoff dates and database refresh dates, and don’t need to make them the same to the extent implied by the DPDD if that helps resolve problems. It’s not obvious that it would.
- We need to build coadds as templates before we can do image subtraction in Level 2, but we also need image subtraction to generate masks before we can build coadds. Proposal on the table is to do PSF-matched coaddition with outlier rejection prior to image subtraction, then do other forms of coaddition (and multifit) after image subtraction.
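
  Read as pseudocode, the proposed per-patch ordering might look like this (a sketch only; every name is a placeholder):

  ```python
  # Sketch of the proposed per-patch ordering that breaks the circular
  # dependency between templates and image subtraction.  Names are invented.

  def run_patch(patch, visits):
      plan = []
      # 1. PSF-matched coadd with outlier rejection: usable as a template
      #    without first needing masks from image subtraction.
      plan.append(f"psf_matched_coadd({patch})")
      # 2. Image subtraction of each epoch against that template, which also
      #    yields the artifact masks (and DiaSources, as in the bullet above).
      for visit in visits:
          plan.append(f"subtract({visit}, template={patch})")
      # 3. Other coadd flavors and multifit, now able to use those masks.
      plan.append(f"deep_coadds({patch}, masks='from step 2')")
      plan.append(f"multifit({patch})")
      return plan

  print(run_patch("p1", ["v1", "v2"]))
  ```
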
- From a scientific standpoint, the Level 2 image differencing (etc.) data products should be at least as good as the reprocessed Level 1 data products for observations before the DRP input cutoff date, but it’s not clear whether we can just use them to update the Level 1 database, because they’ll have different schemas and algorithmic provenance, making them harder to compare to subsequent measurements generated with Level 1 algorithms. It may be better to just run image subtraction twice on all images before the DRP cutoff date (once with Level 2 versions of the algorithms in spatial order, once with Level 1 versions in temporal order), and provide database functionality to join the two databases.
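
  If we do keep both catalogs (Level 1 algorithms in temporal order, Level 2 algorithms in spatial order), the “join the two databases” functionality might look roughly like this; the schema and match keys here are invented purely for illustration:

  ```python
  import sqlite3

  # Invented, minimal schema: one DiaSource table per processing flavor,
  # matched here on (visit, diaObjectId) purely for illustration.  Real schemas
  # and match criteria (e.g. a spatial cross-match) would be different.
  con = sqlite3.connect(":memory:")
  con.executescript("""
      CREATE TABLE l1_diasource (visit INT, diaObjectId INT, psFlux REAL);
      CREATE TABLE l2_diasource (visit INT, diaObjectId INT, psFlux REAL);
      INSERT INTO l1_diasource VALUES (101, 7, 1.20), (102, 7, 1.35);
      INSERT INTO l2_diasource VALUES (101, 7, 1.22), (102, 7, 1.31);
  """)

  rows = con.execute("""
      SELECT l1.visit, l1.psFlux AS l1_flux, l2.psFlux AS l2_flux
      FROM l1_diasource AS l1
      JOIN l2_diasource AS l2
        ON l1.visit = l2.visit AND l1.diaObjectId = l2.diaObjectId
  """).fetchall()
  print(rows)  # [(101, 1.2, 1.22), (102, 1.35, 1.31)]
  ```
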
- Any Level 1 reprocessing during a data release cannot start before the template generation and MOPS stages have completed in the Level 2 processing. Because MOPS in Level 2 is already expected to be a full-sky sequence point, this doesn’t introduce any new limits on the Level 2 processing, but it does mean that Level 1 reprocessing cannot be continuous throughout the year. New concern as I write up these notes: if Level 1 reprocessing depends on completing full-sky deep detection first (to generate regular Objects with which to associate), this would represent a new full-sky sequence point in Level 2 processing.
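
  The same constraints written down as a toy dependency graph (the milestone names are shorthand invented for this sketch, not real task identifiers):

  ```python
  from graphlib import TopologicalSorter  # Python 3.9+ standard library

  # Toy dependency graph for the constraints in this bullet: each milestone
  # maps to the milestones that must finish before it can start.
  deps = {
      "l2_template_generation": set(),
      "l2_mops": {"l2_template_generation"},  # MOPS needs diffims, which need templates
      "l1_reprocessing": {"l2_template_generation", "l2_mops"},
      "l1_database_replacement": {"l1_reprocessing"},
  }
  # If Level 1 reprocessing also turns out to need full-sky deep detection
  # (the concern raised above), "l2_deep_detection" would be added to its
  # prerequisites, introducing a new full-sky sequence point in Level 2.

  print(list(TopologicalSorter(deps).static_order()))
  ```
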
- It may be possible to devise a Level 1 database in which updates at least appear to be continuous to the user. This is a database engineering challenge.
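
  One purely illustrative direction: keep per-release generations of the catalog and expose the current one through a view, so the swap looks atomic to users. A sketch of the idea, not a proposal for the actual schema:

  ```python
  import sqlite3

  con = sqlite3.connect(":memory:")
  con.executescript("""
      -- Two generations of a toy Level 1 table: the live nightly one and the
      -- reprocessed one produced during the data release.
      CREATE TABLE diaobject_gen1 (diaObjectId INT, nDiaSources INT);
      CREATE TABLE diaobject_gen2 (diaObjectId INT, nDiaSources INT);
      INSERT INTO diaobject_gen1 VALUES (7, 12);
      INSERT INTO diaobject_gen2 VALUES (7, 15);

      -- Users query the view; "replacing the Level 1 database" is then an
      -- atomic re-point rather than a long visible rewrite.
      CREATE VIEW diaobject AS SELECT * FROM diaobject_gen1;
  """)
  print(con.execute("SELECT * FROM diaobject").fetchall())  # generation 1

  con.executescript("""
      DROP VIEW diaobject;
      CREATE VIEW diaobject AS SELECT * FROM diaobject_gen2;
  """)
  print(con.execute("SELECT * FROM diaobject").fetchall())  # generation 2
  ```
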
- We need to put together a sequence diagram including all important dates for a data release, at a level of detail sufficient to include these dependencies.
- Ownership is unclear for both the Level 2 variants of Level 1 algorithms and the Level 1 reprocessing during a data release. We need to make sure these don’t fall through the cracks.