It’s not immediately obvious from looking at it, but my diagram describing DRP processing flow implies that there’s no real distinction between DIAObjects and Objects in Data Release Production. This caused some confusion at today’s SPDWG meeting, so I’ll try to lay out my vision a little more concretely.
In Nightly processing, we associate DIASources spatially to produce DIAObjects, then aggregate DIASource measurements to fill in the DIAObject table.
We could do the same in DRP, but I argue that it’s better to defer associating the DIASources with each other, and instead wait until we’ve run detection on coadds and then associate all DIASources and coadd detections at the same time. Having access to the coadd detections (and possibly coadd pixel data) during association will give us a lot more information about how to resolve ambiguous matches due to blends. We can also do anything we could have done in associating DIASources with each other (such as rejecting spurious objects that have only had one detection over a long time period). This association procedure just produces Objects, which would then be a single consistent catalog that attempts to represent everything in the sky except solar system objects.
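To make the idea of joint association a bit more concrete, here is a minimal sketch. It assumes a flat-sky toy geometry, a greedy nearest-neighbor match within a fixed radius, and a simple "at least `min_dia` detections" cut for DIASource-only Objects. All names (`Obj`, `associate`, `radius`, `min_dia`) are hypothetical, and a real implementation would use the coadd footprints and pixel data to resolve blends rather than a plain distance cut.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Obj:
    obj_id: int
    x: float
    y: float
    from_coadd: bool = False
    dia_source_ids: list = field(default_factory=list)


def associate(coadd_dets, dia_sources, radius=1.0, min_dia=2):
    """Toy joint association (illustrative only).

    coadd_dets:  list of (x, y) coadd detection positions
    dia_sources: list of (dia_source_id, x, y)
    Returns (objects, link), where link maps each DIASource ID to an
    Object ID, or to None if the DIASource is declared spurious.
    """
    # Every coadd detection seeds an Object.
    objects = [Obj(i, x, y, from_coadd=True)
               for i, (x, y) in enumerate(coadd_dets)]

    for sid, x, y in dia_sources:
        # Find the nearest existing Object within the match radius.
        best, best_d = None, radius
        for o in objects:
            d = hypot(o.x - x, o.y - y)
            if d <= best_d:
                best, best_d = o, d
        # No match: this DIASource seeds a new DIASource-derived Object.
        if best is None:
            best = Obj(len(objects), x, y)
            objects.append(best)
        best.dia_source_ids.append(sid)

    # Reject DIASource-only Objects with too few detections.
    kept = [o for o in objects
            if o.from_coadd or len(o.dia_source_ids) >= min_dia]

    link = {sid: None for sid, _, _ in dia_sources}
    for o in kept:
        for sid in o.dia_source_ids:
            link[sid] = o.obj_id
    return kept, link


coadd_dets = [(0.0, 0.0)]
dia_sources = [(10, 0.1, 0.0), (11, 5.0, 5.0), (12, 5.2, 5.0), (13, 20.0, 20.0)]
objects, link = associate(coadd_dets, dia_sources)
```

In this toy input, DIASource 10 attaches to the coadd detection, 11 and 12 together form a new DIASource-derived Object, and 13 is dropped as a lone detection.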
We’ll need to trace the origin of all of these Objects (even non-DIASource Objects need to record which kinds of coadds they were detected in), and in the case of DIASource-derived or partially-DIASource-derived Objects it’s straightforward to both set bits in the Object table and add an Object ID column to the DIASource table (a DIASource will be associated with exactly one Object, unless it is declared to be spurious) to link them up.
Since we’re planning to do forced photometry on all Objects, DIASource-derived Objects will automatically get any forced photometry that they would have had as DIAObjects (but at better positions), and when Nightly processing queries the DR database for Objects near a new nightly DIASource, there’s no need for any additional spatial matches to find the DR DIASources associated with that Object.
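As a rough illustration of the linkage described above, the sketch below uses an in-memory SQLite database with an origin-flags column on the Object table and a nullable Object ID column on the DIASource table. The table layout, the flag values, and the helper `dia_sources_for` are assumptions made for the example, not the DPDD schema.

```python
import sqlite3

# Hypothetical origin bits recording how an Object was detected.
DETECTED_COADD = 1
DETECTED_DIA = 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Object (
    objectId    INTEGER PRIMARY KEY,
    ra          REAL,
    decl        REAL,
    originFlags INTEGER
);
CREATE TABLE DIASource (
    diaSourceId INTEGER PRIMARY KEY,
    objectId    INTEGER REFERENCES Object(objectId),  -- NULL if spurious
    midPointTai REAL,
    psFlux      REAL
);
""")

conn.executemany("INSERT INTO Object VALUES (?, ?, ?, ?)", [
    (1, 10.0, -5.0, DETECTED_COADD | DETECTED_DIA),  # partially DIASource-derived
    (2, 10.1, -5.0, DETECTED_DIA),                   # purely DIASource-derived
    (3, 10.2, -5.0, DETECTED_COADD),                 # no DIASources
])
conn.executemany("INSERT INTO DIASource VALUES (?, ?, ?, ?)", [
    (100, 1, 59000.1, 12.0),
    (101, 1, 59001.1, 15.0),
    (102, 2, 59000.1, 40.0),
    (103, None, 59002.1, 3.0),  # declared spurious: no Object
])


def dia_sources_for(conn, object_id):
    """Return the DR DIASource IDs linked to an Object, with no spatial match."""
    rows = conn.execute(
        "SELECT diaSourceId FROM DIASource WHERE objectId = ? "
        "ORDER BY diaSourceId",
        (object_id,),
    )
    return [r[0] for r in rows]
```

Once Nightly processing has found the Object near a new DIASource, a call like `dia_sources_for(conn, 1)` is an ID lookup rather than a second spatial query.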
That still leaves the question of what to do with the columns in the DIAObject table in DR. If we want to keep them, this is still straightforward - we can just have a DIAObject table with (conceptually) the same IDs as the Object table, but a subset of the Object table’s records, containing aggregate quantities computed from the DR DIASources associated with that Object. But I’m not sure we actually need to keep them, at least not beyond diagnostic/QA use; every quantity in the DIAObject table currently described by the DPDD is also present (but measured differently) in the Object table. I’d argue that the Object table measurements should be consistently higher-quality than their DIAObject counterparts, generally because they utilize more information:
- The DIAObject positions, proper motion, parallax, and PSF flux parameters (defined by an aggregate over DIASource measurements) will be better measured by the multifit Moving Point Source model fit results in the Object table.
- The lcPeriodic and lcNonPeriodic quantities in DIAObject are also present in Object, where they are determined from forced photometry rather than independent DIASource measurements.
Overall, I’d say this puts the DIAObject measurements in the same position as Source: they’ll probably be superseded by other measurements, but we may not want to drop them until we can demonstrate that this is the case. In particular, I’m worried about whether multifit point source model fitting will actually be better in extremely crowded regions.
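If we do keep a DR DIAObject table, building it is just a group-by over the Object ID column on DIASource. The sketch below assumes each DIASource record carries `object_id`, `flux`, and `flux_err` keys (hypothetical names) and uses an inverse-variance weighted mean as a stand-in for whatever aggregates the DPDD actually specifies.

```python
from collections import defaultdict


def build_dia_object_rows(dia_sources):
    """Aggregate DIASources into per-Object rows (illustrative only).

    Only Objects with at least one associated DIASource get a row, so the
    result shares IDs with the Object table but covers a subset of it.
    """
    groups = defaultdict(list)
    for s in dia_sources:
        if s["object_id"] is not None:  # skip spurious DIASources
            groups[s["object_id"]].append(s)

    rows = {}
    for oid, members in groups.items():
        weights = [1.0 / m["flux_err"] ** 2 for m in members]
        wsum = sum(weights)
        mean = sum(w * m["flux"] for w, m in zip(weights, members)) / wsum
        rows[oid] = {
            "n_dia_sources": len(members),
            "flux_mean": mean,
            "flux_mean_err": wsum ** -0.5,
        }
    return rows


dia_sources = [
    {"object_id": 1, "flux": 10.0, "flux_err": 1.0},
    {"object_id": 1, "flux": 20.0, "flux_err": 1.0},
    {"object_id": 2, "flux": 5.0, "flux_err": 2.0},
    {"object_id": None, "flux": 3.0, "flux_err": 1.0},
]
dia_object_rows = build_dia_object_rows(dia_sources)
```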
The biggest concern I have with this proposal is that it implies that we’ll run the full suite of Object measurements even for pure transients like supernovae, and that could be both wasteful (for e.g. multifit bulge+disk fits) and poorly-defined (for measurements on coadds). I don’t actually know if this is a new problem or just an unappreciated one that already existed; I think the current DPDD is quite vague about how/if DIASources can generate Objects at present. Obviously, having too many false detections in difference imaging analysis would make this an even more serious concern.
In any case, I think it’s straightforward to solve this by vertically partitioning the Object table to separate out the galaxy- or coadd-focused measurements, and adding some language to the DPDD about when we decide we don’t need to do those measurements. We could almost certainly use the same criteria we’ll use to determine whether to mask a DIASource from the coadds (because it’s transient) or average it (because it’s variable), so it’s not like we’d have to make a scientific choice we wouldn’t have already had to make. And in no case would we be opting not to do galaxy- or coadd-focused measurements on stars.
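A sketch of how that decision could look in practice is below. It assumes two partitions with hypothetical names (`core` and `coadd_galaxy`) and a rule that skips the second partition only for Objects that have no coadd detection and whose DIASources were classified as transient. The exact rule is an assumption here; the real criteria would be whatever ends up in the DPDD.

```python
CORE = "core"                  # positions, forced photometry, light-curve summaries
COADD_GALAXY = "coadd_galaxy"  # coadd measurements, bulge+disk model fits


def partitions_to_fill(detected_in_coadd, classified_transient):
    """Decide which Object table partitions get measurements (illustrative only)."""
    partitions = [CORE]
    # Skip the expensive partition only for pure transients, i.e. Objects
    # with no coadd detection whose DIASources would be masked from coadds.
    if detected_in_coadd or not classified_transient:
        partitions.append(COADD_GALAXY)
    return partitions


supernova = partitions_to_fill(detected_in_coadd=False, classified_transient=True)
variable_star = partitions_to_fill(detected_in_coadd=True, classified_transient=False)
```

Under this rule a variable star, which is averaged into the coadds and detected there, still gets every measurement, while a pure transient like a supernova only fills the core partition.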