This post is a bit late, as the relevant code was introduced almost 2 months ago, but it was brought to my attention that there is some confusion over which flags to use in order to obtain a catalog of unique sources. First I’ll give a bit of a backstory as to why this changed, followed by a description of the new flags and how they should be interpreted.
The mergeDet catalog passed to the deblender contains a list of parent sources. Each row holds information about the parent, a footprint (a boolean mask of the pixels in the input image that were detected as part of the parent blend), and a peak catalog (a list of peak locations and central flux values for all detected peaks in the parent blend).
Prior to the adoption of scarlet as the default deblender, all deblending was done with
meas_deblender.SourceDeblendTask, which is still the single visit (single band) deblender. The basic idea of the algorithm is that most galaxies roughly exhibit 180 degree symmetry and (at shallow depths) are only blended with one other object in most cases. So a symmetric template (numerically equivalent to
x = np.min([x, x[::-1]], axis=0)) is made for all of the sources in a blend that do not fit the PSF, and the flux in the image is re-apportioned to each source based on the ratio of the templates in each pixel. This algorithm can fail for non-symmetric sources and for any blend where a source has neighbors on both sides, causing inaccuracies in the measurement of all three sources. As the depth of the images increases, blending becomes more severe and the number of cases where this algorithm fails grows, which is why a different deblender is used for co-added images in multiple bands.
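The symmetric-template idea can be sketched in one dimension with plain NumPy. This is a simplified illustration, not the SourceDeblendTask implementation: the helper names `symmetric_template` and `apportion_flux` are hypothetical, the image is a made-up pair of Gaussians, and the PSF-fitting step is left out.

```python
import numpy as np

def symmetric_template(image, peak):
    """Minimum of a 1D image and its reflection about the pixel index `peak`."""
    idx = np.arange(image.size)
    mirror = 2 * peak - idx                     # pixel reflected through the peak
    inside = (mirror >= 0) & (mirror < image.size)
    reflected = np.zeros_like(image)            # pixels with no mirror partner stay 0
    reflected[inside] = image[mirror[inside]]
    return np.minimum(image, reflected)

def apportion_flux(image, peaks):
    """Split the image flux between sources using the ratio of their templates."""
    templates = np.array([symmetric_template(image, p) for p in peaks])
    total = templates.sum(axis=0)
    weights = np.divide(templates, total,
                        out=np.zeros_like(templates), where=total > 0)
    return weights * image

# Two overlapping toy sources on a 12-pixel image.
x = np.arange(12)
image = 10.0 * np.exp(-0.5 * (x - 3) ** 2) + 6.0 * np.exp(-0.5 * (x - 8) ** 2)
children = apportion_flux(image, peaks=[3, 8])

# The templates only weight the image, so the children sum back to the input.
print(np.allclose(children.sum(axis=0), image))
```

Because the templates act purely as weights, the children always add up to the original pixels wherever at least one template is non-zero, which is the flux conservation property discussed next.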
Because the templates generated by the deblender are used to weight the flux from the image, the total flux in the footprints is conserved. This means that for isolated sources, the template that would be created by
SourceDeblendTask is irrelevant, as all of the flux in the footprint would be returned. The result is an output catalog (in each band) with all of the parent blends at the top, followed by all of the children, each deblended from one of the parents. So when this was the only deblender, selecting a set of unique sources was easy: you just cut on
deblend_nChild == 0. This selected all of the isolated sources (from the parent section) and all of the deblended child sources.
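As a minimal sketch of that cut, here is a toy single-band catalog stored as a NumPy structured array. The ids and values are made up for illustration, and a real catalog would be an LSST source table rather than a bare array.

```python
import numpy as np

# Toy catalog: one isolated source, one blended parent, and its two children.
catalog = np.array(
    [
        (1, 0, 0),  # isolated source (no children)
        (2, 0, 2),  # blended parent with two children
        (3, 2, 0),  # child of source 2
        (4, 2, 0),  # child of source 2
    ],
    dtype=[("id", "i8"), ("parent", "i8"), ("deblend_nChild", "i4")],
)

# Keep every record that was not deblended any further.
unique = catalog[catalog["deblend_nChild"] == 0]
print(unique["id"])  # ids 1, 3 and 4: the blended parent is dropped
```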
The multi-band deblender (meas_extensions_scarlet.ScarletDeblendTask) is different, as it uses
scarlet to create a model for each source in a blend. This is a philosophically different object, as there is no longer an assumption that all of the flux in the input image will be modeled by one of the children in the blend (this may change in the near future, but this is the current implementation). The results of the scarlet deblender will be biased by the assumptions that went into making the models, so it was decided that it would be a good idea to (by default) also model all of the isolated sources. This allows scarlet models of isolated sources to be compared with the un-modeled isolated source measurements, in order to investigate the biases that scarlet introduces. It also gives users the option to choose between the un-modeled (parent) isolated source records and the scarlet model version of each isolated source. However, this flexibility forced a change in the way that we select unique objects in a source catalog.
Flags set by the deblender
Before we get into the flags set by pipe_tasks it is useful to understand the flags that are set in
ScarletDeblendTask that relate to source selection.
parent: the id in the catalog of the parent of this source record. This is actually set before the deblender runs, and all top level records have parent == 0.
deblend_nPeaks: the number of peaks contained in the source's footprint.
deblend_nChild: the number of peaks deblended by the deblender from this source and created as new source records in the catalog. This is different from deblend_nPeaks in that isolated sources that are not deblended by SourceDeblendTask and child peaks that were culled during deblending are not included in this count.
deblend_parentNPeaks: the number of peaks contained in the parent of this source record.
deblend_parentNChild: the number of children deblended from the parent of this source record.
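To make the relationship between these columns concrete, the sketch below derives the two parent columns from id, parent, deblend_nPeaks, and deblend_nChild on a toy catalog. All values are made up, and filling the parent columns with 0 for top level records is an assumption made for this illustration only, not something the deblender is stated to do.

```python
import numpy as np

# Toy multi-band catalog: id 1 is an isolated parent, id 3 its scarlet model;
# id 2 is a blended parent with three peaks, one of which was culled.
ids = np.array([1, 2, 3, 4, 5])
parent = np.array([0, 0, 1, 2, 2])
deblend_nPeaks = np.array([1, 3, 1, 1, 1])
deblend_nChild = np.array([1, 2, 0, 0, 0])

# Map each id to its row so children can look up their parent's values.
row_of = {int(i): k for k, i in enumerate(ids)}

# Assumption: top level records (parent == 0) are left at 0 here.
deblend_parentNPeaks = np.zeros_like(deblend_nPeaks)
deblend_parentNChild = np.zeros_like(deblend_nChild)
for k, p in enumerate(parent):
    if p != 0:
        deblend_parentNPeaks[k] = deblend_nPeaks[row_of[int(p)]]
        deblend_parentNChild[k] = deblend_nChild[row_of[int(p)]]

print(deblend_parentNPeaks)  # children of id 2 inherit 3 peaks
print(deblend_parentNChild)  # but only 2 children, since one peak was culled
```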
isPrimary and other flags added in pipe_tasks
In addition to source records for deblended parents and multiple entries for isolated sources, output catalogs are also not unique because they may contain “pseudo” sources (e.g. sky objects that have been added to assist with calibration but are not output sources). If the analysis is done over multiple patches and/or tracts, sources in the overlap region can also exist in multiple overlapping patches (but always on the interior of only one). For this reason the
pipe_tasks.SetPrimaryFlagsTask sets a number of useful flags to assist users in determining a unique output catalog for their analysis.
- A source is in the inner region of a patch
- A source is in the inner region of a tract
The detect_isPatchInner and detect_isTractInner flags are used to identify sources that are contained in the interior region of a patch (and tract). By definition, every point on the sky lies in the interior of exactly one patch and one tract, but patches and tracts also include an outer region that overlaps with their neighbors. Sources with a
False value for either flag lie in the overlap region and will show up multiple times in a combined catalog. In practice it would be useful to have a more clever algorithm for choosing which source to use on the edge of a patch/tract, since some sources will be cut off. However, these flags give a quick way to ensure that a catalog using multiple tracts/patches is unique. So an easy way to get unique sources is to select all of the sources with
detect_isPatchInner==True & detect_isTractInner==True.
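A small sketch of how this removes duplicates: the toy rows below stand for detections from two overlapping patches, where object "B" was detected in both. The `true_object` and `patch` labels are hypothetical bookkeeping for the example and do not exist in a real catalog.

```python
import numpy as np

# Combined catalog from two overlapping patches (all values are made up).
true_object = np.array(["A", "B", "B", "C"])
patch = np.array(["1,1", "1,1", "1,2", "1,2"])
detect_isPatchInner = np.array([True, True, False, True])
detect_isTractInner = np.array([True, True, True, True])

# Object B is in the inner region of patch 1,1 only, so its copy
# from the outer (overlap) region of patch 1,2 is rejected.
keep = detect_isPatchInner & detect_isTractInner
print(true_object[keep])  # each object appears exactly once
```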
- A source is flagged as a sky_source in a single visit catalog
- A source is flagged as merge_peak_sky in a coadd catalog
sky_source is a flag in a single visit catalog to mark sky objects while
merge_peak_sky is the coadd version (which states that a source was a sky object in at least one band). Any sources with either of these flags set should be ignored in a final source catalog as they are not astrophysical objects.
- A source only has a single peak (
deblend_nPeaks == 1)
- A source is a top level parent (
parent == 0) or its parent only had a single peak (
deblend_parentNPeaks == 1)
The detect_isIsolated flag marks sources that are not contained in a blend. This covers both isolated sources that are not modeled by the deblender (parents) and (in cases where the multi-band deblender is used) scarlet models of the isolated sources. Note that cutting on this flag will not give a unique set of sources, but it can be useful for selecting all of the isolated sources in order to compare measurements made on scarlet models with measurements made on the same un-modeled isolated sources.
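The two conditions listed above can be written as a boolean expression over the deblender columns. This sketch assumes both bullets must hold at once, and it uses a toy catalog with made-up values (including 0 in deblend_parentNPeaks for top level records).

```python
import numpy as np

# id 1: isolated parent, id 3: its scarlet model,
# id 2: blended parent, ids 4 and 5: its deblended children.
ids = np.array([1, 2, 3, 4, 5])
parent = np.array([0, 0, 1, 2, 2])
deblend_nPeaks = np.array([1, 2, 1, 1, 1])
deblend_parentNPeaks = np.array([0, 0, 1, 2, 2])

# Single peak, and either top level or deblended from a single-peak parent.
is_isolated = (deblend_nPeaks == 1) & (
    (parent == 0) | (deblend_parentNPeaks == 1)
)
print(ids[is_isolated])  # ids 1 and 3: two records of the same object
```

Both the un-modeled parent (id 1) and its scarlet model (id 3) are selected, which is why this flag alone does not give a unique catalog.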
- A source is deblended from a parent that had multiple children (
deblend_parentNChild > 1)
The detect_fromBlend flag is used to mark sources that were deblended from a parent that contained multiple children. This is not the opposite of
detect_isIsolated, because it does not include the parents that were deblended into multiple sources.
- The source is a top level parent and it is isolated
(detect_isIsolated & parent==0)
- The source was deblended from a parent with multiple children and has no children of its own
(detect_fromBlend & deblend_nPeaks == 1)
Current testing shows that the un-modeled isolated source measurements perform (perhaps unsurprisingly) better than the scarlet models of isolated sources in most cases, so the default set of unique sources uses the un-modeled (parent) isolated sources and scarlet models for sources in blends with multiple children. These sources are identified using the
detect_isDeblendedSource flag, which is equivalent to
(detect_isIsolated & parent==0) | (detect_fromBlend & deblend_nPeaks == 1). Checking that deblended sources only have a single peak in their footprints allows for potential hierarchical deblending in the future, where there may be several levels of deblended sources.
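Putting the pieces together on a toy catalog shows which records this expression keeps. The column values are made up for illustration, and the isolated and from-blend flags are recomputed here from the conditions described earlier rather than read from a real catalog.

```python
import numpy as np

# id 1: isolated parent, id 3: its scarlet model,
# id 2: blended parent, ids 4 and 5: its deblended children.
ids = np.array([1, 2, 3, 4, 5])
parent = np.array([0, 0, 1, 2, 2])
deblend_nPeaks = np.array([1, 2, 1, 1, 1])
deblend_parentNPeaks = np.array([0, 0, 1, 2, 2])
deblend_parentNChild = np.array([0, 0, 1, 2, 2])

is_isolated = (deblend_nPeaks == 1) & (
    (parent == 0) | (deblend_parentNPeaks == 1)
)
from_blend = deblend_parentNChild > 1

# Un-modeled isolated parents, plus single-peak children of real blends.
is_deblended_source = (is_isolated & (parent == 0)) | (
    from_blend & (deblend_nPeaks == 1)
)
print(ids[is_deblended_source])  # ids 1, 4 and 5

# The blended parent (id 2) is neither isolated nor from a blend,
# so the two flags are not complements of each other.
print(is_isolated[1], from_blend[1])
```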
- The source is not a top level parent (
parent != 0)
- The source does not have any children (
deblend_nPeaks == 1)
The detect_isDeblendedModelSource flag only exists when the multiband deblender is used, marking sources that were deblended from a parent. This includes both isolated sources that were modeled by scarlet and sources deblended from a parent with multiple child peaks. If your preference is to always use the scarlet model to ensure that the isolated and deblended sources have the same underlying models, then selecting on
detect_isDeblendedModelSource & detect_isPatchInner & detect_isTractInner & ~merge_peak_sky will give a unique set of sources that is the equivalent of
detect_isPrimary, only using the scarlet isolated models as opposed to the un-modeled isolated source records.
- A source is located on the interior of a patch and tract (
detect_isPatchInner & detect_isTractInner)
- A source is not a sky object (
~merge_peak_sky for coadds or
~sky_source for single visits)
- A source is either an isolated parent that is un-modeled or deblended from a parent with multiple children (detect_isDeblendedSource)
The detect_isPrimary flag marks the most common catalog of unique sources that users will want to make measurements on. However, users should understand the assumptions made in selecting sources with this flag and decide whether it suits their needs.
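The three conditions above can be combined into a single mask. In this sketch, `select_primary` is a hypothetical helper (not a pipe_tasks function) that takes a plain dict of boolean NumPy columns, and the flag values are made up. A real catalog would already carry detect_isPrimary, so this is only meant to show how the pieces fit together.

```python
import numpy as np

def select_primary(columns):
    """Rebuild a detect_isPrimary-like mask from its component flags."""
    inner = columns["detect_isPatchInner"] & columns["detect_isTractInner"]
    # Coadd catalogs carry merge_peak_sky; single visit catalogs carry sky_source.
    sky_name = "merge_peak_sky" if "merge_peak_sky" in columns else "sky_source"
    not_sky = ~columns[sky_name]
    return inner & not_sky & columns["detect_isDeblendedSource"]

# Toy coadd catalog with four records.
columns = {
    "detect_isPatchInner": np.array([True, True, False, True]),
    "detect_isTractInner": np.array([True, True, True, True]),
    "merge_peak_sky": np.array([False, True, False, False]),
    "detect_isDeblendedSource": np.array([True, True, True, False]),
}
mask = select_primary(columns)
print(mask)  # only the first record passes all three conditions
```

Swapping detect_isDeblendedSource for detect_isDeblendedModelSource in the last line of the helper would give the scarlet-model version of the same unique selection.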