This post is a bit late, as the relevant code was introduced almost 2 months ago, but it was brought to my attention that there is some confusion over which flags to use in order to obtain a catalog of unique sources. First I’ll give a bit of a backstory as to why this changed, followed by a description of the new flags and how they should be interpreted.
The input mergeDet catalog to the deblender contains a list of parent sources. Each row consists of information about the parent, a footprint (a boolean mask of pixels in the input image that were detected as part of the parent blend), and a peak catalog (a list of peak locations and central flux values for all detected peaks in the parent blend).
Single-band deblender
Prior to the adoption of scarlet as the default deblender, all deblending was done with meas_deblender.SourceDeblendTask, which is still the single visit (single band) deblender. The basic idea of the algorithm is that most galaxies exhibit rough 180 degree symmetry and (at shallow depths) are in most cases blended with only one other object. So a symmetric template (numerically equivalent to x = np.min([x, x[::-1]], axis=0)) is made for each source in a blend that does not fit the PSF, and the flux in the image is re-apportioned to each source based on the ratio of the templates in each pixel. This algorithm can fail for non-symmetric sources and for any blend where a source has neighbors on both sides, causing inaccuracies in the measurement of all three sources. As the depth of the images increases, blending becomes more severe and the number of cases where this algorithm fails increases, which is why a different deblender is used for co-added images in multiple bands.
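To make the template idea concrete, here is a minimal 1D sketch. It is not the SourceDeblendTask implementation: the function names, the toy image, and the peak positions are all invented for illustration, and real footprints are 2D.

```python
import numpy as np

def symmetric_template(image, peak):
    """Return a 1D template that is symmetric about index `peak` (toy helper)."""
    template = np.zeros_like(image, dtype=float)
    # Largest half-width that stays inside the array on both sides of the peak
    half = min(peak, len(image) - 1 - peak)
    window = image[peak - half: peak + half + 1].astype(float)
    # Pixel-wise minimum of the window and its mirror image
    template[peak - half: peak + half + 1] = np.minimum(window, window[::-1])
    return template

def apportion_flux(image, templates):
    """Split `image` among sources in proportion to their templates (toy helper)."""
    templates = np.asarray(templates, dtype=float)
    total = templates.sum(axis=0)
    # Leave weights at zero where no template has any flux
    weights = np.divide(templates, total,
                        out=np.zeros_like(templates), where=total > 0)
    return weights * image

# Toy blend with two overlapping sources peaking at pixels 3 and 5
image = np.array([0., 1., 3., 5., 4., 6., 3., 1., 0.])
templates = [symmetric_template(image, 3), symmetric_template(image, 5)]
children = apportion_flux(image, templates)
```

Because each pixel's flux is split by template ratio, the children add back up to the input image wherever at least one template is non-zero.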
Because the templates generated by the deblender are used to weight the flux from the image, the total flux in the footprints is conserved. This means that for isolated sources, the template that would be created by SourceDeblendTask is irrelevant, as all of the flux in the footprint would be returned. The result is an output catalog (in each band) with all of the parent blends at the top followed by all of the children deblended from one of the parents. So when this was the only deblender, selecting a set of unique sources was easy: you just cut on deblend_nChild == 0. This selected all of the isolated sources (from the parent section) and all of the deblended child sources.
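As a quick illustration of that cut, the sketch below uses a toy single-band catalog held in plain NumPy arrays. The values are invented and a real catalog would come from the pipeline, not from a dict like this.

```python
import numpy as np

# Toy single-band catalog: parents (ids 1 and 2) first, then the two
# children deblended from parent 2. Source 1 is isolated.
cat = {
    "id": np.array([1, 2, 3, 4]),
    "parent": np.array([0, 0, 2, 2]),
    "deblend_nChild": np.array([0, 2, 0, 0]),
}

# Keep isolated parents and deblended children, drop the blended parent
unique = cat["deblend_nChild"] == 0
unique_ids = cat["id"][unique]
```

The blended parent (id 2) is the only record removed, so no flux is counted twice.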
Multi-band deblender (scarlet)
meas_extensions_scarlet.ScarletDeblendTask is different, as it uses scarlet to create a model for each source in a blend. This is a philosophically different object, as there is no longer an assumption that all of the flux in the input image will be modeled by one of the children in the blend (this may change in the near future, but this is the current implementation). The results of the scarlet deblender will be biased by the assumptions that went into making the models, so it was decided that it would be a good idea to (by default) also model all of the isolated sources. This allows comparisons of scarlet models of isolated sources to the un-modeled isolated source measurements to investigate the biases that scarlet is introducing, and also gives users the option to choose between the un-modeled (parent) isolated source records and the scarlet model version of each isolated source. However, this flexibility forced a change in the way that we select unique objects in a source catalog.
Flags set by the deblender
Before we get into the flags set by pipe_tasks, it is useful to understand the flags that are set in SourceDeblendTask and ScarletDeblendTask that relate to source selection.
- parent: the id in the catalog for the parent of this source record. This is actually set pre-deblender, where all top level records have parent=0.
- deblend_nPeaks: the number of peaks contained in the source's footprint.
- deblend_nChild: the number of peaks deblended by the deblender from this source and created as new source records in the catalog. This is different from deblend_nPeaks in that isolated sources that are not deblended by SourceDeblendTask and child peaks that were culled during deblending are not included in this count.
- deblend_parentNPeaks: the number of peaks contained in the parent of this source record.
- deblend_parentNChild: the number of children deblended from the parent of this source record.
isPrimary and other flags added in pipe_tasks
In addition to source records for deblended parents and multiple entries for isolated sources, output catalogs are also not unique because they may contain “pseudo” sources (e.g. sky objects that have been added to assist with calibration but are not output sources) and, if the analysis is done over multiple patches and/or tracts, sources in the overlap region can exist in multiple overlapping patches (but always on the interior of only one). For this reason pipe_tasks.SetPrimaryFlagsTask sets a number of useful flags to assist users in determining a unique output catalog for their analysis.
detect_isPatchInner and detect_isTractInner
True when:
- A source is in the inner region of a patch
- A source is in the inner region of a tract
Details
The detect_isPatchInner and detect_isTractInner flags are used to identify sources that are contained in the interior region of a patch (and tract). By definition every point on the sky is located on the interior of one patch and one tract, but patches and tracts also include an outer region that overlaps with their neighbors. Sources with a False value for either flag lie in the overlap region and will show up multiple times in a combined catalog. In practice it would be useful to have a more clever algorithm for choosing which source to use on the edge of a patch/tract, since some sources will be cut off, but these flags give a quick way to ensure that a catalog using multiple tracts/patches is unique. So an easy way to get unique sources is to select all of the sources with detect_isPatchInner==True & detect_isTractInner==True.
sky_source and merge_peak_sky
True when:
- A source is flagged as a sky_source in a single visit catalog
or
- A source is flagged as merge_peak_sky in a mergeDet coadd catalog.
Details
sky_source is a flag in a single visit catalog to mark sky objects, while merge_peak_sky is the coadd version (which states that a source was a sky object in at least one band). Any sources with either of these flags set should be ignored in a final source catalog as they are not astrophysical objects.
detect_isIsolated
True when:
- A source only has a single peak (deblend_nPeaks == 1)
- A source is a top level parent (parent == 0) or its parent only had a single peak (deblend_parentNPeaks == 1)
Details
The detect_isIsolated flag marks sources that are not contained in a blend. This covers both isolated sources that are not modeled by the deblender (parents) and (in cases where the multi-band deblender is used) scarlet models of the isolated sources. Note that cutting on this flag will not give a unique set of sources, but it can be useful for selecting all of the isolated sources to analyze the differences between measurements made on scarlet models and measurements made on the same un-modeled isolated sources.
detect_fromBlend
True when:
- A source is deblended from a parent that had multiple children (deblend_parentNChild > 1)
Details
The detect_fromBlend flag is used to mark sources that were deblended from a parent that contained multiple children. This is not the opposite of detect_isIsolated, because it does not include the parents that were deblended into multiple sources: those parents have neither flag set.
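The two definitions above can be checked on a toy multi-band catalog. This is only a sketch with invented values held in NumPy arrays. In particular, setting deblend_parentNPeaks and deblend_parentNChild to 0 for top level records is my assumption for the example, not something taken from the pipeline.

```python
import numpy as np

# Toy catalog: ids 1 and 2 are top level parents, 3 is the scarlet model
# of isolated source 1, and 4 and 5 are the children deblended from 2.
ids = np.array([1, 2, 3, 4, 5])
parent = np.array([0, 0, 1, 2, 2])
deblend_nPeaks = np.array([1, 2, 1, 1, 1])
# Assumed to be 0 for top level records (illustration only)
deblend_parentNPeaks = np.array([0, 0, 1, 2, 2])
deblend_parentNChild = np.array([0, 0, 1, 2, 2])

# Single peak, and either a top level parent or the child of a one-peak parent
is_isolated = (deblend_nPeaks == 1) & (
    (parent == 0) | (deblend_parentNPeaks == 1)
)
# Deblended from a parent with more than one child
from_blend = deblend_parentNChild > 1

# Records that carry neither flag
neither = ~is_isolated & ~from_blend
```

In this toy case the isolated source shows up twice under is_isolated (ids 1 and 3), and the blended parent (id 2) is the only record in neither group.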
detect_isDeblendedSource
True when:
- The source is a top level parent and it is isolated (detect_isIsolated & parent==0)
or
- The source was deblended from a parent with multiple children and has no children of its own (detect_fromBlend & deblend_nPeaks == 1)
Details
Current testing shows that the un-modeled isolated source measurements perform (perhaps unsurprisingly) better than the scarlet models of isolated sources in most cases, so the default set of unique sources uses the un-modeled (parent) isolated sources and scarlet models for sources in blends with multiple children. These sources are identified using the detect_isDeblendedSource flag, which is equivalent to (detect_isIsolated & parent==0) | (detect_fromBlend & deblend_nPeaks == 1). Checking that deblended sources only have a single peak in their footprints allows for potential hierarchical deblending in the future, where there may be several different hierarchies of deblended sources.
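Written out as code, the equivalence looks like the sketch below. The function name and the toy arrays are hypothetical. Only the boolean expression comes from the description above, and the zero values of the parent columns for top level records are an assumption made for the example.

```python
import numpy as np

def is_deblended_source(parent, n_peaks, parent_n_peaks, parent_n_child):
    """Toy re-derivation of detect_isDeblendedSource from deblender columns."""
    is_isolated = (n_peaks == 1) & ((parent == 0) | (parent_n_peaks == 1))
    from_blend = parent_n_child > 1
    # Un-modeled isolated parents, plus single-peak children of real blends
    return (is_isolated & (parent == 0)) | (from_blend & (n_peaks == 1))

# ids 1, 2: top level parents; 3: scarlet model of isolated source 1;
# 4, 5: children deblended from parent 2 (all values invented)
ids = np.array([1, 2, 3, 4, 5])
selected = ids[is_deblended_source(
    parent=np.array([0, 0, 1, 2, 2]),
    n_peaks=np.array([1, 2, 1, 1, 1]),
    parent_n_peaks=np.array([0, 0, 1, 2, 2]),
    parent_n_child=np.array([0, 0, 1, 2, 2]),
)]
```

Each physical source appears once in the selection: the isolated source through its parent record (id 1) and the blended sources through their scarlet children (ids 4 and 5).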
detect_isDeblendedModelSource
True when:
- The source is not a top level parent (parent != 0)
- The source does not have any children (deblend_nPeaks == 1)
Details
The detect_isDeblendedModelSource flag only exists when the multi-band deblender is used, marking sources that were deblended from a parent. This includes both isolated sources that were modeled by scarlet and sources deblended from a parent with multiple child peaks. If your preference is to always use the scarlet model to ensure that the isolated and deblended sources have the same underlying models, then selecting on detect_isDeblendedModelSource & detect_isPatchInner & detect_isTractInner & ~merge_peak_sky will give a unique set of sources that is the equivalent of detect_isPrimary, only using the scarlet isolated models as opposed to the un-modeled isolated source records.
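For anyone who wants to try that selection, here is a minimal sketch on a toy coadd catalog. The dict of NumPy arrays stands in for whatever table type you actually load, and all of the flag values are invented.

```python
import numpy as np

# Toy coadd catalog with the four flag columns used in the selection
cat = {
    "id": np.array([1, 2, 3, 4, 5, 6]),
    "detect_isDeblendedModelSource": np.array([False, False, True, True, True, False]),
    "detect_isPatchInner": np.array([True, True, True, True, False, True]),
    "detect_isTractInner": np.array([True, True, True, True, True, True]),
    "merge_peak_sky": np.array([False, False, False, False, False, True]),
}

# Scarlet models only, inside the inner patch/tract region, no sky objects
model_primary = (
    cat["detect_isDeblendedModelSource"]
    & cat["detect_isPatchInner"]
    & cat["detect_isTractInner"]
    & ~cat["merge_peak_sky"]
)
model_ids = cat["id"][model_primary]
```

Here id 5 is dropped because it sits in a patch overlap region, and id 6 because it is a sky object.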
detect_isPrimary
True when:
- A source is located on the interior of a patch and tract (detect_isPatchInner & detect_isTractInner)
- A source is not a sky object (~merge_peak_sky for coadds or ~sky_source for single visits)
- A source is either an isolated parent that is un-modeled or deblended from a parent with multiple children (detect_isDeblendedSource)
Details
The detect_isPrimary flag can be thought of as the flag that selects the most common catalog of unique sources that users will want to make measurements on. However, it is advised that users understand the assumptions made in using sources marked with this flag and whether or not it suits their needs.
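Putting the three conditions together, detect_isPrimary can be sketched as a single boolean expression. The helper name and the toy arrays below are hypothetical. In a real catalog you would simply read the detect_isPrimary column rather than rebuild it.

```python
import numpy as np

def is_primary(patch_inner, tract_inner, is_sky, deblended_source):
    """Toy combination of the flags that make up detect_isPrimary."""
    return patch_inner & tract_inner & ~is_sky & deblended_source

# Four invented records: only the first passes every condition
primary = is_primary(
    patch_inner=np.array([True, True, False, True]),
    tract_inner=np.array([True, True, True, True]),
    # merge_peak_sky for coadds, sky_source for single visits
    is_sky=np.array([False, False, False, True]),
    deblended_source=np.array([True, False, True, True]),
)
```

The second record fails because it is not a deblended source, the third because it is in a patch overlap region, and the fourth because it is a sky object.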