Data Model for variable sources at the time of Data Releases?

I’m reading through LSE-163 and trying to understand how the data for light curves of AGNs will be stored for the Data Release products.

Let’s say that I have a low-luminosity AGN that can be modeled as a point source embedded in the center of an extended source. For the light curve I think that what we will want are the ForcedSource.psDiffFlux measurements, so that we are looking at the variability of the central engine rather than of the full object (the host shouldn’t be varying anyway).

The question then is: what is “the flux” of that object (if I wanted to recreate the [calibrated] light curve of the central engine)? Specifically, at the time of a DR, will Object.psFlux be the same as the flux that one would measure from the “deep template” used to create the difference images?

I think you may have identified a small gap in LSE-163, but it should be an easy one to plug.

Ideally, as you suggested, Object.psFlux (and all of the other flux measures in the Object table) would correspond to some consistent definition of the “DC” sky that could be combined with an “AC” flux measured on a difference image to yield total fluxes at the epoch of the difference image. But in general, the Object fluxes will be measured with coadds or multi-epoch fitting algorithms that utilize a different set of input images than the template coadds used to produce the difference images.

I think that means we’ll just have to report another set of Object fluxes that is guaranteed to be consistent with the template coadd flux. These fluxes would probably be slightly lower S/N than the existing Object fluxes (since we can’t optimize the input images and weights to maximize the S/N), and it probably wouldn’t be the full suite of Object fluxes; probably just PSF and aperture photometry. I’ve just created DM-13607 to remind us to add these to the DPDD.
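To make the intent concrete, here’s a minimal sketch of how such a template-consistent flux could be combined with the difference-image forced photometry to recover per-epoch total fluxes. The column name psTemplateFlux is hypothetical (it stands in for the quantity proposed above, not an existing DPDD column), and objectId and ccdVisitId are assumed to be the join and visit keys:

```sql
-- Hypothetical sketch: psTemplateFlux is the proposed template-consistent PSF
-- flux (not an existing DPDD column); objectId / ccdVisitId are assumed keys.
-- Per-epoch total flux = "DC" template flux + "AC" difference flux.
SELECT fs.ccdVisitId,
       o.psTemplateFlux + fs.psDiffFlux AS totalEpochFlux
FROM   Object o
JOIN   ForcedSource fs ON fs.objectId = o.objectId
WHERE  o.objectId = 12345;  -- placeholder objectId
```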

I think your question also gets at how we’d separate the low-luminosity AGN from the extended source in which it’s embedded. As you said, forced photometry on difference images doesn’t have that problem, but any way to measure the DC flux would. That means it’s up to how well we can deblend that pair, and that will depend a lot on the relative fluxes of the components and how extended the host galaxy is. It also requires that we first know that there are in fact two components. If the AGN variability and flux are such that it’s detected on difference images and the host galaxy is clearly extended, that should be enough to guarantee that we deblend and measure two components. There will probably be marginal cases where we don’t identify the components as distinct, but it would still be possible to detect the presence of the AGN by looking for statistical variability in the forced photometry; we currently have no plans to do that as part of the data release processing, but it’d be a good candidate for user-directed image processing in the Science Platform.
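As a rough illustration of that kind of user-directed check, one could compute a simple variability statistic on the difference-image forced photometry, something like the sketch below. This is only a sketch: psDiffFluxSigma is an assumed name for the per-epoch flux uncertainty, the threshold is arbitrary, and a real analysis would also have to account for correlated systematics in the difference images.

```sql
-- Sketch: flag Objects whose difference-image forced photometry is
-- statistically inconsistent with zero (i.e., shows variability).
SELECT objectId,
       COUNT(*) AS nEpochs,
       SUM(POWER(psDiffFlux / psDiffFluxSigma, 2)) / COUNT(*) AS reducedChi2
FROM   ForcedSource
GROUP  BY objectId
HAVING SUM(POWER(psDiffFlux / psDiffFluxSigma, 2)) / COUNT(*) > 3.0;  -- arbitrary cut
```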

Before I start editing the DPDD, I’d like to clarify:

  • are we talking here about Object or DIAObject?
  • assuming DIAObject, which would make more sense to me, we have
    fpFluxMean in the DIAObject table, which is the mean DC level
    (btw, fpFluxMean is the mean of totFlux from the DIASource table, and
    will be renamed to totFluxMean in the next version of the DPDD)
  • if Object, we have psFlux in the Object table

Gordon, can you please clarify?

Hi Zeljko, my interest right now is how to plan for LSST in the context of the annual Data Releases (not the “Prompt Products”), given that AGN SC users may naively consider LSST to be similar to SDSS. That’s not quite the case, as some of the data products are very different. It is important for us to understand this now, both to be prepared for LSST data when they arrive and to build training sets that “look like” LSST data to the extent possible.

This is really just the first of what is potentially a series of related questions that can be generically described as: “What SQL code do I need in order to extract all of the relevant ‘attributes’ for application of a Level-3 multi-dimensional AGN classification algorithm?”

Let’s say that one attribute is the light curve for a potential AGN candidate at the time of DR1. To me the following describes what seems like the optimal choice for the light curve: “Forced photometry will be performed on all visits, for all Objects, using both direct images and difference images. The measured fluxes will be stored in the ForcedSources table. Due to space constraints, we only plan to measure the PSF flux.”

Specifically, I’d treat ForcedSource.psDiffFlux (Table 6 of LSE-163) as “the light curve”. But if I want to know something about the light curve relative to the flux of the object (and not just the differential flux), I need to decide which measurement is going to represent “the flux”. I don’t think any parameter in the LSE-163 tables is quite right (what I’d ideally want is, e.g., the flux of the object in the deep template after the host galaxy component has been deblended out), but it seems to me that the closest thing is Object.psFlux (Table 4 in LSE-163). So I’m asking: am I interpreting the data model correctly, or is there a better way to derive “the light curve” from the Data Release data model?
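For concreteness, the kind of query I have in mind looks something like the sketch below. The table and column names are taken from LSE-163, but the objectId join key and the ccdVisitId column are my assumptions about the schema:

```sql
-- Sketch: the difference-image forced-photometry light curve for one Object,
-- with Object.psFlux carried along as the candidate "flux of the object".
SELECT fs.ccdVisitId,
       fs.psDiffFlux,   -- "the light curve" (variable component per visit)
       o.psFlux         -- candidate "the flux" (deep PSF measurement)
FROM   ForcedSource fs
JOIN   Object o ON o.objectId = fs.objectId
WHERE  o.objectId = 12345;  -- placeholder objectId
```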

(Edited: fpFluxMean is a mean of difference image forced photometry, not a mean of direct forced photometry, so what I wrote previously was incorrect.)

Looking more closely at how fpFluxMean is defined, I think @ivezic is right that fpFluxMean would provide the offset between Object.psFlux and the template PSF flux I was proposing, as long as the epochs that go into that mean are the same as the epochs that are used to measure Object.psFlux (which should probably be the case in DRP). I think that means the only action on the DPDD is to document that it can be used for that purpose, and to make sure that it is a mean of difference image forced photometry in DRP (this is clear in the Prompt case, where the only forced photometry is on difference images, but in DRP our current plan is to produce both).
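Spelling out the bookkeeping, under the assumption above that fpFluxMean averages the difference-image forced photometry over the same epochs used for Object.psFlux (the notation is mine, not DPDD symbols):

```latex
\begin{align*}
F_{\mathrm{template}} &\approx \texttt{Object.psFlux} - \texttt{fpFluxMean},\\
F_{\mathrm{total}}(t) &\approx F_{\mathrm{template}} + \texttt{ForcedSource.psDiffFlux}(t).
\end{align*}
```

That is, the total per-epoch flux of the deblended point-source component could be reconstructed from these quantities, provided the epoch sets match and fpFluxMean is defined this way in DRP.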

The purpose of this post is to hopefully clarify, with references, the various LSST data products that are relevant to this discussion, outline the cases where User Generated data products might be necessary, and identify which components of the LSST pipelines might be used to create them. The following paragraphs have been written with the novice LSST user in mind.

To reiterate, the question is how to recreate the calibrated light curve of the central engine of a low-luminosity AGN that is embedded in a host galaxy. In other words, which LSST data products deliver the flux of the AGN (quiescent + variable components) as a function of time, with no contamination from the extended host galaxy?

First, let us summarize the most relevant parameters in the difference image analysis (DIA) catalogs DIASource and DIAObject, which are measured from the single-visit and difference images, and explain why they are inadequate for this goal. From Table 1 in Section 3.3.1 of the Data Products Definition Document (DPDD; ls.st/lse-163):

DIASource.psFlux = calibrated flux for a point source model; this measures the flux difference between the template and the visit image (i.e., the variable component of the AGN only), and

DIASource.totFlux = calibrated flux for a point source model measured on the visit image, centered at the centroid measured on the difference image (this combines the underlying host with the quiescent and variable components of the AGN).

These two data products cannot yield the quiescent + variable component of the AGN alone. Getting that would require the DIA pipeline to deblend the quiescent point source and the extended host components in the template image, using the point source model from the detected variable component, and to store this information with the DIASource. This type of processing is not included in the DIA pipeline.
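To make that explicit, here is the blending algebra in schematic form. The notation is mine, not DPDD symbols: F_host is the host-galaxy light blended into the point-source fit, F_AGN(t) is the quiescent + variable AGN flux at epoch t, and F_AGN,tmpl is the AGN flux in the template coadd.

```latex
\begin{align*}
\texttt{DIASource.totFlux}(t) &\approx F_{\mathrm{host}} + F_{\mathrm{AGN}}(t),\\
\texttt{DIASource.psFlux}(t)  &\approx F_{\mathrm{AGN}}(t) - F_{\mathrm{AGN,tmpl}}.
\end{align*}
```

Isolating F_AGN(t) requires either subtracting F_host from the first relation or adding F_AGN,tmpl back to the second, and neither quantity is stored with the DIASource.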

Second, to summarize the relevant parameters in the Data Release Object and Source catalogs (Tables 4 and 5 of the DPDD), which are measured from the single visit and the deep coadd images (note that the deep coadd images are not the same as the template images). As described in S.4.1 of the DPDD, “The master list of Objects in the Data Release will be generated by associating and deblending the list of single-epoch DIASource detections and the lists of sources detected on coadds”. This means that the AGN detected in a difference image will also be in the Object catalog.

Furthermore, “…to enable studies of variability, the fluxes of all Objects will be measured on individual visits (using both direct and difference images), with their shape parameters and deblending resolutions kept constant. This process is known as forced photometry” (DPDD, S.4.1). Forced photometry is described in S.4.2.4 of the DPDD as: “the measurement of flux in individual visits, given a fixed position, shape, and the deblending parameters of an object. It enables the study of time variability of an object’s flux, irrespective of whether the flux in any given individual visit is above or below the single-visit detection threshold.” For an AGN light curve, we should look to the parameters in the ForcedSource catalog.

As Gordon identified in his Community post on this thread on Feb 26,

ForcedSource.psDiffFlux = point source model flux on difference image

should be equivalent to the DIASource.psFlux of the AGN, so long as two conditions are met: (1) the same template image was used, and (2) the centroid determined from the coadd and used for forced photometry is the same as the centroid measured on the difference image. The first condition should be true for a given data release, but the second may not hold for AGN components with a low signal-to-noise ratio. Although ForcedSource.psDiffFlux conceptually measures the same quantity as DIASource.psFlux, it is potentially much better for non-moving Objects because it uses detections and positions obtained from all epochs, not just the measurement epoch.

However, as discussed in the first few paragraphs, these measurements from the difference image are insufficient for the goal of obtaining the AGN flux. Instead, we turn to:

ForcedSource.psFlux = point source model flux on direct image.

Since the point source model is the deblended child, this is the flux of the AGN component only, without the host galaxy contribution, in a single-visit image. Therefore, this is the measurement to use to create the light curve of the AGN (a query sketch follows), but there are caveats for our particular situation of a low-luminosity AGN.
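In query form, the baseline light curve would then be something like the sketch below (table and column names from LSE-163; the objectId join key and the ccdVisitId column are my assumptions about the schema):

```sql
-- Sketch: direct-image forced photometry of the deblended point-source (AGN)
-- component for a single Object, one row per visit.
SELECT fs.ccdVisitId,
       fs.psFlux,       -- AGN component only (host deblended out), per visit
       fs.psDiffFlux    -- variable component relative to the template, for comparison
FROM   ForcedSource fs
WHERE  fs.objectId = 12345;  -- placeholder objectId
```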

As Jim Bosch notes in his Community post on this thread from Feb 21, “it’s up to how well we can deblend that pair, and that will depend a lot on the relative fluxes of the components and how extended the host galaxy is”. It is also important to note that the deblending is done - and the point source model defined - based on the coadd image, which might not provide the optimal results for low-luminosity AGN. Depending on the science goals, the conclusion is that the ForcedSource data products alone might not be sufficient for studies of embedded AGN below some luminosity ratio threshold. All or some low-luminosity AGN may require a User-Generated pipeline to, e.g., stack all images from epochs when the AGN was brightest, redefine the point source model, rerun the deblender, and make new measurements of ForcedSource.psFlux.