ApTemplate missing finalVisitSummary dataset type

ess20clsi · June 26, 2024, 8:35pm

Hi all,

I am currently trying to run the AP Pipeline on some HSC data, with the ultimate goal of doing image differencing. When I try to run the ApTemplate.yaml pipeline, I get the following error:

lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
Traceback (most recent call last):
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 199, in run
    if (qgraph := script.qgraph(pipelineObj=pipeline, **kwargs, show=show)) is None:
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cli/script/qgraph.py", line 210, in qgraph
    qgraph = f.makeGraph(pipelineObj, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 622, in makeGraph
    qgraph = graphBuilder.makeGraph(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/pipe_base/g8798d61f7d+6612571a14/python/lsst/pipe/base/graphBuilder.py", line 1840, in makeGraph
    return scaffolding.makeQuantumGraph(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/pipe_base/g8798d61f7d+6612571a14/python/lsst/pipe/base/graphBuilder.py", line 1651, in makeQuantumGraph
    registryDatasetTypes=self._get_registry_dataset_types(registry),
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/pipe_base/g8798d61f7d+6612571a14/python/lsst/pipe/base/graphBuilder.py", line 1682, in _get_registry_dataset_types
    raise MissingDatasetTypeError(f"Registry is missing an input dataset type {dstype}")
lsst.daf.butler.registry._exceptions.MissingDatasetTypeError: "Registry is missing an input dataset type DatasetType('finalVisitSummary', {band, instrument, physical_filter, visit}, ExposureCatalog)"

This post from a few days ago suggests running updateVisitSummary before running makeWarp, but I don’t really know how that fits into the pipeline. I’ve tried running everything up to consolidateVisitSummary, followed by the updateVisitSummary task, with the same “No global calibration during nightly validation” configs suggested in the other post, and I get the following:

lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
Traceback (most recent call last):
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 199, in run
    if (qgraph := script.qgraph(pipelineObj=pipeline, **kwargs, show=show)) is None:
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cli/script/qgraph.py", line 210, in qgraph
    qgraph = f.makeGraph(pipelineObj, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/ctrl_mpexec/g218a3a8f53+ca4789321c/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 622, in makeGraph
    qgraph = graphBuilder.makeGraph(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/pipe_base/g8798d61f7d+6612571a14/python/lsst/pipe/base/graphBuilder.py", line 1831, in makeGraph
    scaffolding.resolveDatasetRefs(
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-7.0.1/Linux64/pipe_base/g8798d61f7d+6612571a14/python/lsst/pipe/base/graphBuilder.py", line 1357, in resolveDatasetRefs
    raise RuntimeError(
RuntimeError: Dataset 'visitSummary_schema' (with no dimensions) could not be found in collections ('u/NH/processCCD/20240622T202946Z', 'HSC/raw/all', 'refcats/gen2', 'HSC/calib/run1', 'HSC/calib', 'u/NH/visitSummary/20240626T195758Z').

However, the visitSummary_schema dataset definitely exists within the u/NH/visitSummary/20240626T195758Z collection. If anyone has any ideas on how to solve the issue, I would really appreciate the help! For reference, I am running v26 of the software.

sfu · June 29, 2024, 1:30am

Hi, perhaps you can take a look at Lee’s notes, “Merian Data Processing Using the LSST Science Pipelines” (HackMD). The pipeline figures there helped me understand how the tasks and datasets are connected.

rbliu · June 29, 2024, 4:52pm

Just a caveat about the Merian Survey Tutorial: updateVisitSummary seems to be missing from its step 2d. I had to add that task before the following steps would run successfully.
The tasks should run in this order:

Quanta            Tasks
------ ----------------------------
     2     finalizeCharacterization
     2           updateVisitSummary
    18 writeRecalibratedSourceTable
    18         transformSourceTable
     2       consolidateSourceTable

So I guess you could try rerunning these tasks and making sure that updateVisitSummary completes properly. As for the visitSummary_schema error, I have no idea either. I got everything working after adding the following configs to the pipeline YAML file:

tasks:
  updateVisitSummary:
    class: lsst.drp.tasks.update_visit_summary.UpdateVisitSummaryTask
    config:
      wcs_provider: "input_summary"
      photo_calib_provider: "input_summary"
      background_provider: "input_summary"

galaxyumi331 · July 1, 2024, 5:39pm

Hi @ess20clsi,

Have you had a chance to try suggestions by @sfu and @rbliu to see if these resolve your problem?

ess20clsi · July 4, 2024, 7:17pm

Hi all,

Sorry for taking so long to get back to this. Based on the quantum graphs in the Merian notes (“Merian Data Processing Using the LSST Science Pipelines” on HackMD) and on @rbliu’s response, I tried adding the finalizeCharacterization task to my pipeline, as it looks like its output dataset is needed to run updateVisitSummary. Unfortunately, I now get the following:

RuntimeError: 1 dataset(s) of type 'src_schema' was/were present in a previous query, but could not be found now. This is either a logic bug in QuantumGraph generation or the input collections have been modified since QuantumGraph generation began.

Does anyone have any other ideas?

kfindeisen · July 10, 2024, 9:50pm

Sorry for not seeing this sooner.

First, note that RFC-1018 deprecates ApTemplate in favor of the DRP pipelines, so you may wish to use those to make your templates. Most of the above discussion is about the DRP pipelines anyway, and does not apply to ApTemplate (which was written using different assumptions about what processing is needed).

If you do wish to continue with ApTemplate, the original bug you reported was fixed in w_2023_38, so I recommend using that version or a later weekly (it’s not fixed in 26.0.2, and release 27 is not quite ready yet).

If you can’t switch to a newer version, you might be able to patch your pipeline by setting makeWarp.config.doApplyFinalizedPsf to False. But there have been a lot of changes to the pipeline since release 26, so I can’t promise that that’s the only change needed.
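A minimal sketch of that patch as a pipeline override file, for anyone who prefers to keep it in YAML. Only the doApplyFinalizedPsf setting comes from the suggestion above. The import location is an assumption (adjust it to wherever ApTemplate.yaml lives in your ap_pipe install), and the class path is assumed to match the makeWarp task shipped with release 26.

```yaml
# Sketch only: wraps the stock ApTemplate pipeline and overrides one config.
description: ApTemplate with the finalized PSF disabled in makeWarp
imports:
  # Assumed location; check the pipelines directory of your ap_pipe install.
  - location: $AP_PIPE_DIR/pipelines/HSC/ApTemplate.yaml
tasks:
  makeWarp:
    # Assumed class path; it must match the class used by the imported pipeline.
    class: lsst.pipe.tasks.makeWarp.MakeWarpTask
    config:
      # Do not require the finalVisitSummary input when warping.
      doApplyFinalizedPsf: false
```

The same override can likely be passed on the pipetask command line as -c makeWarp:doApplyFinalizedPsf=False instead of editing a file.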

ess20clsi · July 12, 2024, 6:33pm

Thank you @kfindeisen, that is very helpful. I don’t need to continue with ApTemplate. In fact, it has been causing me other issues, so a different pipeline sounds perfect.

I am trying to do difference imaging on some HSC data, and the AP pipeline seemed like the best way to do that, but is there a different or better way? I am very new to the LSST Science Pipelines and have been struggling to figure out which tasks I should run in order to get a difference image. Do you have any suggestions?

kfindeisen · July 12, 2024, 7:21pm

The AP pipeline itself (ApPipe.yaml) is a perfectly fine way to do difference imaging, and should work* (we run it on HSC data on a regular basis). The problem is getting the templates it needs (by default, goodSeeing coadds) in the first place.

Unfortunately, I’ve never used the DRP pipelines myself, so I can’t offer any advice re: which of its steps are needed. @mrawls or @elhoward might be able to help you there.

*Looking ahead, the one catch for ApPipe is that its source association code assumes access to a scratch database (APDB). Depending on whether you care about creating DiaObjects or whether you just want difference images, you would need to either:

- create an SQLite APDB as described in the ap_pipe docs [link is for release 26; recent builds use a different procedure]
- exclude the diaPipe task from ApPipe, but ensure you run retrieveTemplate, subtractImages, and possibly detectAndMeasure
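The second option can be sketched as a small pipeline file that imports ApPipe and drops the association step. This is an untested sketch: the import location is an assumption, and diaPipe is assumed to be the task label used in your version of ApPipe.yaml.

```yaml
# Sketch only: ApPipe without source association, so no APDB is required.
description: ApPipe with the diaPipe task excluded
imports:
  # Assumed location; check the pipelines directory of your ap_pipe install.
  - location: $AP_PIPE_DIR/pipelines/HSC/ApPipe.yaml
    exclude:
      # Assumed task label for the association step.
      - diaPipe
```

With diaPipe excluded, the remaining tasks, including retrieveTemplate, subtractImages, and detectAndMeasure, stay in the pipeline.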

sgreenstreet · July 15, 2024, 9:47pm

Hi @ess20clsi - I wanted to follow up to see whether @kfindeisen’s response provided you with the solution you were looking for.

ess20clsi · July 15, 2024, 11:35pm

Thank you so much, @kfindeisen! I really appreciate all your help. Ideally, I wouldn’t run ALL of the DRP pipeline steps, as I don’t need many of the outputs it provides and I don’t want to spend time on steps I don’t need. As for ApPipe, I don’t need to create DiaObjects so I think it will be smooth sailing once I can create those templates.

If anyone knows which steps/tasks in the DRP pipeline are needed for template construction, I would really appreciate the advice. In the meantime, @sgreenstreet, we can go ahead and mark @kfindeisen’s earlier response as the solution.

parejkoj · July 18, 2024, 8:52pm

@laurenam, @yusra, and @lee could all offer suggestions about which steps of DRP to run to make the coadds you need.

sgreenstreet · July 18, 2024, 9:10pm

Hi @ess20clsi - I also wanted to mention that DP0.2 tutorial notebook 03c demonstrates how to use the GetTemplateTask to create a single cutout image with contributions from multiple adjacent patches and tracts. In the LSST Science Pipelines, GetTemplateTask is used to create a template image from deepCoadd images for a given processed visit image, in order to perform Difference Image Analysis. This notebook may be a good resource for the steps needed for the template construction you’d like to do.