How to run ProcessCcdTask via runDataRef

danjampro · May 4, 2021, 1:16am

Hello,
I would like to run ProcessCcdTask for a single dataId obtained through a Butler instance. I do not want to use parseAndRun or parse any arguments as if I were using the command line. A naive attempt looks like this:

ref = butler.dataRef(datasetType="raw", dataId=dataId)
task.runDataRef(ref)

but this does not work as the code is unable to find the required calibs (note that it works fine using the command line task parseAndRun approach). Can anyone advise?

Many thanks,
Dan Prole,
Postdoctoral Researcher, Macquarie University

price · May 4, 2021, 5:28pm

You need to instantiate the butler with the path to the calibrations repo as a parameter. E.g.:

butler = Butler("/path/to/data", calibRoot="/path/to/calibs")

danjampro · May 11, 2021, 5:05am

Hi Paul,

Thanks for your response. This has led me to another error:

from lsst.pipe.tasks.processCcd import ProcessCcdTask
task = ProcessCcdTask()
task.runDataRef(ref)

which results in this error:

AttributeError: 'HuntsmanMapper' object has no attribute 'map_defects'

This leads me to suspect that the config has not been loaded properly since the ISR config file contains doDefect = False. This confuses me since it works fine when using parseAndRun without explicitly providing the config file.

I could explicitly load the config from file and pass it as an argument when initialising the task, but I would expect the task to be able to do that itself. Am I missing something?

Thanks.

danjampro · May 11, 2021, 5:15am

Hi again,

I believe I have answered my own question, although I do not particularly like the answer!

It seems the obs package config overrides are applied by the ArgumentParser:

https://github.com/lsst/pipe_base/blob/1f131b1d31ce7052cc30babe737b70489cb85257/python/lsst/pipe/base/argumentParser.py#L810


            try:
                dataIdContainer.castDataIds(butler=namespace.butler)
            except (KeyError, TypeError) as e:
                # failure of castDataIds indicates invalid command args
                self.error(e)

            # failure of makeDataRefList indicates a bug
            # that wants a traceback
            dataIdContainer.makeDataRefList(namespace)

    def _applyInitialOverrides(self, namespace):
        """Apply obs-package-specific and camera-specific config
        override files, if found

        Parameters
        ----------
        namespace : `argparse.Namespace`
            Parsed namespace. These attributes are read:

            - ``obsPkg``

This basically means one is forced either to apply the config overrides manually or to use the CmdLineTask paradigm, which makes it quite awkward to use the underlying task functionality. Is there an alternative?

ktl · May 11, 2021, 5:22am

You are correct that the soon-to-be-deprecated Gen2 middleware implements obs_ package overrides in the ArgumentParser, so calling the underlying task directly requires manual synthesis of the appropriate configuration.
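
A minimal sketch of assembling that configuration by hand, assuming the Gen2 lookup checks `<obs package>/config/<task name>.py` and then `<obs package>/config/<camera>/<task name>.py`, as `_applyInitialOverrides` appears to do. The helper names `find_override_files` and `apply_overrides` are hypothetical, and the `obs_huntsman` package name in the usage comment is a guess based on the mapper name in the error above.

```python
import os


def find_override_files(obs_pkg_dir, task_name, camera=None):
    """Return the override files that exist for a task, in load order.

    Assumed layout: <obs_pkg_dir>/config/<task_name>.py, followed by the
    camera-specific <obs_pkg_dir>/config/<camera>/<task_name>.py.
    """
    file_name = task_name + ".py"
    candidates = [os.path.join(obs_pkg_dir, "config", file_name)]
    if camera is not None:
        candidates.append(os.path.join(obs_pkg_dir, "config", camera, file_name))
    return [path for path in candidates if os.path.exists(path)]


def apply_overrides(config, obs_pkg_dir, task_name, camera=None):
    """Load each override file into config and return the paths loaded."""
    paths = find_override_files(obs_pkg_dir, task_name, camera)
    for path in paths:
        # Any object with a load(path) method works here; in the stack this
        # would be an lsst.pex.config.Config instance.
        config.load(path)
    return paths


# Hypothetical usage inside an LSST stack environment (not executed here):
#
#   import lsst.utils
#   from lsst.daf.persistence import Butler
#   from lsst.pipe.tasks.processCcd import ProcessCcdTask
#
#   config = ProcessCcdTask.ConfigClass()
#   obs_dir = lsst.utils.getPackageDir("obs_huntsman")  # package name assumed
#   apply_overrides(config, obs_dir, ProcessCcdTask._DefaultName)
#   task = ProcessCcdTask(config=config)
#   butler = Butler("/path/to/data", calibRoot="/path/to/calibs")
#   task.runDataRef(butler.dataRef(datasetType="raw", dataId=dataId))
```

The helpers only rely on `config.load(path)`, so they can be tried out with a stand-in object before pointing them at a real task config.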

I’ll let someone else describe how this changes (or doesn’t) in Gen3.

timj · June 24, 2021, 10:28pm

In Gen3, instrument config overrides are applied by the instrument class, and the method that does so can be overridden by subclasses:

https://github.com/lsst/obs_base/blob/master/python/lsst/obs/base/_instrument.py#L332-L346

    def applyConfigOverrides(self, name, config):
        """Apply instrument-specific overrides for a task config.

        Parameters
        ----------
        name : `str`
            Name of the object being configured; typically the _DefaultName
            of a Task.
        config : `lsst.pex.config.Config`
            Config instance to which overrides should be applied.
        """
        for root in self.configPaths:
            path = os.path.join(root, f"{name}.py")
            if os.path.exists(path):
                config.load(path)
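
To see how that loop behaves when a subclass adds its own config directory, here is a self-contained sketch. `SketchConfig` and `SketchInstrument` are stand-ins written for this example, not the real `lsst.pex.config.Config` or `lsst.obs.base.Instrument`; the `isr` name and the `doBias` value are illustrative, and the stand-in `load` simply executes the file with `config` bound to the object, which is an assumption about how override files are written.

```python
import os
import tempfile


class SketchConfig:
    """Stand-in config: load() runs a file with `config` bound to this object."""

    def load(self, path):
        with open(path) as f:
            exec(compile(f.read(), path, "exec"), {"config": self})


class SketchInstrument:
    """Stand-in instrument that copies the applyConfigOverrides loop above."""

    configPaths = ()

    def applyConfigOverrides(self, name, config):
        for root in self.configPaths:
            path = os.path.join(root, f"{name}.py")
            if os.path.exists(path):
                config.load(path)


def demo():
    """Apply overrides from two directories and return the resulting values."""
    with tempfile.TemporaryDirectory() as base_dir, \
            tempfile.TemporaryDirectory() as site_dir:
        with open(os.path.join(base_dir, "isr.py"), "w") as f:
            f.write("config.doDefect = False\nconfig.doBias = True\n")
        with open(os.path.join(site_dir, "isr.py"), "w") as f:
            f.write("config.doBias = False\n")

        class BaseInstrument(SketchInstrument):
            configPaths = (base_dir,)

        class SiteInstrument(BaseInstrument):
            # The subclass appends a directory; its files load last.
            configPaths = BaseInstrument.configPaths + (site_dir,)

        config = SketchConfig()
        SiteInstrument().applyConfigOverrides("isr", config)
        return config.doDefect, config.doBias
```

Because the directories are visited in order, a setting in a later directory replaces the value set by an earlier one, while settings it does not mention are left alone.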

Gen3 changes pipeline execution completely: pipelines are now defined in YAML and run by standard executors, and command line tasks no longer exist. Gen3 also has a completely rewritten butler, and obs packages are now more standardized. New instruments are very easy to add; the main task is to write a metadata translator (as defined by the lsst/astro_metadata_translator package on GitHub, which provides the observation metadata handling infrastructure).

danjampro · June 24, 2021, 11:59pm

Hi @timj, thanks for the information. I look forward to using gen3! I have two questions:

Where can I find more information / documentation on the gen3 changes?
When will gen3 be officially rolled out?

Thanks!

timj · June 25, 2021, 3:08am

It depends on what you mean by “official”. The v22 release coming out this week or next is going to be pretty solid, but if you are using gen3 for real you are better off using whatever the newest weekly happens to be (w_2021_26 at the moment).

We are working on the pipelines.lsst.io tutorials and may even have something out next week. The data preview documentation will also cover the gen3 system extensively and should be published next week.

There are also community documents such as the one here.