Recent changes to science pipelines logging

In the past week or so there have been some major changes to how we deal with logging in the science pipelines software.

pipe.base.Task Logging

Following RFC-782, the logger directly attached to a Task is now a Python logger and no longer an lsst.log.Log logger. The actual class used is derived from logging.LoggerAdapter and has some compatibility methods available. In particular:

  • log.trace() can be used as with lsst.log.
  • A new method log.verbose() has been added that can issue log messages between DEBUG and INFO. This can be used when you want to give more information without swamping a user with debugging information. (see RFC-245 for more discussion on this).
  • The logging level constants are available, so task.log.INFO still works as it did before.
  • There is a context manager attached to a logger (temporary_log_level) that allows the level to be temporarily modified. This should be used instead of lsst.log.utils.temporaryLogLevel.
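For code outside the stack, the same effect can be achieved with a small standard-library context manager. This is a minimal sketch of the idea only, not the actual temporary_log_level implementation:

```python
import logging
from contextlib import contextmanager

@contextmanager
def temp_log_level(logger: logging.Logger, level: int):
    """Temporarily change a logger's level, restoring it on exit."""
    old_level = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(old_level)

log = logging.getLogger("example")
log.setLevel(logging.INFO)
with temp_log_level(log, logging.DEBUG):
    assert log.isEnabledFor(logging.DEBUG)  # DEBUG enabled inside the block
assert not log.isEnabledFor(logging.DEBUG)  # original level restored
```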

This transition comes with some deprecations, and you may well see additional warnings coming out of the code:

  • log.getName() and log.getLevel() should be replaced by the standard .name and .level properties (these were added to lsst.log.Log last week to simplify the transition).
  • log.isDebugEnabled() should be replaced by log.isEnabledFor().
  • Python logging deprecated log.warn(), so all code should transition to log.warning().
  • The infof(), warningf(), and related methods are also now deprecated.
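These replacements map directly onto the standard logging API. A quick illustration, using a plain Python logger:

```python
import logging

log = logging.getLogger("lsst.example")
log.setLevel(logging.INFO)

# Deprecated accessor      -> standard replacement
# log.getName()            -> log.name
# log.getLevel()           -> log.level
# log.isDebugEnabled()     -> log.isEnabledFor(logging.DEBUG)
# log.warn("...")          -> log.warning("...")
assert log.name == "lsst.example"
assert log.level == logging.INFO
assert not log.isEnabledFor(logging.DEBUG)
log.warning("calibration missing for detector %d", 42)
```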

One immediately visible effect of this change is that log messages are now captured by pytest automatically and reported separately from standard error and standard output.

Gen2 Log Output

Even though Task now uses Python logging, CmdLineTask is still configured to forward all Python log messages to lsst.log. Gen2 packages such as daf_persistence and the derived Mapper classes still use lsst.log, and there are no plans to touch those.

Adding logging to python code

For anything not related to Gen2, the guidance is now to use Python logging by default in Python code. If you want to make use of trace and verbose levels outside of a Task, you can use lsst.pipe.base.getTaskLogger() to create the logger.

Gen 3 log output

For butler and pipetask command lines all log output is now handled by Python logging.

There are two new command line options:

  • --log-file can be used to record all log messages to a file. If the file has the extension .json the log file will contain JSON records; otherwise it will contain formatted log strings. This option can be used multiple times. The log strings will match whatever format was being used and respect --long-log, whereas the JSON records will contain all the information.
  • --no-log-tty can be used to disable log messages sent to the terminal. This is particularly useful if you are sending output to a separate file (and you can always use tail -f on that file to see what’s going on).
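The JSON variant of --log-file lends itself to post-processing. As a hedged sketch, assuming the file holds one JSON object per line with a levelname field (check the actual record schema your pipeline version writes):

```python
import json

# Simulated contents of a hypothetical JSON log file: one record per line.
sample = "\n".join(
    json.dumps({"levelname": lvl, "message": msg})
    for lvl, msg in [("INFO", "start"), ("DEBUG", "detail"), ("INFO", "done")]
)

# Parse each line and keep only the INFO records.
records = [json.loads(line) for line in sample.splitlines()]
info_only = [r for r in records if r["levelname"] == "INFO"]
```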

One major change is that pipetask run now records the log output from each quantum in the butler itself, along with the task metadata. For example, from the pipelines_check output:

>>> from lsst.daf.butler import Butler
>>> butler = Butler("DATA_REPO")
>>> log = butler.get("isr_log", instrument="HSC", detector="1_36", exposure=903342, collections="demo_collection")
>>> type(log)
<class 'lsst.daf.butler.core.logging.ButlerLogRecords'>
>>> print(log[:5])
WARNING 2021-07-28T16:40:22.463971+00:00 afw.image.MaskedImageFitsReader MaskedImageFitsReader.cc:94 - Expected extension type not found: IMAGE
WARNING 2021-07-28T16:40:24.793796+00:00 afw.image.MaskedImageFitsReader MaskedImageFitsReader.cc:94 - Expected extension type not found: IMAGE
WARNING 2021-07-28T16:40:24.929224+00:00 isr isrTask.py:1003 - Constructing linearizer from cameraGeom information.
DEBUG 2021-07-28T16:40:24.929631+00:00 timer.isr timer.py:73 - runStartCpuTime=8.120303; runStartUserTime=5.977052; runStartSystemTime=2.143258; runStartMaxResidentSetSize=781852672; runStartMinorPageFaults=193407; runStartMajorPageFaults=1910; runStartBlockInputs=0; runStartBlockOutputs=0; runStartVoluntaryContextSwitches=3050; runStartInvoluntaryContextSwitches=5157
INFO 2021-07-28T16:40:24.929917+00:00 isr isrTask.py:1410 - Converting exposure to floating point values.

The lsst.daf.butler.ButlerLogRecords class acts like a collection of log records and can be indexed. The format string can also be changed in order to display different information when the records are printed.

We don’t provide any methods for filtering the records once they have been retrieved, although it’s relatively straightforward to implement your own filter:

>>> filtered = type(log).from_records([r for r in log if r.levelname == "DEBUG"])
>>> len(filtered)
4
>>> print(filtered)
DEBUG 2021-07-28T16:40:24.929631+00:00 timer.isr timer.py:73 - runStartCpuTime=8.120303; runStartUserTime=5.977052; runStartSystemTime=2.143258; runStartMaxResidentSetSize=781852672; runStartMinorPageFaults=193407; runStartMajorPageFaults=1910; runStartBlockInputs=0; runStartBlockOutputs=0; runStartVoluntaryContextSwitches=3050; runStartInvoluntaryContextSwitches=5157
DEBUG 2021-07-28T16:40:36.993596+00:00 timer.isr.fringe timer.py:73 - runStartCpuTime=20.013078; runStartUserTime=17.349074; runStartSystemTime=2.6640129999999997; runStartMaxResidentSetSize=2110672896; runStartMinorPageFaults=517525; runStartMajorPageFaults=2218; runStartBlockInputs=0; runStartBlockOutputs=0; runStartVoluntaryContextSwitches=3053; runStartInvoluntaryContextSwitches=14394
DEBUG 2021-07-28T16:40:36.994521+00:00 timer.isr.fringe timer.py:73 - runEndCpuTime=20.036667; runEndUserTime=17.369633; runEndSystemTime=2.667043; runEndMaxResidentSetSize=2110672896; runEndMinorPageFaults=517525; runEndMajorPageFaults=2218; runEndBlockInputs=0; runEndBlockOutputs=0; runEndVoluntaryContextSwitches=3053; runEndInvoluntaryContextSwitches=14459
DEBUG 2021-07-28T16:40:38.225491+00:00 timer.isr timer.py:73 - runEndCpuTime=21.259907; runEndUserTime=18.539821; runEndSystemTime=2.720103; runEndMaxResidentSetSize=2132619264; runEndMinorPageFaults=522835; runEndMajorPageFaults=2267; runEndBlockInputs=0; runEndBlockOutputs=0; runEndVoluntaryContextSwitches=3053; runEndInvoluntaryContextSwitches=15114

If DEBUG is enabled the logs can get very large: for example, turning on DEBUG in characterizeImage results in half a million log records and about 500 MB. These can still be retrieved, although if this becomes common we may need to add filtering parameters to the butler.get() call.

The expectation is that we will start using these new features to harvest and store logs in a centralized LogStash-like environment to allow easy search and display.


One quick addendum…

The first time you call pipetask after you get this change, you may need to add a --register-dataset-types option. This is because the new *_log dataset types will not exist yet.

Thanks Tim. I am nearing the end of my second pipeline step-through with V22. I plan to repeat it again with V22 and the Gen3 amendments. Your log enhancements are appealing to me, because I’d like to run the pipeline steps, review logging info at detailed levels, and THEN be able to conveniently “find” the java/python function/class/script/method code and explore what the code looks like. So, my hope is that the enhancements to logging embrace this capability. I’m getting better at this process (ratcheting up my java skills), and hoping your enhanced logging helps along the way. Cheers. Fred, Dallas.

This can make debugging tasks in tests difficult, as you don’t see the logs while, e.g., stepping through a debugger. It looks like you can specify --log-cli-level=INFO (or your preferred log level) to get immediate log output to the console when running interactive tests.

Another addition: if you are using LSST logging outside of the PipelineTask framework (e.g. in a stand-alone command-line script) you will need to manually initialize the logging environment in order to get any logs to appear on stdout. This config will get you output that looks like the LSST “short” log format. Put the basicConfig line before you run your code:

import logging
logging.basicConfig(level=logging.INFO, format="{name} {levelname}: {message}", style="{")

I think you might need a stream=sys.stdout to make the logs turn up on standard out. I think the default is stderr.
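Combining the two replies, a stand-alone script could configure both the format and the stream explicitly. This is a sketch of the suggestion above, not an official stack configuration:

```python
import logging
import sys

# Send log output to stdout rather than the default stderr, using the
# "{name} {levelname}: {message}" short format from the reply above.
logging.basicConfig(
    level=logging.INFO,
    format="{name} {levelname}: {message}",
    style="{",
    stream=sys.stdout,
)

logging.getLogger("lsst.example").info("hello")
```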