Pipetask command line interface changes

A few recent commits to ctrl_mpexec, which are now on the master branch and will appear in today’s weekly build, introduce incompatible changes to the command line interface of the pipetask application.

For details, see JIRA tickets DM-21421, DM-21889, and DM-21890; here I’m going to summarize all of those changes:

  • The ordering of sub-commands and options has changed: the sub-command must now be the first argument after pipetask, and all optional arguments follow the sub-command.
  • The list sub-command has been temporarily removed; it will be re-implemented later based on different tooling.
  • The standard way to define pipelines is now via YAML files. The format will be documented later; for now, one can look at the examples in the pipe_tasks package. It is still possible to use the old command line options -t/-c/-C to build pipelines, but YAML should provide a better interface for that. The -p/--pipeline option of the build/qgraph/run sub-commands now takes a YAML definition file.
  • The global option -p, which specified packages for task search, has been removed entirely; tasks should be specified using their fully-qualified names (e.g. lsst.pipe.tasks.calibrate.CalibrateTask).
  • Options -m/--move and -l/--label which modified pre-built pipelines have been removed.
  • A new --instrument option has been added. It specifies the fully-qualified class name of the instrument, which is used for instrument-specific configuration overrides.
  • A few single-letter options have been removed, but their long versions still exist:
    • use --timeout instead of the global -t option (-t is now a sub-command option specifying a task name)
    • use --delete instead of the -d option to remove a task from a pipeline (-d now specifies the user selection expression)
    • use --order-pipeline instead of -o for ordering tasks in a pipeline
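
As a rough illustration of the points above, here is a minimal dry-run sketch of a new-style invocation: sub-command first, a fully-qualified task name passed with -t, and the long --order-pipeline spelling. The helper name new_style_cmd is hypothetical, and this particular combination of options on the build sub-command is an assumption and has not been checked against the current master. The script only prints the command and does not execute pipetask.

```shell
#!/bin/sh
# Dry-run sketch: assemble a new-style pipetask command line and print it.
# Nothing is executed, so this runs without the LSST stack being set up.

# new_style_cmd is a hypothetical helper used only for this illustration.
new_style_cmd() {
    # Fully-qualified task name (the example name from the list above).
    task="lsst.pipe.tasks.calibrate.CalibrateTask"

    # Sub-command ("build") comes first; all options follow it.
    # Assumption: build accepts -t together with --order-pipeline and --show.
    echo "pipetask build -t $task --order-pipeline --show=pipeline"
}

new_style_cmd
```

To actually run it, copy the printed line into a shell where the stack is set up; pipetask build --help should confirm which options the sub-command really accepts.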

A couple of examples of the new command line:

  • Show pipeline composition (this is currently broken, wait until it’s fixed on master):

    pipetask build -p $PIPE_TASKS_DIR/pipelines/DataReleaseProduction.yaml --show=pipeline
    
  • Run a pipeline on some data from a gen3 butler (this should run if the pipeline configuration is correct):

    pipetask run -b $CI_HSC_GEN3_DIR/DATA/butler.yaml -d "visit = 903986" \
      -i shared/ci_hsc_output,calib/hsc -o coll-z -p pipeline.yaml --show=graph
    
  • And if lost:

    pipetask --help
    pipetask run --help
    

It is possible that the interface will change again as we continue to rationalize the multitude of options (and the list sub-command is going to be re-introduced); I hope we can converge on something stable soon.
