Command line tasks are likely to be the first point of contact for many astronomers with the LSST Stack. This means that the experience of using command line tasks is extremely important.
If the command line experience is bad, frustrating, or even unpolished, we’re probably going to lose that astronomer forever. Certainly that astronomer will be reluctant to invest in learning our Python API if we let them believe that our entire stack is badly designed and executed.
With that in mind, I wanted to start a conversation about how we can deliver the best command line experience possible. My comments here are mostly agnostic of the actual architecture. I’m focussing entirely on the look and feel of a command line task. In the title I deliberately used the software hipster term ‘UX’ (meaning user experience) since I believe that we should treat the medium of the command line with the same reverence as tech companies treat iPhone screens or browsers.
We’ll know we’ve succeeded in designing the command line task experience when we can give a demo and hear the audience mutter “whoa, that’s cool!” This is the design we should strive for.
I also want to disclaim two things:
- I mean no offence to those whose existing code I might be claiming to be anti-patterns. I just want to help make things better.
- I know these suggestions are outside the scope of the current SuperTask design. I think it’s worth starting this discussion now, though, to ensure that our overall task roadmap takes UX into consideration.
Some issues with tasks
Task names aren’t always coherent
One of the first things that struck me about our command line tasks is that they look messy. See the task list in the pipe_tasks bin directory. And by messy, I mean that the names and verbs of the tasks don’t present a coherent vocabulary. To me, command line tasks look like an after-thought.
Command line vocabularies can be beautiful. For example, the vocabulary for vagrant:
vagrant box
vagrant init
vagrant up
vagrant connect
vagrant suspend
With a controlled vocabulary like this, the vagrant app suddenly looks simple and knowable.
For the stack, it’s unclear what many tasks do from their name alone. dumpTaskMetadata.py is self-described in its own docstring as a tool to
Select images and report which tracts and patches they are in
I would never have guessed that. Unspecific names also make docs harder to read, because the user can’t find the expected keywords while scanning the table of contents.
Task documentation is lacking
Even if the user has found the right task, we have the problem of documentation. We fundamentally need all tasks to be comprehensively documented in task docstrings and rendered to the LSST Stack Handbook.
But even then tasks are challenging to document because they are so configurable. It’s possible for sub-tasks to be redirected. Thus any ‘static’ documentation can be contradicted by task redirection done by the user.
A vision of the command line experience
Here I present a vision of what our command line experience could be like.
The lsst command
When we tell a new user about LSST’s task pipeline we tell them one thing: “check out the lsst app.”
All tasks are namespaced into subcommands in the same sense as sprawling command line applications like git or our example vagrant from above.
Tasks would then be sub-commands:
> lsst process-ccd [args]
Tasks provided with instruments or by community packages would have their own command space:
sdss process-ccd [args]
decam process-ccd [args]
megacam process-ccd [args]
You’ll also notice that in such a command architecture, we’ve done away with amateur-looking taskName.py script names. The lsst task signature signals to the user: these aren’t cobbled scripts; this is a well-engineered application.
When you run the root command
A user knows that the LSST pipeline is packaged in the lsst command. But that user doesn’t really know anything else, let alone how to run the pipeline.
The natural thing is to just type at the command line:
> lsst
When this happens, we help the user! The root command prints out a small help message pointing to the online task documentation. It also goes a step further, and prints out a list of all available command line tasks.
Now without even reading the docs, the user has a list of commands to try.
(Note: to be idiomatic and safe, lsst --help will do the same thing.)
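To make the root-command behaviour concrete, here is a minimal sketch using Python’s standard argparse module. It is only an illustration of the experience, not a proposed implementation: the task table, the make-coadd task, the summaries, and the documentation URL are all placeholders I made up.

```python
import argparse

# Hypothetical task table; in a real system this would come from a task registry.
TASKS = {
    "process-ccd": "Process a single CCD exposure.",
    "make-coadd": "Build a coadd from processed exposures.",
}

DOCS_URL = "https://example.org/lsst-tasks"  # placeholder, not a real docs URL


def build_parser():
    """Build the root `lsst` parser with one subcommand per known task."""
    parser = argparse.ArgumentParser(prog="lsst", description="LSST task pipeline")
    subparsers = parser.add_subparsers(dest="task", metavar="<task>")
    for name, summary in sorted(TASKS.items()):
        subparsers.add_parser(name, help=summary)
    return parser


def root_message():
    """Text shown when `lsst` is run with no task: a docs pointer plus the task list."""
    lines = [f"Task documentation: {DOCS_URL}", "", "Available tasks:"]
    width = max(len(name) for name in TASKS)
    for name, summary in sorted(TASKS.items()):
        lines.append(f"  {name.ljust(width)}  {summary}")
    return "\n".join(lines)


def main(argv=None):
    """Entry point: print the task list for a bare `lsst`, else dispatch to the task."""
    args = build_parser().parse_args(argv)
    if args.task is None:
        print(root_message())
        return 0
    print(f"would run task: {args.task}")  # a real command would run the task here
    return 0
```

Calling main([]) prints the documentation pointer and the task list, while main(["process-ccd"]) takes the dispatch branch. argparse also provides lsst --help for free, although its output is argparse’s own usage text rather than the message above.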
Getting help on running a task
So the user knows the commands, but how are they run and what do they do? The initial help message will tell the user to try any command with the help verb, as in:
> lsst help process-ccd
This will show user-oriented task documentation, including a usage example, and a list of arguments and their defaults.
Note that this documentation should include a schematic flow of any subtasks called. The argument list should include arguments associated with the subtasks.
Since the total collection of arguments might be overwhelming, some arguments may be labeled as ‘superuser arguments’, with defaults that can usually be assumed to be correct. By default these superuser arguments would be omitted from the help output, but
> lsst help --all process-ccd
would reveal them.
Similarly, a user might want to filter the command line help to just the base package or a certain subtask. These commands would help with that:
# only arguments for process-ccd itself
> lsst help --base process-ccd
# help for isr,calibrate subtasks.
> lsst help --sub isr,calibrate process-ccd
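The filtering rules behind --all, --base and --sub can be sketched in a few lines of Python. This assumes each argument records which task owns it and whether it is a superuser argument; the Arg class, the argument names, and the defaults below are invented for illustration and are not real process-ccd options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    name: str
    default: object
    owner: str               # "base" for the task itself, else the subtask name
    superuser: bool = False  # hidden from help unless --all is given


# Hypothetical arguments; names and defaults are illustrative only.
PROCESS_CCD_ARGS = [
    Arg("output", None, "base"),
    Arg("do-bias", True, "isr"),
    Arg("fringe-order", 3, "isr", superuser=True),
    Arg("psf-model", "pca", "calibrate"),
]


def select_args(args, show_all=False, base_only=False, subtasks=None):
    """Return the arguments that `lsst help` would list for the given flags."""
    selected = []
    for arg in args:
        if arg.superuser and not show_all:
            continue  # --all not given: hide superuser arguments
        if base_only and arg.owner != "base":
            continue  # --base: keep only the task's own arguments
        if subtasks is not None and arg.owner not in subtasks:
            continue  # --sub a,b: keep only the named subtasks
        selected.append(arg)
    return selected
```

For example, select_args(PROCESS_CCD_ARGS, subtasks=["isr"]) keeps only do-bias, and adding show_all=True also reveals fringe-order.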
Graphical task help
The terminal has limited information bandwidth. Instead, we could use a show verb:
> lsst show process-ccd [args]
This launches a local static web page showing the pipeline, including task help and the values of arguments as currently set on the command line.
This is an improvement on the docs that I can ship with the LSST Stack Docs because these docs will reflect the actual state of a task given the current configuration, including redirected tasks and what arguments have been set to non-default values.
Graphical task composition
The static web server help provided by lsst show process-ccd was nice, but why settle?
> lsst compose process-ccd
launches a graphical task composer. That is, a local Python server is booted up. In this local web app, the user can actually configure and preview the task pipeline.
The user could graphically redirect a subtask to another one and dynamically see the new options that are needed.
The user could also see exactly what data would be processed given Butler data id selectors.
Once the user was satisfied, that pipeline configuration could be exported from the local web app so that the user could immediately run the pipeline in the command line.
This discussion is deliberately not about implementation, but rather about experience. Nonetheless, the experience requires these pieces of infrastructure to be implemented:
- There needs to be a task registry that not only LSST stack tasks plug into, but any third-party obs_tasks etc. plug into as well. This will allow the lsst command to show a listing of all commands, and for lsst compose to help a user redirect subtasks by showing tasks available.
- Tasks no longer exist as command line scripts, but as Python modules that follow a task protocol/API.
- There needs to be an API for tasks to expose their processing task pipeline DAG, as currently configured.
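The three requirements above could fit together roughly as follows. This is a sketch under my own assumptions, not the SuperTask design: the register decorator, the Task base class, its dag() method, and the decam-isr task are all hypothetical names chosen for illustration.

```python
TASK_REGISTRY = {}  # command name -> task class; shared by stack and third-party tasks


def register(cls):
    """Class decorator: add a task class to the registry under its command name."""
    TASK_REGISTRY[cls.name] = cls
    return cls


class Task:
    """Hypothetical task protocol: a name, a summary, and redirectable subtasks."""

    name = ""
    summary = ""
    default_subtasks = ()  # names of registered tasks this task runs, in order

    def __init__(self, redirect=None):
        # `redirect` maps a default subtask name to the name of its replacement.
        redirect = redirect or {}
        self.subtasks = [redirect.get(n, n) for n in self.default_subtasks]

    def dag(self):
        """Return the pipeline as {task name: [subtask names]}, as currently configured."""
        graph = {self.name: list(self.subtasks)}
        for sub in self.subtasks:
            graph.update(TASK_REGISTRY[sub]().dag())
        return graph


@register
class Isr(Task):
    name = "isr"
    summary = "Instrument signature removal."


@register
class DecamIsr(Task):
    name = "decam-isr"
    summary = "Hypothetical third-party ISR variant."


@register
class Calibrate(Task):
    name = "calibrate"
    summary = "Calibration step."


@register
class ProcessCcd(Task):
    name = "process-ccd"
    summary = "Process a single CCD."
    default_subtasks = ("isr", "calibrate")
```

With this in place, sorted(TASK_REGISTRY) is what the root command would list, and ProcessCcd(redirect={"isr": "decam-isr"}).dag() describes the pipeline after a redirection, which is the information lsst show and lsst compose would need to render.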
I’ve designed a command line task architecture not by considering the implementation details, but by instead considering the user experience. I’ve given a realization of what UX thinking might give you. But even if this specific command line UI is not adopted, I stress that UX thinking should be used when implementing any changes to the command line task architecture.
I also think that tasks should be viewed and designed as a cohesive whole. Tasks shouldn’t just be created to suit a need and figuratively thrown into a bin/ directory. Tasks should serve as a unified vocabulary for processing data.