Changes to collections, runs, and `pipetask` arguments in Gen3 middleware

I’ve just merged DM-21849, which implements almost all of RFC-663 (enough that I’m calling it Implemented; the remaining details will trickle in as Gen3 development continues).

This changes how collections behave conceptually in the Gen3 butler:

  • a “run” is now a special type of collection, rather than an entity that is associated with a collection;
  • the only kind of collection we had before is now called a “tagged” collection;
  • we now also have “chained” collections, each of which is simply an ordered list of other collections to be searched.
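
To make the search semantics of the three collection types concrete, here is a toy model in plain Python. It is only a sketch of the idea and does not use the daf_butler API: `Collection`, `find_first`, and the collection names are hypothetical, and a real registry looks datasets up by dataset type and data ID rather than by a string key.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Collection:
    """Toy stand-in for a collection (hypothetical, not the daf_butler class)."""

    name: str
    kind: str  # "run", "tagged", or "chained"
    # Datasets held directly; only meaningful for "run" and "tagged".
    datasets: Dict[str, str] = field(default_factory=dict)
    # Ordered child collections; only meaningful for "chained".
    children: List["Collection"] = field(default_factory=list)


def find_first(collection: Collection, key: str) -> Optional[str]:
    """Return the first dataset matching ``key``, searching children in order."""
    if collection.kind == "chained":
        for child in collection.children:
            found = find_first(child, key)
            if found is not None:
                return found  # earlier children shadow later ones
        return None
    return collection.datasets.get(key)


# A run and a tagged collection that both hold a "bias" dataset.
my_run = Collection("u/me/run1", "run", datasets={"bias": "bias-from-my-run"})
calib = Collection("calib", "tagged", datasets={"bias": "bias-from-calib"})
raw = Collection("raw", "run", datasets={"raw": "raw-exposure"})

# The chain searches my_run first, then calib, then raw.
chain = Collection("u/me/chain", "chained", children=[my_run, calib, raw])

print(find_first(chain, "bias"))  # bias-from-my-run
print(find_first(chain, "raw"))   # raw-exposure
print(find_first(chain, "flat"))  # None
```

The point of the sketch is that order matters: because `my_run` comes first in the chain, its `bias` shadows the one in `calib`.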

It changes some particularly prominent interfaces, including:

  • the arguments used to construct a Butler (a single Butler can now search an ordered list of collections, not just a single one);
  • the command-line arguments used to pass output collections to the pipetask tool.

Many lower-level interfaces (including Registry query methods) have had minor changes as well.

The RFC itself is still a good overview and description of intent and motivation, but the new package docs in daf_butler and the command-line help for pipetask should be consulted for the details.
