DM-36162 adds a third pipeline executor to the Science Pipelines. SeparablePipelineExecutor was designed for the Prompt Processing framework’s needs, but it’s flexible enough that other developers may find it useful.
Compared to the existing SimplePipelineExecutor (run from Python, no support for multiprocessing or anything other than immediate execution into a fresh run) and CmdLineFwk (run from the shell as pipetask, with lots of options and features), SeparablePipelineExecutor is intermediate in functionality: it is run from Python and supports multiprocessing, skipping completed quanta, and overwriting existing datasets, but not saving/visualizing graphs, automatic collection management, or profiling/statistics.
SeparablePipelineExecutor also has two features that neither of its predecessors has:
- Each execution step is run independently, accepting and returning the objects (Pipeline, QuantumGraph, etc.) needed for the other steps. This is similar to how CmdLineFwk lets you save/load graphs from disk, but is completely in-memory.
- You can provide your own implementations of certain Middleware APIs (currently TaskFactory, GraphBuilder, and QuantumGraphExecutor) to customize the execution for specific applications. The default is to use the same classes (and init arguments) as CmdLineFwk.
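
The two features above (separable in-memory steps and pluggable components) can be pictured with a small standalone sketch. It does not use the real Science Pipelines API: ToyPipeline, ToyGraph, ToySeparableExecutor, SerialExecutor, and DryRunExecutor are all hypothetical stand-ins invented for illustration, as are the task names and data IDs. Only the overall shape (build a pipeline, build a graph from it, run the graph, optionally with a substitute executor) is meant to mirror the description.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol, Sequence


@dataclass(frozen=True)
class ToyPipeline:
    """Hypothetical stand-in for a Pipeline: an ordered set of task names."""
    tasks: tuple


@dataclass(frozen=True)
class ToyGraph:
    """Hypothetical stand-in for a QuantumGraph: (task, data ID) work units."""
    quanta: tuple


class ToyGraphExecutor(Protocol):
    """Interface a replacement executor must satisfy (an assumption of this sketch)."""
    def execute(self, graph: ToyGraph) -> list: ...


class SerialExecutor:
    """Default component: processes each quantum in order."""
    def execute(self, graph: ToyGraph) -> list:
        return [f"{task}@{data_id}" for task, data_id in graph.quanta]


class DryRunExecutor:
    """Custom component: reports what would run without doing any work."""
    def execute(self, graph: ToyGraph) -> list:
        return [f"dry-run {task}@{data_id}" for task, data_id in graph.quanta]


class ToySeparableExecutor:
    """Each step takes and returns plain in-memory objects."""

    def __init__(self, graph_executor: Optional[ToyGraphExecutor] = None):
        # Fall back to the default component when none is supplied.
        self._graph_executor = graph_executor or SerialExecutor()

    def make_pipeline(self, task_names: Sequence[str]) -> ToyPipeline:
        return ToyPipeline(tuple(task_names))

    def make_graph(self, pipeline: ToyPipeline, data_ids: Sequence[int]) -> ToyGraph:
        return ToyGraph(tuple((t, d) for t in pipeline.tasks for d in data_ids))

    def run(self, graph: ToyGraph,
            graph_executor: Optional[ToyGraphExecutor] = None) -> list:
        executor = graph_executor or self._graph_executor
        return executor.execute(graph)


executor = ToySeparableExecutor()
pipeline = executor.make_pipeline(["isr", "calibrate"])  # illustrative task names
graph = executor.make_graph(pipeline, data_ids=[1, 2])

# Between steps the graph is an ordinary object, so the caller can inspect
# or trim it before execution without a round trip through disk.
trimmed = ToyGraph(graph.quanta[:2])

results = executor.run(trimmed)
dry_results = executor.run(trimmed, graph_executor=DryRunExecutor())
```

Because every step hands back its result, the caller decides when (and whether) the next step runs, and swapping in DryRunExecutor changes how the graph is executed without touching the pipeline or graph-building steps.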
Note that the API for this class is not quite stable:
- Pre-execution support will need to be completely rewritten after DM-38041, which is why the current class doesn’t offer a custom PreExecInit hook.
- GraphBuilder was not designed as a generic interface, so there’s no guarantee that a future custom builder will be able to use the method signature currently assumed by SeparablePipelineExecutor.