With the implementation of RFC-783, the way that metadata attached to a Task is implemented has changed. Previously, metadata was stored in a PropertyList, which was combined into a PropertySet. There is now a specialized class for dealing with task metadata called TaskMetadata. This class provides some of the same methods as PropertySet and allows "." separators in keys to refer to a hierarchy. It does not support the full PropertySet API, and all the code that relied on PropertySet methods for metadata has been updated. Some compatibility methods do exist but issue deprecation warnings.
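
To make the "." separator behavior concrete, here is a minimal sketch of how dotted keys can map onto a nested hierarchy. DottedMetadata is a hypothetical toy class written for illustration, and the key names are made up; it is not the real TaskMetadata and makes no claim about how that class is implemented.

```python
class DottedMetadata:
    """Toy stand-in (hypothetical) showing dotted keys as a hierarchy."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        # "a.b.c" means: level "a", then level "b", then leaf "c".
        *parents, leaf = key.split(".")
        node = self._data
        for name in parents:
            # Create intermediate levels on demand.
            child = node.setdefault(name, {})
            if not isinstance(child, dict):
                raise KeyError(f"{name!r} in {key!r} is a value, not a level")
            node = child
        node[leaf] = value

    def __getitem__(self, key):
        # Walk down one level per dotted component.
        node = self._data
        for name in key.split("."):
            node = node[name]
        return node


md = DottedMetadata()
md["quantum.startTime"] = 1.5
md["quantum.endTime"] = 2.5
print(md["quantum.endTime"])  # a single leaf value
print(md["quantum"])          # the whole "quantum" level
```

The point of the sketch is only that a single dotted string addresses a value inside nested levels, so callers do not need to walk the hierarchy by hand.
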

There have been some changes to Butler to support the new TaskMetadata:
- butler.put() can now convert a TaskMetadata to a PropertySet if the dataset type for the task metadata is defined as a PropertySet in the given repository.
- If a *_metadata dataset type is used by a pipeline and the definition already exists in a Butler repository, then that definition will be used when storing the metadata even though the Python code is using TaskMetadata. This means that if an existing repository defines the task metadata to be a PropertySet, then that storage class will be used when writing TaskMetadata. When those datasets are retrieved they will be returned as the expected PropertySet.
- If a metadata dataset type does not exist in the repository, then a new one will be created using TaskMetadata. TaskMetadata is serialized as a JSON file and not a YAML file.
- If the repository has had the dataset type for metadata modified to now indicate a TaskMetadata storage class, then a TaskMetadata will be returned by butler.get() even though it is stored on disk in YAML as a PropertySet.
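
The rules above can be summarized in a small sketch. This is only a model of the behavior described in this post, not Butler code: the function names storage_for_put and type_for_get and the FORMATS table are hypothetical, and storage classes are represented as plain strings.

```python
from typing import Optional, Tuple

# Serialization format per storage class, as described in the text above.
FORMATS = {"PropertySet": "yaml", "TaskMetadata": "json"}


def storage_for_put(registered: Optional[str]) -> Tuple[str, str]:
    """Return (storage class, file format) used when writing task metadata.

    ``registered`` is the storage class already defined for the
    ``*_metadata`` dataset type, or None if the repository has no definition.
    """
    # An existing definition wins; otherwise a new TaskMetadata one is created.
    storage_class = registered if registered is not None else "TaskMetadata"
    return storage_class, FORMATS[storage_class]


def type_for_get(on_disk: str, registered: str) -> str:
    """Return the name of the Python type handed back when reading.

    The registered definition decides the returned type; ``on_disk`` is
    accepted only to show that the stored form does not change the answer.
    """
    return registered


# Existing repository that defines task metadata as PropertySet.
print(storage_for_put("PropertySet"))
# Repository with no metadata dataset type defined yet.
print(storage_for_put(None))
# Definition changed to TaskMetadata after YAML PropertySet files were written.
print(type_for_get(on_disk="PropertySet", registered="TaskMetadata"))
```
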

With these changes there should be no need to modify existing repositories, although depending on configuration and history you may sometimes get back a slightly different Python type from the one you initially stored.

We still have to decide whether to migrate existing repositories over to the TaskMetadata storage class, and there is currently no butler admin script to simplify the change. For now this means that existing pipeline runs will generally still serialize as YAML PropertySet even though all processing will be using TaskMetadata.

Note that v23 (and DP0.2) does not support TaskMetadata and will likely never support it natively.