Unresolved butler dataset references are now deprecated

timj · April 19, 2023, 11:00pm

Historically you have been able to create a DatasetRef that does not refer to any particular dataset in butler, but instead refers to potential datasets that share the same dataset type and dataId. We call these “unresolved” because you haven’t yet worked out which specific dataset in which specific run collection it refers to. Calling butler.get() with an unresolved ref was treated internally the same as calling butler.get(datasetType, dataId).

Unresolved refs were useful for data ingest (you didn’t know how butler would store a new dataset in registry) or quantum graph building (we could create the graph without having to know what dataset ID the registry would use – absolutely critical when integers and not UUIDs were being used). Now that we always use a UUID in registry, we no longer need unresolved refs, and not having to support them is a major simplification. This was discussed in RFC-888, and today I have released the first part of that change: using or creating an unresolved ref will now issue a warning.

In the next month or so we will change butler so that using an unresolved ref is no longer allowed. We are hoping that no one will be affected by this change, since all the DatasetRef objects returned by registry.queryDatasets are fully resolved. If you are using an unresolved ref in a butler.get somewhere, the fix is to explicitly pass in the dataset type and dataId.

Please let us know if you start getting warnings and need to know how to fix them. We currently hide the warnings triggered by the ingest tools and graph building, so you should only see them if you explicitly construct an unresolved ref.
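
If you do see a warning and are not sure which part of your code triggers it, one option is to record warnings with Python's standard warnings module and print where they were raised. This is only a sketch: find_warning_sources and _demo are hypothetical names, and the FutureWarning in the demo is an arbitrary stand-in, since the category butler actually uses is not stated here.

```python
import warnings


def find_warning_sources(func, *args, **kwargs):
    """Run ``func`` and report every warning it emits, with its location."""
    with warnings.catch_warnings(record=True) as caught:
        # Show every warning, even ones normally reported only once.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    sources = [
        (w.category.__name__, w.filename, w.lineno, str(w.message))
        for w in caught
    ]
    return result, sources


def _demo():
    # Stand-in for code that constructs or uses an unresolved ref.
    warnings.warn("Unresolved DatasetRef is deprecated (demo message)",
                  FutureWarning, stacklevel=2)
    return 42


result, sources = find_warning_sources(_demo)
for category, filename, lineno, message in sources:
    print(f"{category} at {filename}:{lineno}: {message}")
```

Wrapping a suspect function call this way should give you the file and line number to include when you report the problem.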