[This functionality is available on the Rubin Science Platform from Weekly 2024_37 onwards]
Today I merged ticket DM-45872, which moves the new query system APIs from experimental mode to fully public mode. We have also begun to move away from butler.registry for querying to a simpler butler interface.
The new query system has been used under the hood in the command-line tools for a while, and it can now be used directly from Python.
The new query system is designed to provide a more unified approach for all the different types of queries, and it also provides some key enhancements:
- You can now query for datasets with an lsst.sphgeom.Region or use the new POINT(ra, dec) syntax. Recently we added the ability for Regions to be constructed from simple strings using the IVOA POS definition via Region.from_ivoa_pos().
- We have fixed the duplication of results problem so you no longer need to immediately pass the results to a set().
- There is now a simplified interface that returns a list and an advanced interface that allows complex queries to be built up by method chaining.
- Calibration dataset queries are now supported either in the simplified interface with find_first=False or with the advanced interface where a temporal dimension can be used. There is also an experimental interface in the advanced system that can be used to obtain validity ranges.
- Coarse spatial joins (such as visit and tract) are now supported.
- The advanced query system now allows for lists of data IDs to be used.
- All the APIs now support limit and order_by, not just querying dimension records (and those parameters are now supported on the command-line as well).
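As a rough illustration of the POINT(ra, dec) syntax combined with limit and order_by in the simplified interface, here is a minimal sketch. The helper names, the "calexp" dataset type, the collection name, and the choice of visit_detector_region.region as the region column are all assumptions made for the example and are not taken from the ticket; coordinates are assumed to be in degrees.

```python
from typing import Any


def point_overlap_where(region_column: str, ra: float, dec: float) -> str:
    """Build a where clause selecting rows whose region covers a sky point.

    Hypothetical helper: it only formats the string that is passed to the
    butler; ra and dec are assumed to be in degrees.
    """
    return f"{region_column} OVERLAPS POINT({ra}, {dec})"


def datasets_at_point(
    butler: Any,
    dataset_type: str,
    collections: str,
    ra: float,
    dec: float,
    limit: int = 10,
) -> list:
    """Return up to `limit` dataset refs overlapping (ra, dec), ordered by visit.

    `butler` is expected to behave like lsst.daf.butler.Butler. The region
    column used here is an assumption that suits visit+detector datasets.
    """
    where = point_overlap_where("visit_detector_region.region", ra, dec)
    # The simplified interface returns a plain list, so no set() is needed.
    return butler.query_datasets(
        dataset_type,
        collections=collections,
        where=where,
        order_by="visit",
        limit=limit,
    )
```

With a real repository this might be called as datasets_at_point(butler, "calexp", "some/collection", 53.6, -32.7), where the repository contents and the collection name are placeholders.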
The new APIs are:
- butler.query_datasets()
- butler.query_dimension_records()
- butler.query_data_ids()
All these APIs return lists and by default we cap the number of results at 20,000 and issue a warning if you hit that limit.
The advanced query system uses a context manager:
with butler.query() as query:
    ...
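To give a feel for the method chaining, the following is a small sketch of an advanced query that fetches dimension records. The function name and its parameters are hypothetical, and it assumes the query object offers where(), dimension_records(), order_by() and limit() as chained calls; check the lsst.daf.butler documentation for the exact signatures in your release.

```python
from typing import Any


def first_visit_records(
    butler: Any,
    instrument: str,
    where: str,
    order_key: str,
    limit: int,
) -> list:
    """Fetch up to `limit` visit records using the advanced query interface.

    `butler` is expected to behave like lsst.daf.butler.Butler. `order_key`
    is whatever sort key the caller wants, for example a field of the visit
    record; it is passed through unchanged.
    """
    with butler.query() as query:
        results = (
            query.where(where, instrument=instrument)  # constrain the query
            .dimension_records("visit")                # choose what to return
            .order_by(order_key)
            .limit(limit)
        )
        # Materialize inside the context manager, since the results are
        # only usable while the query context is still open.
        return list(results)
```

Returning a list from inside the with block keeps the caller from accidentally iterating over results after the context has closed.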
Additionally, there is now a butler.collections interface (see lsst.daf.butler.ButlerCollections) to replace the butler.registry collection APIs.
- collections.query() replaces registry.queryCollections.
- collections.get_info() replaces getCollectionDocumentation, getCollectionParentChains, getCollectionSummary, getCollectionChain, and getCollectionType.
- collections.query_info() returns all the information available for all matching collections.
This butler.collections interface already existed for chain manipulation but has been extended to support all the registry collection APIs.
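As a sketch of how code written against the registry collection APIs might look after migrating, the helper below lists matching collections with their type and documentation. The function name is hypothetical, and it assumes the object returned by collections.get_info() exposes name, type and doc attributes; verify this against the ButlerCollections documentation for your release.

```python
from typing import Any


def describe_collections(butler: Any, pattern: str) -> list[str]:
    """Return one summary line per collection matching `pattern`.

    `butler` is expected to behave like lsst.daf.butler.Butler. This uses
    collections.query() where registry.queryCollections would have been
    used, and collections.get_info() in place of the separate
    getCollectionType and getCollectionDocumentation calls.
    """
    lines = []
    for name in butler.collections.query(pattern):
        info = butler.collections.get_info(name)
        lines.append(f"{info.name} [{info.type}]: {info.doc}")
    return lines
```

When information is needed for many collections at once, collections.query_info() may be a better fit than calling get_info() in a loop.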
The long-term plan is to make it so that butler.registry is no longer used. We are not deprecating the registry interfaces at this time but we hope that the new features motivate an eventual migration.