Constructing a "where" for query-dimension-records

In python, if I run:

butler = lsst.daf.butler.Butler( os.getenv("REPO") )
results = butler.registry.queryDimensionRecords( 'exposure',
                                                 collections='DECam/raw/all',
                                                 datasets='raw' ,
                                                 where="instrument='DECam' AND exposure.observation_type='science'"  )
results = list( set(results) )
print( results[0].observation_type )
print( results[0].timespan.begin )

I get results:

science
2021-04-09 00:37:14.816924

If I then run:

results = butler.registry.queryDimensionRecords( 'exposure',
                                                 collections='DECam/raw/all',
                                                 datasets='raw',
                                                 where="instrument='DECam' AND exposure.observation_type='science'"
                                                       " AND timespan.begin>'2021-04-10'" )
results = list( set(results) )

I get an exception: LookupError: No dimension element with name 'timespan'.

Two questions. Specifically, how do I add a range of the begin times of exposures to a search? More generally, how does one figure out the schema to construct these “where” clauses? From what I originally ran, it looks like there is a “timespan” element in the dimension record for an exposure, but it didn’t work inside where.

Addendum: I also tried:

results = butler.registry.queryDimensionRecords( 'exposure',
                                                 collections='DECam/raw/all',
                                                 datasets='raw',
                                                 where="instrument='DECam' AND exposure.observation_type='science'"
                                                       " AND exposure.timespan.begin>'2021-04-10'" )
results = list( set(results) )

and got exception: lsst.daf.butler.registry.queries.expressions.convert.ExpressionTypeError: No column 'timespan.begin' in dimension table 'exposure'.

Can you try this:

exposure.day_obs>20210410?

That is recognized; thanks. Alas, the thing I was originally doing that prompted this question was trying to narrow down to a short range of time (i.e. within a half hour or so), so day_obs isn’t really what I’m after in this case.

Great, I had this issue recently as well. I would query to get all visits on a day and then use the timespan object inside the python code to then restrict the results.

Astropy timespans cannot be compared inside the sql, I suppose.

You can look at the supported queries by just outputting:

print(results[0])

in your code.

Sorry about this - you’ve reminded me of a previously known but recently forgotten gap in both our query expression functionality and our documentation. The second thing you tried is something we ought to make work, and the workaround is not yet documented; it’s to do an OVERLAPS operation on full timespans, as sort of tangentially described here:

https://pipelines.lsst.io/v/weekly/modules/lsst.daf.butler/queries.html#overlaps-operator

For this case, I think a working where expression term and bind argument would be:

from lsst.daf.butler import Timespan

...

results = butler.registry.queryDimensionRecords(
    ...,
    where=(
        "instrument='DECam' AND exposure.observation_type='science' "
        "AND exposure.timespan OVERLAPS ts"
    ),
    bind={"ts": Timespan(astropy.time.Time(...), None)},
)

I’m using “bind” here to avoid remembering the expression system’s time literal syntax, and the None in the second argument to Timespan makes it open ended in the future. And of course the OVERLAPS construct actually does do an overlap test not a “contains” test, as I think you preferred, but they’re close enough that I hope this works well enough.

1 Like

Ah, great, thanks, that does it!

I had been trolling around and had just looked at the time literal syntax, and OVERLAPS T'2021-03-19T00:36:30' worked.