The advice for composing an ADQL query to retrieve data for a set of known objects identified by, e.g., their objectId
, is to use a WHERE ... IN ()
.
By advice I mean the recommendations in the documentation for DP0.2 and for DP0.3 queries.
There seems to be a limit on how long the list can be, but I’m not sure exactly what the limit is. The DP0.3 advice mentions 50,000 objects. Is this known?
For example, in a query using ssObjectId
on the MPCORB
table, I’ve found that passing a list of ~10,000 ssObjectId
works fine, and a list of ~50,000 didn’t cause an immediate error but did take >6 min so I stopped it, and that a list of ~100,000 caused an immediate error (pasted below).
Full version of the immediate error from a 105,133 object list.
---------------------------------------------------------------------------
DALQueryError Traceback (most recent call last)
Cell In[25], line 4
1 query2 = "SELECT mpcDesignation, mpcNumber, ssObjectId, fullDesignation "\
2 "FROM dp03_catalogs_1yr.MPCORB "\
3 "WHERE ssObjectId IN " + tempstring
----> 4 results2 = service.search(query2).to_table()
5 results2
File /opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-7.0.1/lib/python3.11/site-packages/pyvo/dal/tap.py:257, in TAPService.run_sync(self, query, language, maxrec, uploads, **keywords)
228 def run_sync(
229 self, query, language="ADQL", maxrec=None, uploads=None,
230 **keywords):
231 """
232 runs sync query and returns its result
233
(...)
253 TAPResults
254 """
255 return self.create_query(
256 query, language=language, maxrec=maxrec, uploads=uploads,
--> 257 **keywords).execute()
File /opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-7.0.1/lib/python3.11/site-packages/pyvo/dal/tap.py:1076, in TAPQuery.execute(self)
1062 def execute(self):
1063 """
1064 submit the query and return the results as a TAPResults instance
1065
(...)
1074 for errors parsing the VOTable response
1075 """
-> 1076 return TAPResults(self.execute_votable(), url=self.queryurl, session=self._session)
File /opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-7.0.1/lib/python3.11/site-packages/pyvo/dal/adhoc.py:111, in AdhocServiceResultsMixin.__init__(self, votable, url, session)
110 def __init__(self, votable, url=None, session=None):
--> 111 super().__init__(votable, url=url, session=session)
113 self._adhocservices = list(
114 resource for resource in votable.resources
115 if resource.type == "meta" and resource.utype == "adhoc:service"
116 )
File /opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-7.0.1/lib/python3.11/site-packages/pyvo/dal/query.py:319, in DALResults.__init__(self, votable, url, session)
317 self._status = self._findstatus(votable)
318 if self._status[0].lower() not in ("ok", "overflow"):
--> 319 raise DALQueryError(self._status[1], self._status[0], url)
321 if self._status[0].lower() == "overflow":
322 warn("Partial result set. Potential causes MAXREC, async storage space, etc.",
323 category=DALOverflowWarning)
DALQueryError: missing required parameter: LANG
This question has also come up in this Support topic: ADQL question: Selecting a random subset of objects in DP0.