Query objects providing a list of IDs

Hi,
I would like to execute a query in ordert to select all the entries in a certain table by providing a list as input. Basically, I want to obtain nothing different that an inner join between the selected table and the input list itself.

The task would be very easy by using pandas, but putting an entire Rubin table into a dataframe is crazy.
I was trying a work-around with a loop like this, but is extremely slow and unefficient.

Is there a smarter way to do it?

Thanks!

%%time
object_list = []
for candidate in candidate_list:
    results = service.search("SELECT id, match_candidate, match_objectId, id_truth_type "\
                            "FROM dp02_dc2_catalogs.MatchesTruth "\
                            "WHERE id = "+str(candidate))
    results = results.to_table().to_pandas()
    # Append objectId to the final list
    object_list.append(results['match_objectId'].values)
    del results

Let me take a look at this, Vincenzo.
It is possible to do inner joins directly as part of the TAP query, but inner joins can be notoriously slow, unless one is joining on a primary key (and sometimes even then!).
I have a couple other things “in the queue” ahead of your question, but I hope to have a deeper answer for you in a few hours.

2 Likes

Hi Vincenzo, I think what you maybe be looking for is the SQL “SELECT WHERE IN” construct:

SELECT id, match_objectId, id_truth_type
FROM dp02_dc2_catalogs.MatchesTruth
WHERE id IN (...)

…where you would substitute in a comma-separated list of candidate IDs for the “…” above. See also SQL IN Operator

3 Likes