L.sherlock_objects skips all data if it doesn't find one

lydiam · March 19, 2024, 2:41pm

Hi,

I have a list of objects to query, but when the query fails to find one of them it skips the whole list. Is there a way to overcome that?

My code is like that:

sp = [ts[x:x+100] for x in range(0, len(ts), 100)]
c = []
for i in sp:
    try:
        c.append(L.sherlock_objects(i))
    except LasairError as e:
        print(e)

where ts is a list of 1000 objects so I break it into smaller lists of 100.

Thank you.

Lydia

roy · March 19, 2024, 4:38pm

I tried it on the example notebook, but replacing one of the object Ids with ZTF23_______, and I get: LasairError: Bad Request:{"error":"{"message": "Object ZTF23_______ not found"}}. Do you see this error also? Should we just put a None in the response rather than raising the error and quitting?

lydiam · March 20, 2024, 10:12am

Hi,

Thank you for your reply! Yeah, I do get the same error. I will try this solution.

Lydia

ChristinaWilliams · April 2, 2024, 9:59pm

Hi @lydiam ! I just wanted to check in about the issue you were having, and whether or not the above post resolved your problem? Thanks!

lydiam · April 8, 2024, 11:39am

Apologies for the delay.

I used:

for i in sp:
    try:
        c.append(L.sherlock_objects(i))
    except LasairError as e:
        None

instead of print(e) but I still have the same issue.

roy · April 8, 2024, 2:25pm

The real problem here is that the API should behave differently with a list of objects. Instead of falling over when a single object fails, it should return all the successes with NULL or something for the failed one. So I will make this bugfix. (Of course we are working hard on the new Lasair – for LSST – and said to ourselves no messing about with the ZTF code, it's supposed to be frozen.)

lydiam · April 9, 2024, 2:39pm

Thanks for this Roy! Probably I will try a workaround at least for the lists that “fail” by querying each object in this list. Thank you again for all your help!
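
A minimal sketch of that per-object fallback, not a tested recipe against the live service. The helper name sherlock_with_fallback and its arguments are hypothetical; the only Lasair call it relies on is sherlock_objects taking a list of IDs, as used earlier in this thread. It makes no assumption about the shape of the returned data and simply collects whatever each call returns.

```python
def sherlock_with_fallback(client, batches, error_cls=Exception):
    """Query each batch; if a batch raises, retry its objects one at a time.

    client    -- any object with a sherlock_objects(list_of_ids) method
    batches   -- list of lists of object IDs
    error_cls -- exception type to catch (e.g. LasairError)
    """
    results = []  # whatever sherlock_objects returned, one entry per successful call
    missing = []  # object IDs whose individual query raised
    for batch in batches:
        try:
            results.append(client.sherlock_objects(batch))
        except error_cls:
            # The whole batch failed, so fall back to one request per object.
            for object_id in batch:
                try:
                    # Pass a one-element list, not a bare string.
                    results.append(client.sherlock_objects([object_id]))
                except error_cls:
                    missing.append(object_id)
    return results, missing
```

With the names from the first post this would be called as results, missing = sherlock_with_fallback(L, sp, LasairError). A failed batch of 100 costs up to 100 extra requests, so it is probably only sensible when few batches fail.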

roy · April 15, 2024, 12:55pm

Hi Lydia – I think it's fixed; it now just returns None for the missing object. Can you give it a try?

lydiam · April 15, 2024, 1:55pm

Hi Roy, what is the right way to handle the error in terms of code?

roy · April 15, 2024, 2:24pm

objectIds = ['ZTF17_____etfg', 'ZTF17aaaetet', 'ZTF17aaaetes', 'ZTF17aaaeteo']
rows = L.objects(objectIds)
for (objectId,row) in zip(objectIds,rows):
    if not row:
        print('%s: not found' % objectId)
    else:
        od = row['objectData']
        print('%s is at galactic latlon %.2f,%.3f' % (objectId, od['glatmean'], od['glonmean']))

results in

ZTF17_____etfg: not found
ZTF17aaaetet is at galactic latlon -31.62,129.895
ZTF17aaaetes is at galactic latlon -31.96,129.973
ZTF17aaaeteo is at galactic latlon -31.90,130.475

lydiam · April 15, 2024, 6:32pm

Hi Roy,

It does indeed work for L.objects but it still fails for L.sherlock_objects. And when trying to do something like:

objectIds = ['ZTF17_____etfg', 'ZTF17aaaetet', 'ZTF17aaaetes', 'ZTF17aaaeteo']
rows = L.objects(objectIds)
for (objectId,row) in zip(objectIds,rows):
    if not row:
        print('%s: not found' % objectId)
    else:
        od = row['objectData']
        print(c.append(L.sherlock_objects(objectId)))
        print('%s is at galactic latlon %.2f,%.3f' % (objectId, od['glatmean'], od['glonmean']))

it returns the following error:

LasairError                               Traceback (most recent call last)
Cell In[32], line 8
      6 else:
      7     od = row['objectData']
----> 8     print(c.append(L.sherlock_objects(objectId)))
      9     print('%s is at galactic latlon %.2f,%.3f' % (objectId, od['glatmean'], od['glonmean']))

File ~/miniconda3/lib/python3.10/site-packages/lasair/lasair.py:188, in lasair_client.sherlock_objects(self, objectIds, lite)
    178 """ Query the Sherlock database for context information about objects
    179     in the database.
    180 args:
   (...)
    185     list of dictionaries, one for each objectId.
    186 """
    187 input = {'objectIds':','.join(objectIds), 'lite':lite}
--> 188 result = self.fetch('sherlock/objects', input)
    189 return result

File ~/miniconda3/lib/python3.10/site-packages/lasair/lasair.py:81, in lasair_client.fetch(self, method, input)
     78     except:
     79         pass
---> 81 result = self.fetch_from_server(method, input)
     83 if 'error' in result:
     84     return result

File ~/miniconda3/lib/python3.10/site-packages/lasair/lasair.py:50, in lasair_client.fetch_from_server(self, method, input)
     48 elif r.status_code == 400:
     49     message = 'Bad Request:' + r.text
---> 50     raise LasairError(message)
     51 elif r.status_code == 401:
     52     message = 'Unauthorized'

LasairError: Bad Request:{"error":"{\"message\": \"Object Z not found\"}\n"}
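
The "Object Z not found" message is consistent with line 187 of the traceback: ','.join(objectIds) applied to a bare string joins its individual characters, so the server would be asked about 'Z', 'T', 'F' and so on. A small illustration in plain Python, with no Lasair call involved:

```python
object_id = 'ZTF17aaaetet'

# A bare string is iterated character by character.
as_string = ','.join(object_id)

# A one-element list keeps the ID intact.
as_list = ','.join([object_id])

print(as_string)  # Z,T,F,1,7,a,a,a,e,t,e,t
print(as_list)    # ZTF17aaaetet
```

Wrapping the ID in a list, as in L.sherlock_objects([objectId]), should avoid this particular error, though an object that is genuinely missing would presumably still raise LasairError.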

roy · April 16, 2024, 10:39am

Lydia – Long discussion this morning on if/how the API should be modified. We have decided that it is too complicated to modify the Sherlock calls on the old system, so sorry. We have also decided that users should not be sending 1000 objects to the API, since we are an alert system, not a mining system, so there will be a limit on Lasair-LSST. Hope this helps – Roy

lydiam · April 16, 2024, 1:08pm

This is fair. I will see if there’s another way to collect the data as it is useful for my study. Thank you again for all your help!

gpfrancis · April 17, 2024, 10:20am

Hi Lydia,

I’m not sure exactly what data you need here, but if it is only the information in the “lite” output that you need (i.e. only Sherlock’s highest ranked crossmatch) then it may be more efficient to query the Lasair database directly using the query API rather than hitting Sherlock. Perhaps it is not explained well in the documentation, but the former is a simple database query whereas the latter is a request to have Sherlock recompute the crossmatches for the given sky position, which is obviously a much bigger task and is only really required if you need the full output from Sherlock.

If you do need the full Sherlock output then it may be possible to get the behaviour that you want by doing the operation in two stages: first use the objects API to get a list of ra and dec; handle any errors and missing entries here such that you have a list of good positions; finally query Sherlock using sherlock_position instead of sherlock_objects.

Examples:

Get the top Sherlock classification from the Lasair database for a list of objectIDs. You probably don’t want to pass too large a list here, but it should be a fast query:

    results = L.query("objects.objectId, sherlock_classifications.*",
                  "objects,sherlock_classifications",
                  "objects.objectId IN ('ZTF24aahszxf', 'ZTF19adnwaws')")

This should get a list of positions by doing a query (you could also get this from objects, but query is more efficient), turn the output into two lists (ra and dec), and then run Sherlock at that list of positions to get the full list of crossmatches:

    objects = L.query("objects.objectId, objects.ramean, objects.decmean",
                  "objects",
                  "objects.objectId IN ('ZTF24aahszxf', 'ZTF19adnwaws')")
    ralist = list(map(lambda obj: obj['ramean'], objects))
    declist = list(map(lambda obj: obj['decmean'], objects))
    results = L.sherlock_position(ralist, declist, lite=False)

Note that this doesn’t actually work right now though due to what I’m fairly sure is a bug on our side - please let us know if it’s one that we should prioritise for fixing.
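
If the positions come from the objects API rather than from a query, the "handle missing entries" stage could look roughly like the sketch below. The helper good_positions is hypothetical, and it assumes that each row's objectData dictionary carries ramean and decmean keys; the thread only confirms glatmean and glonmean there, so check the key names against a real response.

```python
def good_positions(object_ids, rows, ra_key='ramean', dec_key='decmean'):
    """Return (ids, ras, decs) for rows that exist and carry a position.

    rows is the list returned by L.objects(object_ids), in which a
    missing object appears as None.
    """
    ids, ras, decs = [], [], []
    for object_id, row in zip(object_ids, rows):
        if not row:
            continue  # missing object, skip it
        data = row.get('objectData', {})
        if ra_key not in data or dec_key not in data:
            continue  # assumed key names not present, skip rather than fail
        ids.append(object_id)
        ras.append(data[ra_key])
        decs.append(data[dec_key])
    return ids, ras, decs
```

The returned ras and decs would then play the role of ralist and declist in the example above, and ids records which objects they belong to.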

Incidentally when Roy says that we don’t really want to be doing large queries on Sherlock we are primarily concerned with the effect of running large batches of, say, 1000 positions at a time; splitting the query into batches of 100 (which I think you may already be doing) is much kinder to the system and our expectation is that we will probably set the batch size limit to something around 100 when we get around to enforcing one.

lydiam · April 18, 2024, 9:25pm

Oh, you are right! L.query does indeed solve the issue of getting the classifications for my objects. And pretty quickly too!

Thank you both!
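
For anyone landing here with a long list of IDs, a rough sketch of how the condition string for L.query could be built in batches of 100, following the batch size suggested above. The helper in_clause_batches is hypothetical, and it assumes the IDs are trusted ZTF-style names with no quote characters in them, since they are pasted straight into the condition.

```python
def in_clause_batches(object_ids, batch_size=100):
    """Yield one "objects.objectId IN (...)" condition per batch of IDs."""
    for start in range(0, len(object_ids), batch_size):
        batch = object_ids[start:start + batch_size]
        # Quote each ID and join them into a comma-separated list.
        quoted = ", ".join("'%s'" % oid for oid in batch)
        yield "objects.objectId IN (%s)" % quoted
```

Each yielded string can be passed as the third argument to L.query in place of the hand-written condition in the examples above. Objects that are not in the database should simply be absent from the result rather than raising an error, which matches what worked in this thread.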