Source Association Problem - Multiple DiaObjects at same position

There appears to be a problem with source association in DP0.2: multiple DiaObjects exist at the same position, within 0.5’’ of one another. Based on the source association pipeline tasks, a DiaSource within 0.5’’ of a pre-existing DiaObject should be matched to that DiaObject instead of having a new one created, but this does not seem to be happening in DP0.2.

I ran a test to see how prevalent this issue is by grabbing 1000 DiaObjects (the first 1000 returned by the query, each with more than 25 DiaSources) and then doing a coordinate search at the ra and dec of each one, counting how many distinct DiaObjects have DiaSources within 0.5’’ of that position.

import numpy as np

# Assumes `service` is an already-authenticated TAP service handle
# (on the Rubin Science Platform, e.g. from lsst.rsp.get_tap_service("tap")).

nDiaSources_min = 25

# 1000 DiaObjects with more than nDiaSources_min associated DiaSources
results = service.search("SELECT TOP 1000 "
                         "ra, decl, diaObjectId, nDiaSources "
                         "FROM dp02_dc2_catalogs.DiaObject "
                         "WHERE nDiaSources > " + str(nDiaSources_min))
DiaObjs = results.to_table()
del results

# NDup[i] = number of distinct diaObjectIds among the DiaSources found
# within 0.5'' (0.5 / 3600 = 0.000139 deg) of DiaObject i
NDup = np.zeros(len(DiaObjs), dtype=int)
for i in range(len(DiaObjs)):
    ra = DiaObjs['ra'][i]
    decl = DiaObjs['decl'][i]
    results = service.search("SELECT ra, decl, diaObjectId, diaSourceId, ccdVisitId, "
                             "filterName, midPointTai "
                             "FROM dp02_dc2_catalogs.DiaSource "
                             "WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), "
                             "CIRCLE('ICRS', " + str(ra) + ", "
                             + str(decl) + ", 0.000139)) = 1 ", maxrec=100000)
    DiaSrcs = results.to_table()
    del results
    NDup[i] = len(set(DiaSrcs['diaObjectId']))

From running the code above and executing the following:

len(NDup[NDup>1])/len(NDup)

I find that ~70% of the DiaObjects have at least one other DiaObject at the same position.
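For anyone who wants to sanity-check the counting logic offline, without a TAP service, here is a minimal sketch. It assumes a plain list of (id, ra, dec) tuples, and the function names and the three example positions are invented for illustration. Note that it counts neighbouring objects directly, rather than going through the DiaSource table as the query above does, so the two numbers are related but not identical.

```python
import math

# 0.5 arcsec in degrees, about 0.000139 as used in the query above
MATCH_RADIUS_DEG = 0.5 / 3600.0


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def count_neighbours(objects, radius_deg=MATCH_RADIUS_DEG):
    """Return {object_id: number of OTHER objects within radius_deg}.

    `objects` is a list of (object_id, ra_deg, dec_deg) tuples.
    Brute force, O(N^2), so only suitable for small samples.
    """
    counts = {obj_id: 0 for obj_id, _, _ in objects}
    for i, (id_a, ra_a, dec_a) in enumerate(objects):
        for id_b, ra_b, dec_b in objects[i + 1:]:
            if angular_separation_deg(ra_a, dec_a, ra_b, dec_b) <= radius_deg:
                counts[id_a] += 1
                counts[id_b] += 1
    return counts


def duplicate_fraction(counts):
    """Fraction of objects with at least one neighbour inside the radius."""
    return sum(1 for n in counts.values() if n > 0) / len(counts)


# Made-up positions: objects 1 and 2 are 0.3 arcsec apart, object 3 is far away.
example_objects = [
    (1, 60.0, -30.0),
    (2, 60.0, -30.0 + 0.3 / 3600.0),
    (3, 60.1, -30.0),
]
example_counts = count_neighbours(example_objects)
example_fraction = duplicate_fraction(example_counts)
```

For a real sample of thousands of objects, a spatial index or a catalog cross-match tool would be a better choice than the double loop.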

Since this issue could impact Rubin science (e.g. transient and variable statistics, light curves), it would be great to understand and address what might be causing this.

Lastly, I’ll note that this problem is related to my previous post, here:

Yes, this is a known issue, or rather several interacting issues; see DM-41518, “Investigate duplicate diaSourceIds in DM-37699”, on Jira. I think @isullivan could give you more details.

The association pipeline that’s relevant for DM-37699 is different from what was used to associate DP0.2, so it’s not obvious to me that they would be directly related. There is a common thread, however, which is that race conditions between batch workers could create multiple DIAObjects where only one should exist. My hypothesis is that that’s the source of this issue.

Thanks for the feedback on this @kfindeisen and @ebellm. OK, so from what I understand the DiaObject duplication issue likely isn’t associated with the DM-37699 issue?

Eric, you mention that this may be due to rare conditions from the batch workers – would you happen to know if there’s a way to investigate whether that’s the origin of the issue? Given that the issue seems to affect ~70% of the DiaObjects, it does not seem that the duplication is rare.

A straightforward way to check if it’s due to race conditions would be to rerun the association step on a subset of data using a single thread.
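To illustrate why a single-threaded rerun would be diagnostic, here is a toy model of the hypothesised race. It is not the actual association code: the `associate` function, the ID ranges, and the two made-up visits are assumptions for illustration only. Two visits detect the same sky position; if each worker matches against a catalog snapshot taken before the other has written its result, both create a new object.

```python
import math

MATCH_RADIUS_DEG = 0.5 / 3600.0  # 0.5 arcsec


def is_match(ra, dec, obj_ra, obj_dec, radius_deg=MATCH_RADIUS_DEG):
    """Small-angle separation test; adequate for sub-arcsecond offsets."""
    dra = (ra - obj_ra) * math.cos(math.radians(dec))
    return math.hypot(dra, dec - obj_dec) <= radius_deg


def associate(sources, known_objects, next_id):
    """Toy association: match each source to a known object, else create one.

    Returns (newly created objects, next free id). `known_objects` is copied,
    so the caller's catalog is not modified.
    """
    objects = list(known_objects)
    created = []
    for ra, dec in sources:
        if not any(is_match(ra, dec, o_ra, o_dec) for _, o_ra, o_dec in objects):
            new_obj = (next_id, ra, dec)
            objects.append(new_obj)
            created.append(new_obj)
            next_id += 1
    return created, next_id


# Two hypothetical visits that detect the same source, 0.1 arcsec apart in ra.
visit_a = [(60.0, -30.0)]
visit_b = [(60.0 + 0.1 / 3600.0, -30.0)]


def run_serial():
    """One worker: each visit sees the objects written by the previous one."""
    catalog = []
    next_id = 1
    for visit in (visit_a, visit_b):
        created, next_id = associate(visit, catalog, next_id)
        catalog.extend(created)
    return catalog


def run_with_stale_snapshot():
    """Two workers: both read the (empty) catalog before either one writes."""
    snapshot = []
    created_a, _ = associate(visit_a, snapshot, next_id=1)
    created_b, _ = associate(visit_b, snapshot, next_id=1001)
    return created_a + created_b


serial_catalog = run_serial()
racy_catalog = run_with_stale_snapshot()
```

In this toy setup the serial run ends with one object and the stale-snapshot run ends with two at essentially the same position. If a real single-threaded rerun removed the duplicates in the same way, that would support the race-condition hypothesis.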

That sounds like a good idea, thanks. Also, I initially read “race conditions” in your post as a typo for “rare conditions”; I think I see what you mean now.