mpcDesignation truncated when retrieving data via API using Python pyvo module

When I query the ssotap service with the pyvo module, the values in some fields appear to be truncated. Consider the code below:

import os
import pyvo

data_directory = 'data'

def download_data():

    token = 'YOUR_TOKEN_HERE'
    RSP_TAP_SERVICE = 'https://data.lsst.cloud/api/ssotap'
    token_str = token

    cred = pyvo.auth.CredentialStore()
    cred.set_password("x-oauth-basic", token_str)
    credential = cred.get("ivo://ivoa.net/sso#BasicAA")
    rsp_tap = pyvo.dal.TAPService(RSP_TAP_SERVICE, credential)

    # Create the 'data' directory if it doesn't already exist
    os.makedirs(data_directory, exist_ok=True)

    page_size = 50000 
    offset = 0
    start_date = 60500
    end_date = 60590
    page_number = 0
    while True:

        paged_query = f'''SELECT mpc.mpcDesignation, mpc.mpcNumber, mpc.ssObjectId, mpc.fullDesignation,
                ds.midPointMjdTai, ds.ra, ds.dec, ds.mag, ds.band,
                ss.eclipticBeta, ss.eclipticLambda, ss.phaseAngle, ss.diaSourceId
                FROM dp03_catalogs_1yr.DiaSource AS ds
                JOIN dp03_catalogs_1yr.SSSource AS ss ON ds.diaSourceId = ss.diaSourceId
                JOIN dp03_catalogs_1yr.MPCORB AS mpc ON ds.ssObjectId = mpc.ssObjectId
                WHERE ds.midPointMjdTai BETWEEN {start_date} AND {end_date} OFFSET {offset}
                '''

        # Execute the paged query
        results = rsp_tap.search(paged_query, maxrec=page_size).to_table().to_pandas()

        # Break the loop if there are no results
        if results.empty:
            break

        # Save the page to a CSV file
        csv_filename = os.path.join(data_directory, f'dp03_catalogs_1yr_MPCORB_page_{page_number}.csv')
        results.to_csv(csv_filename, index=False)

        # Increment the offset to get the next page in the next iteration
        offset += page_size
        page_number += 1

download_data()

Run it and compare the mpcDesignation column with the fullDesignation column. The mpcDesignation values appear to be truncated. For instance:

mpcDesignation: 2001 AF2
fullDesignation: 2011 2001 AF26

Note the missing 6 at the end of the mpcDesignation.

If I query via the online portal, the data returned is correct. Am I doing something wrong, or is there an issue with the API & data service?

Hi @rich2020,
Apologies for the delay in replying to this. There has been a fair bit of discussion on this point. Here is some feedback from @gpdf:

We are still investigating the details, but it appears that the declaration in the DP0.3 data model of the “packed” designation mpcDesignation as being of length 8 is inconsistent with the data in the actual database table. What appears to be happening is that the TAP service is not truncating the data to meet its own description of it as an 8-character string, but that the PyVO library used in the notebook is applying the declared length to the query result it receives, and truncating the values before returning them to the caller. It’s arguable which one is a better reaction to the underlying inconsistency – but we should fix the inconsistency itself.

We are going to consult with the team that produced the DP0.3 data to determine whether the data declaration should be changed.
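
To make the mechanism concrete, here is a minimal sketch of what clipping a value to a declared width of 8 does to the designation from the original post. It is only an illustration of the effect, not PyVO's actual code path; the helper names `apply_declared_width` and `looks_truncated` are hypothetical.

```python
def apply_declared_width(value: str, arraysize: int) -> str:
    """Clip a string to a declared fixed width, as a strict client might."""
    return value[:arraysize]


def looks_truncated(value: str, reference: str, arraysize: int) -> bool:
    """Flag a value that fills the declared width and is a proper prefix
    of a longer reference string."""
    return (
        len(value) == arraysize
        and len(reference) > arraysize
        and reference.startswith(value)
    )


stored = "2001 AF26"                         # 9 characters in the table
received = apply_declared_width(stored, 8)   # declared width is 8

print(received)                              # 2001 AF2
print(looks_truncated(received, stored, 8))  # True
```

A check like `looks_truncated` could be run against a downloaded page to see how many rows are affected, assuming a longer reference string is available for comparison.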

@Gerenjie and Pedro Bernardinelli may also have some feedback on this point.

Hi @rich2020,
Just wanted to follow up on your post to clarify that we don’t think anything is wrong in your code/query.

I also wanted to check whether the above comment resolves your issue, or whether something specific is still blocking you that you’d like us to find a workaround for.

Hi -
I also want to follow up on this to see if there was a response regarding the data declaration.
The example given had a provisional designation of 2001 AF26.
I’m not sure where the 2011 came from.
For provisional designation 2001 AF26, the result for the packed designation should be K01A26F.
During the survey, the MPC will use a new extended packed provisional designation system (not the current one) to accommodate the vastly larger number of new designations that will be required each half month.
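
For reference, a rough sketch of the current (non-extended) packing for ordinary minor-planet provisional designations, which reproduces K01A26F for 2001 AF26. It assumes the standard MPC layout of century letter, two-digit year, half-month letter, two-character cycle count, and second letter; the function name `pack_provisional` is hypothetical, and survey designations, comets, and the future extended scheme are not handled.

```python
import re
import string

# Century letters used by the current MPC packed format.
CENTURY_CODES = {18: "I", 19: "J", 20: "K"}
# 0-9, A-Z, a-z: lets the tens part of the cycle count run from 0 to 61.
CYCLE_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase

_PROVISIONAL = re.compile(r"(\d{2})(\d{2}) ([A-HJ-Y])([A-HJ-Z])(\d*)")


def pack_provisional(designation: str) -> str:
    """Pack a designation such as '2001 AF26' into its 7-character form."""
    match = _PROVISIONAL.fullmatch(designation.strip())
    if match is None:
        raise ValueError(f"not a supported provisional designation: {designation!r}")
    century, year, half_month, second, cycle_text = match.groups()
    cycle = int(cycle_text) if cycle_text else 0
    if int(century) not in CENTURY_CODES or cycle > 619:
        raise ValueError(f"outside the range of the current scheme: {designation!r}")
    tens, ones = divmod(cycle, 10)
    return (
        f"{CENTURY_CODES[int(century)]}{year}{half_month}"
        f"{CYCLE_CHARS[tens]}{ones}{second}"
    )


print(pack_provisional("2001 AF26"))  # K01A26F
```

Note that the packed form is 7 characters long, so a correctly packed mpcDesignation would fit within the declared length of 8.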

This object is numbered and its designated number is 37291.
This number is small enough that its packed number is also 37291.
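
The packed number can be sketched the same way. This assumes the current MPC convention as I understand it: numbers below 100000 are zero-padded to five digits, numbers up to 619999 replace the leading two digits with a single letter, and larger numbers use a tilde followed by four base-62 characters. The function name `pack_number` is hypothetical.

```python
import string

BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def pack_number(number: int) -> str:
    """Pack a permanent minor-planet number into five characters."""
    if number < 1:
        raise ValueError("minor-planet numbers start at 1")
    if number < 100_000:
        # Small numbers are simply zero-padded.
        return f"{number:05d}"
    if number < 620_000:
        # The leading two digits (10..61) collapse into one letter.
        head, tail = divmod(number, 10_000)
        return f"{BASE62[head]}{tail:04d}"
    # Larger numbers: '~' plus (number - 620000) in four base-62 characters.
    remainder = number - 620_000
    chars = []
    for _ in range(4):
        remainder, digit = divmod(remainder, 62)
        chars.append(BASE62[digit])
    if remainder:
        raise ValueError("number too large for the packed format")
    return "~" + "".join(reversed(chars))


print(pack_number(37291))  # 37291
```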

Thanks @rich2020, @ryanlau, and @mbrucker.

To summarize the answer to the original question: no, you’re not doing anything wrong, and no, there is no issue with the API, but there are issues with the designation columns in the MPCORB table of DP0.3.

These issues are now included in the list of known issues for DP0.3: the fullDesignation column has a prefix of “2011”, and the mpcDesignation column has not been packed as expected and has an arraysize parameter of just 8, which causes a truncated version of mpcDesignation to be returned. As @ryanlau mentions, an interim fix is under consideration (e.g., raising the arraysize parameter).

However, for now, the recommendation for users is to use the fullDesignation and ignore or remove the prefix “2011”.
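
A minimal sketch of that workaround, assuming every fullDesignation value carries the “2011” prefix followed by whitespace, as in the example from the original post. The helper name `strip_designation_prefix` is hypothetical.

```python
def strip_designation_prefix(full_designation: str, prefix: str = "2011") -> str:
    """Remove one leading prefix from a fullDesignation value.

    Only the first occurrence is removed, so a genuine 2011 designation
    stored as '2011 2011 AB12' still comes back as '2011 AB12'.
    """
    text = full_designation.strip()
    if text.startswith(prefix):
        return text[len(prefix):].lstrip()
    return text


print(strip_designation_prefix("2011 2001 AF26"))  # 2001 AF26
```

With the pandas DataFrame from the original script, the same idea could be applied column-wise, for example with `results["fullDesignation"].str.replace(r"^2011\s+", "", regex=True)`.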