We are creating a merged reference catalog for WFST by combining Pan-STARRS (g/r/i/z/y) and SDSS (u/g/r/i/z) data. Our goal is to use Pan-STARRS as the primary source and supplement it with u-band photometry from SDSS.
To do this, we have generated thousands of CSV files, with a schema like:
healpix_id,coord_ra,coord_dec,source_type,sdss_sep_arcsec,ps_id,ps_coord_ra,ps_coord_dec,ps_parent,ps_g_flux,ps_r_flux,ps_i_flux,ps_z_flux,ps_y_flux,ps_i_fluxSigma,ps_y_fluxSigma,ps_r_fluxSigma,ps_z_fluxSigma,ps_g_fluxSigma,ps_coord_ra_err,ps_coord_dec_err,ps_epoch,ps_pm_ra,ps_pm_dec,ps_pm_ra_err,ps_pm_dec_err,ps_footprint,sdss_id,sdss_coord_ra,sdss_coord_dec,sdss_u_flux,sdss_g_flux,sdss_r_flux,sdss_i_flux,sdss_z_flux,sdss_u_fluxErr,sdss_g_fluxErr,sdss_r_fluxErr,sdss_i_fluxErr,sdss_z_fluxErr,ps_g_mag,ps_g_mag_err,ps_r_mag,ps_r_mag_err,ps_i_mag,ps_i_mag_err,ps_z_mag,ps_z_mag_err,ps_y_mag,ps_y_mag_err,sdss_u_mag,sdss_u_mag_err,sdss_g_mag,sdss_g_mag_err,sdss_r_mag,sdss_r_mag_err,sdss_i_mag,sdss_i_mag_err,sdss_z_mag,sdss_z_mag_err
143800,1.2401491934085658,-0.5785334509620012,ps_only,,68220710552823264,1.2401491934085658,-0.5785334509620012,0,0.0130986403673887,0.0162190850824117,0.0172992087900638,0.0176045950502157,0.0177184622734785,0.00955989677459,0.0097915846854448,0.0089629981666803,0.0097286598756909,0.0072385766543447,0.0,0.0,1419873408.0,-0.000759758579079,-0.0074059693142771,0.0033094324171543,0.0032997245434671,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13.60700007619344,0.6000000400122356,13.374999742179286,0.6000000347305979,13.305000021391912,0.6000000443884193,13.28600052313188,0.6000000693390526,13.279000551756983,0.6000000356044211,,,,,,,,,,
143800,1.2403506985720716,-0.5782874534477238,ps_only,,68240710668420169,1.2403506985720716,-0.5782874534477238,0,0.0003948445082642,0.0007537476485595,0.0009279857040382,0.0010814981069415,0.0011706476798281,1.9196256744180573e-06,0.0006469238433055,0.000416536378907,0.0005976579268462,0.0002181991440011,0.0,0.0,1419834880.0,0.0024005575105547,-0.0075076259672641,0.0053496062755584,0.0116071421653032,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,17.409000367379097,0.6000000140073345,16.707000696540142,0.6000000504711587,16.48121240777254,0.0022459474162469,16.315001213319594,0.5999999861976275,16.229000100505232,0.6000000260550253,,,,,,,,,,
143800,1.240528342891102,-0.579019387292847,ps_only,,68180710770269891,1.240528342891102,-0.579019387292847,0,0.0031508377287536,0.005567061714828,0.0072031673043966,0.0075444234535098,0.0080394493415951,2.4798862796160392e-05,0.004442763980478,0.0030764720868319,0.0041692028753459,0.0017412174493074,0.0,0.0,1419774976.0,-0.0045610321685671,-0.0070206765085458,0.003365577198565,0.006067348178476,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.1540005293913,0.6000000596087445,14.53600053279219,0.600000017730073,14.256256867190974,0.0037379422185829,14.2060004810312,0.6000000576106188,14.136999864869004,0.6000000121703924,,,,,,,,,,
143800,1.2405629083846008,-0.5785754401720666,ps_only,,68220710790100381,1.2405629083846008,-0.5785754401720666,0,0.0118256760761141,0.0192852802574634,0.0224504955112934,0.0236606076359748,0.0247073490172624,0.0124066034331917,0.0136537859216332,0.0106574399396777,0.0130753358826041,0.0065351105295121,0.0,0.0,1419872512.0,0.0091399122029542,-0.0064788833260536,0.0032761241309344,0.0032660374417901,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13.718000675241218,0.6000000387562556,13.187000736031615,0.6000000123444055,13.022000795075792,0.600000054284627,12.96500088803995,0.6000000412301947,12.918000246933373,0.6000000120116724,,,,,,,,,,
143800,1.2408377203169474,-0.5792474266165331,ps_only,,68170710947804234,1.2408377203169474,-0.5792474266165331,0,6.020408909535036e-05,0.000141392374644,0.0002418772637611,0.0003749956085812,0.000443022523541,1.988488520510145e-06,0.000244823313551,7.813631964381784e-05,0.0002072302304441,3.327001104480587e-05,0.0,0.0,1419962624.0,0.0293840561062097,-0.0284562855958938,0.0104350410401821,0.0279433988034725,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,19.451000647823875,0.6000000343305314,18.524000651305187,0.6000000449629148,17.941078004779264,0.00892590707325,17.465000167496978,0.5999999967061043,17.284001105816493,0.6000000477776969,,,,,,,,,,
We then created the config file based on the documentation at ConvertReferenceCatalogConfig:
# The name of the output reference catalog dataset.
config.dataset_config.ref_dataset_name = "ps_and_sdss"
# Gen3 butler wants all of our refcats to have the same indexing depth.
config.dataset_config.indexer['HTM'].depth = 7
# Ingest the data in parallel with this many processes.
config.n_processes = 16
# These define the names of the coordinate columns in our CSV files
# (the comments in this file were adapted from a Gaia example config).
config.ra_name = "coord_ra"
config.dec_name = "coord_dec"
# NOTE: these names have `_flux` appended to them when the output Schema is created;
# the matching error columns are listed in mag_err_column_map below.
config.mag_column_list = ["ps_g_mag", "ps_r_mag", "ps_i_mag", "ps_z_mag", "ps_y_mag", "sdss_u_mag", "sdss_g_mag", "sdss_r_mag", "sdss_i_mag", "sdss_z_mag"]
config.mag_err_column_map = {"ps_g_mag":"ps_g_mag_err", "ps_r_mag":"ps_r_mag_err", "ps_i_mag":"ps_i_mag_err", "ps_z_mag":"ps_z_mag_err", "ps_y_mag":"ps_y_mag_err", "sdss_u_mag":"sdss_u_mag_err", "sdss_g_mag":"sdss_g_mag_err", "sdss_r_mag":"sdss_r_mag_err", "sdss_i_mag":"sdss_i_mag_err", "sdss_z_mag":"sdss_z_mag_err"}
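One thing we can rule out ourselves is a mismatch between the column names in the config and the CSV header. The following is only a minimal sketch of such a check, using the Python standard library; `missing_columns` is a hypothetical helper of ours, not part of the LSST stack.

```python
import csv


def missing_columns(csv_path, required):
    """Return the names in `required` that are absent from the CSV header.

    Hypothetical helper for illustration; it only reads the first line.
    """
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

Calling it with the coordinate names plus every key and value of `mag_err_column_map` should return an empty list if the config and the files agree.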
We ran the command:
convertReferenceCatalog ps_and_sdss catalog_config.cfg "/data/xxx/pipeline/catalog/final_catalog/*.csv"
However, we encountered the following errors:
lsst.ConvertReferenceCatalogTask INFO: Creating 131072 file locks.
lsst.ConvertReferenceCatalogTask INFO: File locks created.
lsst.ConvertReferenceCatalogTask ERROR: Failure preparing data for: /data/xxx/pipeline/catalog/final_catalog/1438xx.csv
lsst.ConvertReferenceCatalogTask ERROR: Failure preparing data for: /data/xxx/pipeline/catalog/final_catalog/1948xx.csv
lsst.ConvertReferenceCatalogTask ERROR: Failure preparing data for: /data/xxx/pipeline/catalog/final_catalog/2273xx.csv
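To inspect the failing files one at a time, it helps to pull their paths out of the log. This is a rough sketch that assumes every failure line has the same layout as the ones shown above; `failed_files` is a hypothetical helper name.

```python
import re

# Assumes the log format shown above: "... ERROR: Failure preparing data for: <path>"
FAILURE_PATTERN = re.compile(r"ERROR: Failure preparing data for: (?P<path>\S+)")


def failed_files(log_text):
    """Return the unique file paths from failure lines, in order of first appearance."""
    seen = []
    for match in FAILURE_PATTERN.finditer(log_text):
        path = match.group("path")
        if path not in seen:
            seen.append(path)
    return seen
```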
What could be causing the “Failure preparing data” error? We suspect it might be related to:
- Missing values (we use nan for non-matched sources)?
- Missing magnitude errors for SDSS (but we do provide sdss_u_mag_err, etc. in mag_err_column_map)?
- Unit mismatch (our RA/Dec values are in radians rather than degrees)?
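To test the first and last of these suspicions on a single file, we put together the rough diagnostic below. It is a sketch using only the standard library: `summarize_csv` and `is_missing` are hypothetical helpers, `MAG_COLS` is shortened for illustration, and the radians check is only a heuristic, since a small patch of sky near RA = 0, Dec = 0 in degrees would pass it too.

```python
import csv
import math

# Column names taken from our config; MAG_COLS is shortened for illustration.
RA_COL, DEC_COL = "coord_ra", "coord_dec"
MAG_COLS = ["ps_g_mag", "ps_r_mag", "sdss_u_mag"]


def is_missing(text):
    """True for empty cells, NaN, or anything that does not parse as a float."""
    text = (text or "").strip()
    if text == "":
        return True
    try:
        return math.isnan(float(text))
    except ValueError:
        return True


def summarize_csv(path):
    """Count missing magnitude cells and guess the RA/Dec unit for one file."""
    blank = {col: 0 for col in MAG_COLS}
    ra_max = 0.0
    dec_abs_max = 0.0
    n_rows = 0
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            n_rows += 1
            for col in MAG_COLS:
                if is_missing(row.get(col)):
                    blank[col] += 1
            ra_max = max(ra_max, float(row[RA_COL]))
            dec_abs_max = max(dec_abs_max, abs(float(row[DEC_COL])))
    # Heuristic: RA <= 2*pi and |Dec| <= pi/2 across the file suggests radians.
    looks_like_radians = (
        n_rows > 0 and ra_max <= 2 * math.pi and dec_abs_max <= math.pi / 2
    )
    return {"rows": n_rows, "blank": blank, "looks_like_radians": looks_like_radians}
```

If the coordinates do turn out to be in radians, `math.degrees` would convert individual values to degrees.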
Thank you for any guidance!