Using LSST Stack + obs_decam with non-Community Pipeline data

ngarrett787 · July 10, 2020, 8:30pm

I am interning this summer with the Fermilab DES Group and have been working on finding ways to use the LSST Stack and obs_decam with current DECam data. In general, our data (not from the DECam Community Pipeline) has a FITS structure with a primary HDU, a science data HDU, a mask HDU, and a weight HDU, with each file holding the data from a single CCD. After looking through much of the code and the default configuration files, it seems that the scripts expect one file per exposure with a separate HDU for each CCD, which is different from our file structure. Even when we change the extension-name parameters in the configuration files for the ingestion and processing tasks to reflect our HDU structure, the scripts have a difficult time parsing the data and frequently fail. In particular, ingestImagesDecam.py, ingestCalibs.py, and processCcd.py will often link the files, but then either fail to properly ingest the data or are unable to run the various subtasks of processCcd.py. Does anyone have any advice or experience using these systems with non-Community Pipeline data? I have been using the most recent weekly distribution of the stack (currently w_2020_27). I can post more specific errors as necessary.

mrawls · July 10, 2020, 8:41pm

I’m curious where you got DECam FITS files that are one file per CCD - that’s not the standard format for raw DECam data (or Community Pipeline “instcal” data, as you point out). Are you able to retrieve the raw visits you want in the standard format, i.e., a FITS file with 60-some-odd extensions (one per CCD), perhaps from the NOAO/NOIRLab archive (http://archive1.dm.noao.edu or https://astroarchive.noao.edu)? If so, ingesting those raw images, flats, and biases (zeros), building master calibs, and running processCcd is absolutely possible with the LSST Science Pipelines. Trying to start from some other DECam data format is not supported.

ngarrett787 · July 10, 2020, 9:11pm

Thank you for your reply! I noticed this discrepancy when comparing files from testdata_decam and our data, and made a script that puts the data into the standard format (one file, 60ish HDUs); this has had mixed success with data ingestion (and modded config files), but still tends to create problems with the processCcd.py stage. My suspicion is that there is a difference in the metadata of our file headers vs what is expected, but other than a few candidate header items (like PROCTYPE), I haven’t been able to identify all of the key problem differences. Do you happen to know which items are required at a minimum to be included in the headers for the files to successfully pass through?
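
For reference, the first step of that kind of repackaging can be sketched roughly as below. This is only an illustration, not the actual script: it assumes each per-CCD file's header carries the exposure number and CCD number (for example in the EXPNUM and CCDNUM keywords), and that those values have already been read into plain tuples. The function name `group_by_exposure` and the file names are hypothetical.

```python
from collections import defaultdict


def group_by_exposure(records):
    """Group per-CCD files by exposure, ordered by CCD number.

    records: iterable of (path, expnum, ccdnum) tuples. The expnum and
    ccdnum values are assumed to come from each file's header
    (e.g. the EXPNUM and CCDNUM keywords).
    Returns a dict mapping expnum -> list of paths sorted by ccdnum.
    """
    groups = defaultdict(list)
    for path, expnum, ccdnum in records:
        groups[expnum].append((ccdnum, path))
    # Sort each exposure's files by CCD number so the extensions of the
    # combined file would be written in a predictable order.
    return {
        expnum: [path for _, path in sorted(items)]
        for expnum, items in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical file names and numbers, for illustration only.
    records = [
        ("exp100_ccd02.fits", 100, 2),
        ("exp100_ccd01.fits", 100, 1),
        ("exp101_ccd01.fits", 101, 1),
    ]
    for expnum, paths in group_by_exposure(records).items():
        print(expnum, paths)
```

Each group would then be written out as one multi-extension file; that step is left out here because it depends on the FITS library used and on which headers need to be carried over.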

mrawls · July 10, 2020, 9:35pm

You’re a brave soul. I guess I’d suggest downloading a few recent “real raw” files from one of the archive websites and doing a painstaking 1:1 comparison.
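
A minimal sketch of what such a 1:1 comparison could look like, assuming the two headers have already been loaded into keyword-to-value mappings (astropy's `fits.getheader` returns an object that can be used this way). The function name `diff_headers` and the example values are hypothetical; only PROCTYPE comes from this thread.

```python
def diff_headers(reference, candidate, ignore=("COMMENT", "HISTORY", "")):
    """Compare two FITS-style header mappings (keyword -> value).

    Returns a tuple (missing, extra, changed):
      missing: keywords in the reference but not in the candidate
      extra:   keywords in the candidate but not in the reference
      changed: {keyword: (reference_value, candidate_value)} for
               keywords present in both with different values
    """
    skip = set(ignore)
    ref_keys = {k for k in reference if k not in skip}
    cand_keys = {k for k in candidate if k not in skip}
    missing = sorted(ref_keys - cand_keys)
    extra = sorted(cand_keys - ref_keys)
    changed = {
        k: (reference[k], candidate[k])
        for k in sorted(ref_keys & cand_keys)
        if reference[k] != candidate[k]
    }
    return missing, extra, changed


if __name__ == "__main__":
    # Made-up header values, for illustration only.
    real_raw = {"PROCTYPE": "raw", "OBSTYPE": "object", "CCDNUM": 1}
    ours = {"PROCTYPE": "reduced", "CCDNUM": 1, "WGTFLAG": True}
    missing, extra, changed = diff_headers(real_raw, ours)
    print("missing from ours:", missing)
    print("only in ours:", extra)
    print("different values:", changed)
```

Running this per extension, with a "real raw" header as the reference, gives a short list of candidate keywords to check against what the translator linked below reads.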

I shall point you here and run away for now: https://github.com/lsst/astro_metadata_translator/blob/master/python/astro_metadata_translator/translators/decam.py

parejkoj · July 11, 2020, 11:46pm

Note that the files in testdata_decam have been modified in various ways to reduce their size: they should not necessarily be taken to be representative of “real” DECam data.

It sounds like the data you have is post-processed data that you’ve made yourself? Where did you get it from?

The LSST Science Pipelines are made to ingest data in very specific formats, and the DECam Community Pipeline is one of the few non-raw data formats we can ingest. I think you’d have more luck either starting with raws, or writing a formatter and ingester from scratch for your specific data, than trying to shuffle your data into something that looks like the DECam CP format.