Adding new data to a butler gen3 [DECam]

Hi all,

I already reduced a set of DECam images using the steps from the Merian Data Processing guide.

I now want to include new observations and new sets of calibrations, but I am getting an error. I ingested the new raw data and made the master bias (output to 'DECam/calib/jhernandez/bias/'), but when I run the step that generates the crosstalk sources I get:

ValueError: Output CHAINED collection ‘DECam/calib/jhernandez/crosstalk’ exists, but it ends with a different sequence of input collections than those given: ‘DECam/calib/jhernandez/bias/20230619T195847Z’ != ‘DECam/raw/all’ in inputs=(‘DECam/raw/all’, ‘DECam/calib/jhernandez/bias/20230620T191020Z’, ‘DECam/calib/jhernandez/bias/20230619T195847Z’, ‘DECam/calib/jhernandez/bias/20230403T204634Z’, ‘DECam/calib/jhernandez/bias/20230330T133023Z’, ‘DECam/calib/curated/19700101T000000Z’, ‘DECam/calib/unbounded’) vs DECam/calib/jhernandez/crosstalk=(‘DECam/calib/jhernandez/crosstalk/20230410T001625Z’, ‘DECam/calib/jhernandez/crosstalk/20230410T000303Z’, ‘DECam/calib/jhernandez/crosstalk/20230410T000154Z’, ‘DECam/raw/all’, ‘DECam/calib/jhernandez/bias/20230403T204634Z’, ‘DECam/calib/jhernandez/bias/20230330T133023Z’, ‘DECam/calib/curated/19700101T000000Z’, ‘DECam/calib/unbounded’).

I am running exactly the same command as in the first run, in this case:

pipetask --long-log run --register-dataset-types -j 12 \
--skip-existing \
-b $REPO --instrument lsst.obs.decam.DarkEnergyCamera \
-i DECam/raw/all,DECam/calib/jhernandez/bias,DECam/calib/unbounded \
-o DECam/calib/jhernandez/crosstalk \
-p $DRP_PIPE_DIR/pipelines/DECam/DRP-Merian.yaml#step0 \
-d "instrument='DECam' AND exposure IN $FLATEXPS"

I imagine I have to do something different from when everything is run for the first time, but I can't figure out what.

Hi Joaquín, the error you're getting is expected here. You're trying to modify one of your existing CHAINED collections on disk by adding new data to it. If this were allowed, it could change the provenance of your existing data reductions that rely on this CHAINED collection as an input.
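As a rough illustration of why the check fails, here is a minimal sketch inferred from the error message alone. It is not the actual pipetask code, and the function name first_tail_mismatch is hypothetical. It assumes the inputs you give and the existing output chain are compared from the end, so the two new bias runs now in the input list mean the tails no longer line up:

```python
def first_tail_mismatch(inputs, chain):
    """Walk both lists from the end and return the first differing pair.

    Hypothetical helper: a simplified model of the consistency check,
    inferred from the error message, not taken from the pipetask source.
    Returns None if the shorter list matches the tail of the longer one.
    """
    for given, existing in zip(reversed(inputs), reversed(chain)):
        if given != existing:
            return given, existing
    return None


# Flattened input collections, as listed in the error message.
inputs = [
    "DECam/raw/all",
    "DECam/calib/jhernandez/bias/20230620T191020Z",
    "DECam/calib/jhernandez/bias/20230619T195847Z",
    "DECam/calib/jhernandez/bias/20230403T204634Z",
    "DECam/calib/jhernandez/bias/20230330T133023Z",
    "DECam/calib/curated/19700101T000000Z",
    "DECam/calib/unbounded",
]

# Contents of the existing output chain, as listed in the error message.
chain = [
    "DECam/calib/jhernandez/crosstalk/20230410T001625Z",
    "DECam/calib/jhernandez/crosstalk/20230410T000303Z",
    "DECam/calib/jhernandez/crosstalk/20230410T000154Z",
    "DECam/raw/all",
    "DECam/calib/jhernandez/bias/20230403T204634Z",
    "DECam/calib/jhernandez/bias/20230330T133023Z",
    "DECam/calib/curated/19700101T000000Z",
    "DECam/calib/unbounded",
]

mismatch = first_tail_mismatch(inputs, chain)
print(mismatch)
```

With these two lists, the first pair that differs is the same pair named in your ValueError: the 20230619T195847Z bias run on the input side against DECam/raw/all in the existing chain.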

There are ways around this if you're absolutely sure that you don't need that provenance preserved. However, I think a safer approach in this instance would be to output your new crosstalk corrections to an entirely new collection, e.g.:

pipetask --long-log run --register-dataset-types -j 12 \
--skip-existing \
-b $REPO --instrument lsst.obs.decam.DarkEnergyCamera \
-i DECam/raw/all,DECam/calib/jhernandez/bias,DECam/calib/unbounded \
-o DECam/calib/jhernandez/crosstalk_202306 \
-p $DRP_PIPE_DIR/pipelines/DECam/DRP-Merian.yaml#step0 \
-d "instrument='DECam' AND exposure IN $FLATEXPS"

Then, when running data reductions downstream, you can add both your old crosstalk collection and your new one as part of a comma-separated list, e.g.:

-i DECam/raw/all,DECam/calib/jhernandez/crosstalk,DECam/calib/jhernandez/crosstalk_202306,...

Thank you very much Kevin!