imageDifference task with fakes inserted into calibrated exposures

surhudm · July 20, 2021, 5:19pm

I am using LSSTpipe v21_0_0 to insert fakes into calibrated exposures from HSC with the processCcdWithFakes.py script, following the instructions in New Tasks for Fake Source Insertion. This step creates the fakes_calexp and fakes_src files, which are written out in a tractwise manner. I would like to use imageDifference.py to generate difference images for the calexps into which I inserted the fake objects.

However, it was not clear to me how to tell imageDifference.py to use the fakes and whether it would then output the difference images generated in a tractwise manner.

With some brute-force changes to the imageDifference.py script, I got the code running by making it use fakes_calexp instead of calexp. However, I was not able to make it write the corresponding output files into tractwise directories. I suppose there is a better way to do this. Could someone help me out here?

I am not sure whether these scripts correspond to Gen2 or Gen3. My guess is Gen2, as I am not using any yaml files. I am open to carrying out these tasks in Gen3 if that simplifies things. I could not find any tutorial on this, though, so a pointer would be very helpful.

kfindeisen · July 20, 2021, 7:28pm

Yes, imageDifference.py and similarly-named command-line scripts are Gen 2. Unfortunately, Gen 2 does not provide a framework for choosing which inputs are loaded (unless the task itself provides one), so a brute-force change to the hardcoded values is the only approach.

In Gen 3, the input dataset can be specified as part of the configuration: config.connections.exposure = "fakes_calexp", or preferably config.connections.fakesType = "fakes_", which will automatically set all affected inputs/outputs as appropriate.
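To illustrate why setting fakesType once is preferable to renaming a single connection, here is a minimal pure-Python sketch of template-style name expansion. It is a stand-in, not the LSST implementation: the template strings, the connection keys, and the `resolve_connections` helper are assumptions chosen for illustration, so check the task's connections class for the real definitions.

```python
# Stand-in for template-based dataset naming; NOT the actual lsst.pipe.base code.
# The template strings below are assumed for illustration only.
DEFAULT_TEMPLATES = {"coaddName": "deep", "fakesType": ""}

CONNECTION_TEMPLATES = {
    "exposure": "{fakesType}calexp",
    "subtractedExposure": "{fakesType}{coaddName}Diff_differenceExp",
    "diaSources": "{fakesType}{coaddName}Diff_diaSrc",
}


def resolve_connections(overrides=None):
    """Expand every connection template using defaults plus any overrides."""
    values = dict(DEFAULT_TEMPLATES)
    values.update(overrides or {})
    return {key: template.format(**values)
            for key, template in CONNECTION_TEMPLATES.items()}


default_names = resolve_connections()
fakes_names = resolve_connections({"fakesType": "fakes_"})

for key in CONNECTION_TEMPLATES:
    print(f"{key}: {default_names[key]} -> {fakes_names[key]}")
```

In this sketch one override renames the input and the outputs together, which is the behaviour described for config.connections.fakesType; overriding only the exposure connection would leave the outputs under their default names.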

Unfortunately, there is not yet an official tutorial for processing data with Gen 3 (see Recreating the LSST Science pipeline tutorial (gen 2) only using Generation 3 commands link tasks and the pipetasks). A colleague recommended Stack Club as the best existing resource for getting started. You may be able to convert your existing repository using the butler convert command-line utility, to avoid having to start over from scratch.

I’m not sure I understand what you mean by “in a tractwise manner”, but I do not think this is possible – the granularity of a dataset (per-detector, per-tract, etc.) is a fundamental property, and difficult to change in either Gen 2 or Gen 3.

surhudm · July 21, 2021, 4:42am

Thanks @kfindeisen! It is good to know that the brute-force approach I took is not totally crazy.

Let me explain what I mean by “tractwise manner”.

The outputs of processCcdWithFakes.py are stored like this:

fakes_calexp:
    template: '%(pointing)05d/%(filter)s/tract%(tract)d/fakes_calexp-%(visit)07d-%(ccd)03d.fits'

So a given visit, ccd might fall in multiple tracts. When I carry out the difference imaging, each fakes_calexp can be differenced with a given coadded template, so I would suppose the difference image will also carry the tract number with it. If you look at the $OBS_SUBARU_DIR/policy/HscMapper.yaml file, we see:

deepDiff_differenceExp:
    template: 'deepDiff/%(pointing)05d/%(filter)s/DIFFEXP-%(visit)07d-%(ccd)03d.fits'

without a tract label. I suppose an issue can arise where a CCD falls on the overlap between tracts. Is there a way to tell imageDifference.py to use only the tract that fully contains the CCD, or does it do this automatically? What happens if parallel processes try to process the same fakes_calexp (same visit and ccd) in different tracts? There would be race conditions when writing the output files, and I would like to know if there is a way to avoid them. For example, can I brute-force modify HscMapper.yaml to include a tract, e.g. like this:

deepDiff_differenceExp:
    template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/DIFFEXP-%(visit)07d-%(ccd)03d.fits'
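The path collision can be checked with plain Python, since these templates use printf-style formatting with a mapping. This is a minimal sketch; the data ID values (pointing, filter, visit, ccd, tract) are made up for illustration, and `expand` is a hypothetical helper rather than a butler call.

```python
# Expand the two mapper templates for the same visit/ccd in two tracts.
# All data ID values here are hypothetical.
TEMPLATE_NO_TRACT = (
    "deepDiff/%(pointing)05d/%(filter)s/DIFFEXP-%(visit)07d-%(ccd)03d.fits"
)
TEMPLATE_WITH_TRACT = (
    "deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/"
    "DIFFEXP-%(visit)07d-%(ccd)03d.fits"
)


def expand(template, data_id):
    """Fill a printf-style path template from a data ID dictionary."""
    return template % data_id


base = {"pointing": 1, "filter": "HSC-I", "visit": 1228, "ccd": 49}
id_a = dict(base, tract=9812)
id_b = dict(base, tract=9813)

# Without a tract key both data IDs map to one file, so writers would collide.
print(expand(TEMPLATE_NO_TRACT, id_a))
print(expand(TEMPLATE_NO_TRACT, id_b))

# With the tract key each data ID gets its own directory.
print(expand(TEMPLATE_WITH_TRACT, id_a))
print(expand(TEMPLATE_WITH_TRACT, id_b))
```

Unused keys in the dictionary are ignored by the formatting, which is why the tract-free template still expands when a tract is present in the data ID.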

surhudm · July 21, 2021, 11:45am

@kfindeisen OK, I managed to get this running now. It is a brute-force approach, but I am writing it out so that others in a similar situation can use it:

First I modified the python/lsst/pipe/tasks/imageDifference.py script as:

455c455,458
<         butlerQC.put(outputs, outputRefs)
---
>         # SM edits
>         print("1:", outputRefs, "fakes_"+outputRefs)
>         butlerQC.put(outputs, "fakes_"+outputRefs)
>         # butlerQC.put(outputs, outputRefs)
500c503,505
<         exposure = sensorRef.get("calexp", immediate=True)
---
>         # Retrieve the science image we wish to analyze SM change
>         exposure = sensorRef.get("fakes_calexp", immediate=True)
>         #### exposure = sensorRef.get("calexp", immediate=True)
505c510,512
<         if sensorRef.datasetExists("src"):
---
>         # SM change
>         if sensorRef.datasetExists("fakes_src"):
>         #### if sensorRef.datasetExists("src"):
508c515,516
<             selectSources = sensorRef.get("src")
---
>             selectSources = sensorRef.get("fakes_src")
>             #### selectSources = sensorRef.get("src")
522a531
>         # SM edits
524c533,535
<             sensorRef.put(results.diaSources, self.config.coaddName + "Diff_diaSrc")
---
>             #sensorRef.put(results.diaSources, self.config.coaddName + "Diff_diaSrc")
>             print("2:", "fakes_" + self.config.coaddName + "Diff_diaSrc")
>             sensorRef.put(results.diaSources, "fakes_" + self.config.coaddName + "Diff_diaSrc")
526c537,539
<             sensorRef.put(results.warpedExposure, self.config.coaddName + "Diff_warpedExp")
---
>             #sensorRef.put(results.warpedExposure, self.config.coaddName + "Diff_warpedExp")
>             print("3:", "fakes_" + self.config.coaddName + "Diff_warpedExp")
>             sensorRef.put(results.warpedExposure, "fakes_" + self.config.coaddName + "Diff_warpedExp")
528c541,543
<             sensorRef.put(results.matchedExposure, self.config.coaddName + "Diff_matchedExp")
---
>             #sensorRef.put(results.matchedExposure, self.config.coaddName + "Diff_matchedExp")
>             print("4:", "fakes_" + self.config.coaddName + "Diff_matchedExp")
>             sensorRef.put(results.matchedExposure, "fakes_" + self.config.coaddName + "Diff_matchedExp")
530c545,547
<             sensorRef.put(results.selectSources, self.config.coaddName + "Diff_kernelSrc")
---
>             #sensorRef.put(results.selectSources, self.config.coaddName + "Diff_kernelSrc")
>             print("5:", "fakes_" + self.config.coaddName + "Diff_kernelSrc")
>             sensorRef.put(results.selectSources, "fakes_" + self.config.coaddName + "Diff_kernelSrc")
532c549,551
<             sensorRef.put(results.subtractedExposure, subtractedExposureName)
---
>             #sensorRef.put(results.subtractedExposure, subtractedExposureName)
>             print("6:", "fakes_" + subtractedExposureName)
>             sensorRef.put(results.subtractedExposure, "fakes_" + subtractedExposureName)
1129c1148,1149
<         parser.add_id_argument("--id", "calexp", help="data ID, e.g. --id visit=12345 ccd=1,2")
---
>         parser.add_id_argument("--id", "fakes_calexp", help="data ID, e.g. --id visit=12345 ccd=1,2")
>         #### parser.add_id_argument("--id", "calexp", help="data ID, e.g. --id visit=12345 ccd=1,2")

Then I modified $OBS_SUBARU_DIR/policy/HscMapper.yaml as:

185a186,191
>   fakes_deepDiff_differenceExp:
>     template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/DIFFEXP-%(visit)07d-%(ccd)03d.fits'
>   fakes_deepDiff_warpedExp:
>     template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/WARPEDEXP-%(visit)07d-%(ccd)03d.fits'
>   fakes_deepDiff_matchedExp:
>     template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/MATCHEDEXP-%(visit)07d-%(ccd)03d.fits'
434a441,446
>   fakes_deepDiff_diaSrc:
>     persistable: SourceCatalog
>     python: lsst.afw.table.SourceCatalog
>     storage: FitsCatalogStorage
>     tables: raw_skytile
>     template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/DIASRC-%(visit)07d-%(ccd)03d.fits'
436a449,450
>   fakes_deepDiff_kernelSrc:
>     template: 'deepDiff/%(pointing)05d/%(filter)s/tract%(tract)d/KERNELSRC-%(visit)07d-%(ccd)03d.fits'
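The repeated "fakes_" + ... edits in the imageDifference.py diff could be centralized so that the prefix lives in one place and can be switched off. The following is only a sketch of that idea: `fakes_dataset_name`, `use_fakes`, and the suffix list are hypothetical names that are not part of the LSST stack, and the "Diff_differenceExp" suffix is an assumption about what subtractedExposureName expands to.

```python
# Hypothetical helper; not part of lsst.pipe.tasks.
FAKES_PREFIX = "fakes_"


def fakes_dataset_name(base_name, use_fakes=True):
    """Return the dataset type name, prefixed when processing fakes."""
    return FAKES_PREFIX + base_name if use_fakes else base_name


coadd_name = "deep"  # corresponds to self.config.coaddName in the diff

# Output suffixes taken from the sensorRef.put calls in the diff;
# "Diff_differenceExp" is assumed for subtractedExposureName.
OUTPUT_SUFFIXES = [
    "Diff_differenceExp",
    "Diff_warpedExp",
    "Diff_matchedExp",
    "Diff_diaSrc",
    "Diff_kernelSrc",
]

output_names = [fakes_dataset_name(coadd_name + s) for s in OUTPUT_SUFFIXES]

# Dataset types added to HscMapper.yaml in the second diff.
MAPPER_ENTRIES = {
    "fakes_deepDiff_differenceExp",
    "fakes_deepDiff_warpedExp",
    "fakes_deepDiff_matchedExp",
    "fakes_deepDiff_diaSrc",
    "fakes_deepDiff_kernelSrc",
}

# Every name the task would write must have a mapper entry.
missing = sorted(set(output_names) - MAPPER_ENTRIES)
print("missing mapper entries:", missing)
```

A check like this makes it easy to spot a dataset type that was renamed in the task but never added to the mapper policy.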

kfindeisen · July 21, 2021, 5:04pm

One can certainly name a specific tract using the --id command-line argument. I’m not sure what the Gen 2 framework does otherwise. I believe imageDifference.py has some specific code for this purpose (since one does not normally need to specify the tract when differencing a ccd-based calexp and a patch-based coadd into a ccd-based diffim), but I can’t imagine that it was designed with this situation in mind.
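One way to avoid two processes writing the same visit/ccd output is to pick a single tract per CCD before launching jobs and pass it via --id. The sketch below assumes the overlap fraction of each CCD with each tract has already been computed elsewhere (that step is not shown), and all values are made up. `one_tract_per_ccd` and `id_arguments` are hypothetical helpers; the argument layout follows the --id help string in the task, with the tract key added as suggested above.

```python
def one_tract_per_ccd(overlaps):
    """Pick the tract with the largest overlap for each (visit, ccd).

    overlaps: iterable of (visit, ccd, tract, fraction) tuples, where
    fraction is the share of the CCD area inside that tract.
    """
    best = {}
    for visit, ccd, tract, fraction in overlaps:
        key = (visit, ccd)
        if key not in best or fraction > best[key][1]:
            best[key] = (tract, fraction)
    return {key: tract for key, (tract, _) in best.items()}


def id_arguments(chosen):
    """Build one data ID string per (visit, ccd), each naming one tract."""
    return [
        f"visit={visit} ccd={ccd} tract={tract}"
        for (visit, ccd), tract in sorted(chosen.items())
    ]


# Hypothetical overlaps: CCD 49 straddles two tracts, CCD 50 lies in one.
overlaps = [
    (1228, 49, 9812, 0.35),
    (1228, 49, 9813, 1.0),
    (1228, 50, 9813, 1.0),
]

chosen = one_tract_per_ccd(overlaps)
for arg in id_arguments(chosen):
    print("--id", arg)
```

Because each (visit, ccd) appears exactly once in the resulting list, parallel jobs built from it never target the same output file, whether or not the mapper template includes the tract.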