Possible bug: High variance outlines in assembleCoadd data products

smotherh · March 20, 2019, 5:51pm

ROWS:
Difference image (imageDifference.py)
Coadded template (assembleCoadd.py)
Single Skymap Image (makeCoaddTempExp.py)
Calexp (processCcd.py) (independently warped against another calexp)
NOAO Instcal file

COLUMNS:
Science | Mask Plane | Variance Plane

This grid of images shows the workflow of taking an NOAO instcal and producing a difference image using a coadded template. In one NOAO instcal (bottom row), there exists a cosmic ray. This cosmic ray is properly flagged in the instcal. In processCcd (or potentially in the manual warping afterwards), additional flags are added, and the variance plane is widened compared to the wtmap from NOAO. The UNWARPED version of the calexp is then fed into makeDiscreteSkyMap.py and makeCoaddTempExp.py (–makePsfMatched=True) to create row three. In this row, the variance plane is greatly increased around the cosmic ray (potentially due to the Psf matching), and there is a very visible grid pattern in the “normal” parts of the image. The coadded template from assembleCoadd.py is shown in row two. Here, there also exists a strong grid pattern (not shown), but more concerning, the clipped mean filtering that is supposed to remove the high variance seems to fail. The cosmic ray is not visible in the coadded template science image, but the outline is visible in the coadded template variance image. These values are ~1e6. This could be because the clipped mean filtering only applies to the areas of high flux in the science image, meaning that the outline of the broadened variance plane does not get removed. NOTE: these artifacts exist all over the image, and are not likely unique to cosmic rays. Finally, when another visit uses the coadded template in imageDifference.py, very high variance relics remain due to the high variance in the coadd. These values are ~1e6-1e13.

Potential problems:

High variance relics remain in the coadd variance plane.
Large-scale grid pattern exists in the noise of the coadd and tempExp variance plane.
Variance around objects like cosmic rays becomes much broader in tempExp (perhaps due to psf matching).
Mask plane in tempExp and coadd does not display properly in ds9.

Is this something people are familiar with? Is there a solution?

yusra · March 21, 2019, 2:54pm

Hi Hayden! Welcome to community.lsst. Let me first get #2 and #4 out of the way since they have simple explanations (Answers to #1 and #3 will follow in the next post).

Moire patterns are expected in the variance of warps. In fact, I can tell a lot about about the relationship between the skymap projection and your calexp by looking at the warp’s the variance plane. e.g. there’s no rotation. You can think of each pixel in the output image as a weighted mean of the overlapping pixels in the input image, where the warping kernel provides the weighting. For each pixel in the output image, a different number/weighting of pixels from the input image are contributing. The more pixels from the input image that contribute to the mean, the lower the variance of that mean, which gets recorded in the variance plane of the output image. You’re seeing the number pixels from the input image beating with the output image.

'4. Older versions of ds9 can’t read our compressed masks and display zeros. Upgrade your ds9. See Mask disappears after running detectCoaddSources.py.
Also, displaying masks as integers is not our favorite way to visualize them:

For ds9, see https://pipelines.lsst.io/getting-started/display.html for using afwDisplay to overlay masks with different colors.
For matplotlib, here’s a snippet I use:

fig = plt.figure(figsize=(12,12))
for i, (name, bit) in enumerate(mask.getMaskPlaneDict().items()):
    fig.add_subplot(4, 5, i + 1)
    arr = np.where(mask.array & 2**bit, 1, 0)
    plt.imshow(arr, origin='lower', cmap='Greys', interpolation='nearest')
    plt.title(name)
    plt.axis('off')

yusra · March 22, 2019, 3:44am

#3: The variance gets spread out (very slightly) after warping because resampling described above, and very noticeably during PSF-matching. PSF-matching usually matches a smaller PSF to a broader model. Stars become broader, everything becomes broader… In your example, you have variance on the cosmics that is orders of magnitude greater than the surrounding pixels. This means than even when the corner of the PSF-matching convolution kernel touches the cosmic, the contribution is noticeable, and the result is the variance of the cosmic increasing by the size of the kernel.

Why is the variance 1e14 you might ask? Because the community pipeline set the weights to zero, to indicate “undefined” for bad pixels (That would be infinite variance). The stack can’t handle negative, inf, or NaN variance values, so we reset them to something outrageously large.

github.com

lsst/obs_decam/blob/master/python/lsst/obs/decam/decamMapper.py#L208


    idxBleed = np.where(dqmArr & 64)
    idxEdge = np.where(dqmArr & 512)
    mArr[idxBad] |= mask.getPlaneBitMask("BAD")
    mArr[idxSat] |= mask.getPlaneBitMask("SAT")
    mArr[idxIntrp] |= mask.getPlaneBitMask("INTRP")
    mArr[idxCr] |= mask.getPlaneBitMask("CR")
    mArr[idxBleed] |= mask.getPlaneBitMask("SAT")
    mArr[idxEdge] |= mask.getPlaneBitMask("EDGE")
    return mask


def translate_wtmap(self, wtmap):
    wtmArr = wtmap.getArray()
    idxUndefWeight = np.where(wtmArr <= 0)
    # Reassign weights to be finite but small:
    wtmArr[idxUndefWeight] = min(1e-14, np.min(wtmArr[np.where(wtmArr > 0)]))
    var = 1.0 / wtmArr
    varim = afwImage.ImageF(var)
    return varim


def bypass_instcal(self, datasetType, pythonType, butlerLocation, dataId):
    # Workaround until I can access the butler

Feel free to submit a PR to change this.

yusra · March 22, 2019, 4:32am

First off, that difference image is gorgeous.

And finally #1. The behavior that I expect from either of these two cases:

A) If the CR has been interpolated over in ISR and DOES NOT appear in the image plane of the warp, I expect those pixels to contribute to the coadd, and therefore I expect the affected pixels to have higher variance in the coadd.
B) If the CR has NOT been interpolated over and DOES appear in the image plane of the warp, then I expect it to be clipped as an outlier and it NOT appear in the variance plane.

.

(Figure: Example of case A. left: variance, right: image, top: warp, bottom: coadd) for HSC-I 9615 5,5 24494.)

(Figure: Example of case B. left: variance, right: image, top: warp, bottom: coadd) for HSC-I 9615 5,5 24494.)

This is the behavior I’ve been seeing in both HSC and Decam images.
Note that by default, we usually run our own ISR on Decam images, so in general they’re not subject to the community pipeline undefined-variance described in #3.

Now, your cosmic ray’s variance is unlike anything I’ve seen on our HSC/Decam data, with the low-valued core and high-valued edges. This is an example of case B. It appears in the warp, and was removed from the coadd by the artifact rejection. My hypothesis is the inf variance from the community pipeline got broadened more than the mask plane. If you send over the fits and config I’d confirm.

What’s your second image from the bottom? I don’t understand what “independently warped against another calexp” means.

yusra · March 22, 2019, 4:38am

Also, some examples of how calexp masks propagate to coadd masks:

smotherh · March 22, 2019, 6:17pm

Thank you for your detailed responses. When I said “independently warped against another calexp”, I mean that we used the following code script to warp our stacks of images to the common sky plane of the first image:

import sys
from os import path, mkdir
from lsst.afw import image as afwImage
from lsst.afw.math.warper import Warper
import pickle

diffim_topdirstr = "/astro/store/epyc/users/smotherh/DECAM_Data_Reduction/warps_calexp/{0:03d}/calexp"
save_topdirstr = "/astro/store/epyc/users/smotherh/DECAM_Data_Reduction/warps_calexp/{0:03d}/"

def warp_field_images(groupid):    
    group_id = int(groupid)

    with open('PickledPointings.pkl', 'rb') as f:
        Pointing_Groups = pickle.load(f)
    visit_num = Pointing_Groups[group_id]['visit_id']

    warper = Warper("lanczos4")

    for chip_num in range(1, 63):
        template_set = False

        for visit_id in visit_num:
            #                 topdir                               | visitdir | diffexp name     
            #/gscratch/scrubbed/dinob/imdiff/{groupid:03d}/deepDiff/v{visitid}/diffexp-{ccd:02d}.fits
            diffim_topdir = diffim_topdirstr.format(group_id)
            visit_dir = "{0:02d}/".format(chip_num)
            diffexp_name = "calexp-{visit:07d}_{ccd:02d}.fits".format(visit=visit_id,ccd=chip_num)
            diffexp_path = path.join(diffim_topdir, visit_dir, diffexp_name)

            #                 save dir                  | warp name    
            #/gscratch/scrubbed/dinob/warps/{groupd:03d}/{0:03d}-{1:02d}.fits
            save_dir = save_topdirstr.format(group_id)
#            warp_name = "{0:03d}-{1:02d}.fits".format(visit_id, chip_num)
            warp_name = "{0:03d}.fits".format(visit_id)
            save_path = path.join(save_dir, "{0:02d}".format(chip_num),warp_name)

            try:
                assert(path.exists(diffexp_path))
            except AssertionError:
                print("Assertion Error, path %s does not exist!" % diffexp_path)
                continue

            if not path.exists(save_dir):
                mkdir(save_dir)

            exp = afwImage.ExposureF(diffexp_path)

            if template_set is False:
                template = exp
                template_set = True
                exp.writeFits(save_path)
            else:
                warpedExp = warper.warpExposure(template.getWcs(), exp,
                                                destBBox=template.getBBox())
                warpedExp.writeFits(save_path)

if __name__ == "__main__":
    if len(sys.argv) == 2:
        warp_field_images(sys.argv[1])
    else:
        raise TypeError(("Wrong number of arguments. " 
                         "Expected groupid, got {0}".format(sys.argv[1:])))

Your examples of how a calexp mask propagates to a coadd mask don’t render for me. It seems like the server times out? (Edit- They show up now).

I would be more than happy to send the fits files over. They are currently on the University of Washington’s machine Epyc. What is the best way to send them over? (Or perhaps you have a login available on Epyc?)

Updating ds9 to v8.0 fixed the issue with the masks on Linux.

mrawls · March 22, 2019, 11:29pm

Thank you @smotherh for asking these questions and @yusra for your detailed answers! It’s all quite interesting and relevant to my ap_pipe work.

Jumping in to ask @yusra … is there code somewhere to make those little mask grid images? And is there documentation anywhere for which part of the myriad of processing steps creates each mask?

yusra · March 23, 2019, 1:55am

Is there a shared disk mounted on epyc thats also visible from gateway.astro?

smotherh · March 23, 2019, 3:33pm

Hm, it’s a network disk. I’m not sure if it’s mounted on gateway. If you can log in to gateway, it is likely you can log in to epyc, though. If not, we can easily get you an account, provided you have an UW net id. The butler repo is on /astro/store/epyc/users/smotherh/DECAM_Data_Reduction/pointing_groups_hyak/Pointing_Group_300, however the internal paths are messed up because we did the computing on the local HPC cluster and moved it over to epyc for permanent storage. The main difference is that the raw data tree is at /astro/store/pogo4/jbkalmbach/NEO. If you have any problems, please let me know. If need be, I can re-ingest and re-process this set of data (what we call Pointing Group 300) to make things a little easier.

In case I didn’t mention it earlier, the url for epyc is epyc.astro.washington.edu.