How to run the DM stack on simulated FITS images

Hi,

I was trying to use the new version of the stack on simulated images. First, I created the _mapper file to use the LSST-Sims settings. Then, I created the registry.sqlite3 with genInputRegistry.py and it recognized all my images. The problem is that when I try to use processEimage.py it says:

 WARNING: No data found for dataId=OrderedDict([('visit', 0)])
 WARNING: No data found for dataId=OrderedDict([('visit', 1)])
 WARNING: No data found for dataId=OrderedDict([('visit', 2)])
 WARNING: No data found for dataId=OrderedDict([('visit', 3)])
 WARNING: No data found for dataId=OrderedDict([('visit', 4)])
 WARNING: No data found for dataId=OrderedDict([('visit', 5)])

And the outputs are all empty. My images are several 512 x 512 pixel outputs from the WeakLensingDeblending package, and I added a WCS, a random EXPTIME, and a random MJD to the headers so that they would be properly recognized. What fields should the FITS files have in order to be processed? What am I doing wrong?

I also tried processCcd.py but it didn’t work either.

Thanks a lot!

The obs_lsstSim package is designed to be used on PhoSim outputs (or other simulations that look like PhoSim outputs), which are organized the same way the actual LSST focal plane will be: 4k x 4k sensors, organized into rafts. I’m afraid it won’t work on arbitrary images like the ones you’re using, and in fact there’s no way to run our regular pipelines on data like this.

Instead, I think you want to try processFile. That will get you nearly all of the single-image processing functionality.

Thanks for the answer. The problem is that when I try to use processFile, it raises an error:

ImportError: No module named processImage

If I create 4k x 4k images, can I use them as inputs for processEimage and/or processCcd, or do I need to include something in the headers or additional tables?

I tried again using 4k x 4k images with processEimage and processCcd. Both of them ran without complaining, but the outputs were empty (when there were outputs at all). I tried running processFile again, and now it doesn’t raise the ImportError about processImage but this one instead:

    parser.add_argument("-T", "--trace", nargs="*", action=pbArgparse.TraceLevelAction,
AttributeError: 'module' object has no attribute 'TraceLevelAction'

I think that in the new stack `lsst.pipe.base.argumentParser` doesn't include `TraceLevelAction` anymore. Am I right?

I’m working through the same exact thing. It’s now called LogLevelAction: https://github.com/josePhoenix/processFile/commit/8e8f7e978b85c19dc1a98d3a33c6f2fa48259022

Given the difficulty getting processFile working (which I believe others are working on), @fjaviersanchez asked offline if there was a way to use the deblender. There absolutely is; all of this code is intended to be very much usable as a Python library, and in fact for simple simulated data it may be easier (and certainly more flexible) than using scripts like processFile. Here’s a snippet that may help get you started, assuming you have an image with a constant PSF that you’d like to use instead of doing PSF modeling with our code (which is where a lot of the complexity in processFile comes from). One big caveat: I haven’t tested this, so there are probably some trivial bugs.

import lsst.afw.table
import lsst.afw.image
import lsst.afw.math
import lsst.meas.algorithms
import lsst.meas.base
import lsst.meas.deblender
import numpy

# High-level algorithms are in classes called "Tasks".  Each of these has a Config
# class to control how it runs; to use it, construct an instance of Task.ConfigClass(),
# set attributes on it, and pass that as the config kwarg to the Task constructor.
# It's usually a good idea to construct all Tasks before using any of them, mostly
# to define the full schema of any catalogs we'll produce up front.
schema = lsst.afw.table.SourceTable.makeMinimalSchema()
detect = lsst.meas.algorithms.SourceDetectionTask(schema=schema)
deblend = lsst.meas.deblender.SourceDeblendTask(schema=schema)
measure = lsst.meas.base.SingleFrameMeasurementTask(schema=schema)
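
# (As an aside, not needed for this example: to change how one of these Tasks runs,
# build its config first and pass it in, e.g. to raise the detection threshold:
#
#     detect_config = lsst.meas.algorithms.SourceDetectionTask.ConfigClass()
#     detect_config.thresholdValue = 10.0  # in units of the noise stdev by default
#     detect = lsst.meas.algorithms.SourceDetectionTask(schema=schema, config=detect_config)
#
# The defaults are fine for this sketch.)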

# Given NumPy float32 arrays containing the image to be measured and its variance, make an LSST
# "MaskedImage" to hold them.  This also holds a mask, which we'll set to None to make a blank mask.
image = lsst.afw.image.ImageF(image_array)
variance = lsst.afw.image.ImageF(variance_array)
masked_image = lsst.afw.image.MaskedImageF(image, None, variance)

# Create a PSF object from a NumPy array of the PSF image.  This should have odd dimensions, and be
# centered on the center pixel.
psf = lsst.meas.algorithms.KernelPsf(lsst.afw.math.FixedKernel(lsst.afw.image.ImageF(psf_array)))

# Create an Exposure object that combines the MaskedImage with the PSF.  Exposures can hold
# quite a bit more (WCS, photometric zeropoint (`Calib`), etc.), but I think this is all we need for now.
exposure = lsst.afw.image.ExposureF(masked_image)
exposure.setPsf(psf)

# Run the tasks
table = lsst.afw.table.SourceTable.make(schema)  # this is really just a factory for records, not a table
detect_result = detect.run(table, exposure)
catalog = detect_result.sources   # this is the actual catalog, but most of it's still empty
deblend.run(exposure, catalog, exposure.getPsf())
measure.run(catalog, exposure)

# Catalog should now be filled.  To inspect the results, it'll be easier to first make it contiguous, and
# then get an Astropy table view:
catalog = catalog.copy(deep=True)
astropy_table = catalog.asAstropy()

Hope that helps. Please reply with any problems you encounter (and any LSST experts who notice bugs in the above should feel free to correct them directly).

I didn’t see this thread until now. I have done some work to get obs_file working again (not processFile). This allows using the standard processCcdTask on arbitrary data. I hope that those trying to run simulated data will find this helpful. Instructions are in this thread.

I have gotten the stack to run (and pass the integration test using SDSS data suggested in the setup docs). However, I’m having trouble using this code. It seems to be modifying the image array I provide in-place (maybe it is supposed to?) and replacing it with corrupted data (which I don’t think it should do!). When I make the MaskedImage and try to get the underlying array back out (masked_image.getImage().getArray()), it looks bogus too.

Here’s the image_array, an ‘original’ copy I made, and the MaskedImage’s array (left to right):

Here’s the same three after running the tasks:

I also made a gist with the notebook and files I was using.

Aha, looks like a float endian-ness problem. If I do image = lsst.afw.image.ImageF(image_array.astype(np.float32)) then I get float32s in machine byte order (little endian) instead of FITS byte order (big endian). That fixes the MaskedImage weirdness.
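
For anyone following along, the conversion looks roughly like this (a sketch; scene.fits stands in for whatever file the array is actually read from):

import numpy as np
from astropy.io import fits

import lsst.afw.image

# FITS stores pixel data big-endian; astype(np.float32) copies the array into the
# machine's native byte order, which is what ImageF expects.
image_array = fits.getdata("scene.fits").astype(np.float32)
image = lsst.afw.image.ImageF(image_array)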

Now I’m getting more interesting output from the deblender:

sourceDetection: Detected 1 positive sources to 5 sigma.
sourceDetection: Resubtracting the background after object detection
sourceDeblend: Deblending 1 sources
sourceDeblend: Deblended: of 1 sources, 0 were deblended, creating 0 children, total 1 sources
measurement: Measuring 1 sources (1 parents, 0 children)
measurement WARNING: Error in base_SkyCoord.measure on record 1: Wcs not attached to exposure.  Required for base_SkyCoord algorithm
measurement WARNING: Error in base_PsfFlux.measure on record 1:
  File "src/PsfFlux.cc", line 133, in virtual void lsst::meas::base::PsfFluxAlgorithm::measure(afw::table::SourceRecord &, const afw::image::Exposure<float> &) const
    Invalid pixel value detected in image. {0}
lsst::meas::base::PixelValueError: 'Invalid pixel value detected in image.'

Good catch! Looks like our ImageF(array) constructor needs to be a bit more picky about the exact dtype it receives (I’ve created DM-7037 to fix this).

The error you’re seeing usually indicates a NaN or infinite value in the image; our measurement algorithms generally assume a preprocessing step has interpolated over any such values. I would expect that to generate a warning, but not an error. It’s clear that it’s at least producing a warning, but it’s not clear from the output whether there’s also an exception being raised. If there is, could you try to get a traceback?
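
(For completeness: if bad values do turn up in an image, a simple way to satisfy that assumption is to clean them up before building the MaskedImage; this is just a sketch, not the stack’s own interpolation code.)

import numpy as np

# Replace any NaN/Inf pixels with zero before constructing the ImageF.
image_array = np.where(np.isfinite(image_array), image_array, 0.0).astype(np.float32)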

No exception (/ traceback), and there is a final catalog written out with a centroid for the object. The image I’m passing in has no bad values, or at least it appears that way:

In [100]: np.any(np.isnan(image_array))
Out[100]: False

In [101]: np.all(np.isfinite(image_array))
Out[101]: True

The PixelValueError doesn’t stop it from producing a catalog, though, so it may be safe to ignore. What do you think?

The error is preventing you from getting a valid PsfFlux measurement for at least one object in the image, but if that’s not important, you can probably get away with ignoring it. It’s not immediately obvious to me what’s causing it, though - that may just require a bit more debugging.
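
If it helps, each record carries a failure flag from the plugin, so you can check which objects were affected; something like this (a sketch using the `catalog` from the earlier snippet; `base_PsfFlux_flag` is the plugin's general failure flag):

# Print the IDs of records where the PsfFlux measurement failed.
for record in catalog:
    if record.get("base_PsfFlux_flag"):
        print("PsfFlux failed for record %d" % record.getId())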

Is this `asAstropy` method something new (since v12)? I don’t appear to have it on the returned catalog object. (I’ve been writing the catalog out to .fits and reading it back as an Astropy table, but that seems a bit roundabout.)

https://pipelines.lsst.io/releases/notes.html#release-v12-0-astropy-table-views

That method appears to be available on BaseCatalog but not SourceCatalog.

That’s a bug on our side; I just created DM-7084 to fix it.

In the meantime, here’s a workaround:

astropy_table = lsst.afw.table._syntax.BaseCatalog_asAstropy(catalog)

Hmm, I’m running this on a synthetic image and getting unexpected results (one source in, hundreds of deblended detections out).

I’m adapting some code of @hcferguson’s to put a Sérsic galaxy in an image with some noise and sky brightness. The image is a 256x256 pixel cutout, and the variance is estimated by averaging a 10x10 pixel box in the corner. In this particular image I only put in one galaxy profile, but I get 757 sources out of the deblending step:

sourceDetection: Detected 1 positive sources to 5 sigma.
sourceDetection: Resubtracting the background after object detection
sourceDeblend: Deblending 1 sources
sourceDeblend: Deblended: of 1 sources, 1 were deblended, creating 756 children, total 757 sources

I’m sure I’m not giving the pipeline exactly what it expects, but I’m not sure what I need to change. Here’s the notebook I’m using: https://gist.github.com/josePhoenix/3e602cc58c2edf86d946465cf5254cf2

I’ve attached the scene (as FITS) and the notebook, as well as the PSF and the catalog output by the stack tasks.

deblend_synthetic_image_test_case.zip (2.8 MB).
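
For context, the variance plane is built along these lines (a rough sketch of the estimate described above, not the exact notebook code; `image_array` is the 256x256 cutout):

import numpy as np

# Estimate the noise from a 10x10 box in one (assumed source-free) corner and use
# it as a constant variance plane across the whole cutout.
corner = image_array[:10, :10]
variance_array = np.full(image_array.shape, np.var(corner), dtype=np.float32)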

One possible thing to think about is setting XY0 on your psf_image before making it into a FixedKernel (i.e., you need to tell the system where the 0,0 point is in the image).

Can you plot the mask plane after detection?

Hm, how would I go about plotting the mask plane? masked_image.getMask().getArray() produces an image where most pixels have the value 32 except for a border around the edge.

I’ll try setting the 0, 0 point for the PSF as well.
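
(For reference, I’m looking at the mask with something like this; a quick sketch assuming matplotlib and the masked_image from the earlier snippet:)

import matplotlib.pyplot as plt

# Display the raw mask bits; each mask plane corresponds to one bit.
mask_array = masked_image.getMask().getArray()
plt.imshow(mask_array, origin="lower", interpolation="nearest")
plt.colorbar(label="mask bits")
plt.show()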

Update: I’ve added

psf_image.setXY0(lsst.afw.geom.Point2I(12, 12))

to the code that creates the KernelPsf, but it doesn’t seem to affect the number of deblended components.

32 in a mask is usually the value assigned to the symbolic name DETECTED. That means all those pixels are above threshold.
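
If you want to double-check the bit assignments for your particular mask, the mapping from plane name to bit is stored on the mask itself; a quick sketch (using the masked_image from the earlier snippet):

mask = masked_image.getMask()

# Map of mask plane name -> bit index (e.g. DETECTED is usually bit 5, so 2**5 == 32).
print(mask.getMaskPlaneDict())

# Bit value for a named plane, and a boolean array of pixels where that bit is set.
detected_bit = mask.getPlaneBitMask("DETECTED")
detected_pixels = (mask.getArray() & detected_bit) != 0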