Bright object mask refactoring in the stack

natelust · April 21, 2016, 3:50pm

The current bright object mask code requires the masks to be placed by hand in the deepCoadd directory after being generated by some outside code. Is there a desire to change this behavior so that bright object masks live in a camera-agnostic location, possibly like the calibration frames?

The code also currently uses a workaround to load in objects through the butler PurePythonClass persistable argument. Is this behavior expected to change with upcoming butler changes?

I am trying to get some clarity on the future of this code before adding code to all the camera packages so they can also make use of it (currently only obs_subaru can). It would be better to make any changes to the bright object mask code before adding additional code to many obs packages.

Does anyone have comments on these questions or on bright object masks in general?
Pinging a few people I think might be interested in this discussion:
@RHL @price @jbosch @rowen

jbosch · April 21, 2016, 4:17pm

Eventually we’ll have our own code for generating the masks, and I imagine it will be based on a reference catalog that’s handled in the same way we handle other reference catalogs (it may even be the same one, with different filtering).

We’ve implemented it on the HSC side by reading in DS9 region files produced externally purely due to lack of manpower, and hence I consider basically this entire feature to be technical debt.
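For readers unfamiliar with the format, a DS9 region file is plain text with one shape per line. The following is only a minimal sketch of reading such a file, assuming image coordinates, unitless radii, and nothing but the `circle(x,y,r)` form; `Circle` and `parse_circles` are hypothetical names for illustration, not the reader used in obs_subaru.

```python
import re
from typing import List, NamedTuple


class Circle(NamedTuple):
    """One circular region (hypothetical container, not a stack class)."""
    x: float
    y: float
    radius: float
    comment: str  # free text after '#', e.g. a source identifier


# Matches lines such as "circle(10.5, 20, 3) # ID: 42".
_CIRCLE = re.compile(
    r"^\s*circle\s*\(\s*([-+.\deE]+)\s*,\s*([-+.\deE]+)\s*,\s*([-+.\deE]+)\s*\)"
    r"\s*(?:#\s*(.*))?$"
)


def parse_circles(text: str) -> List[Circle]:
    """Return the circle regions found in DS9-style region text.

    Lines that are not simple circles (headers, other shapes) are skipped.
    """
    circles = []
    for line in text.splitlines():
        match = _CIRCLE.match(line)
        if match is None:
            continue
        x, y, r, comment = match.groups()
        circles.append(
            Circle(float(x), float(y), float(r), (comment or "").strip())
        )
    return circles
```

A real reader would also need to handle sky coordinates, radius units, and the other DS9 shapes, which this sketch deliberately ignores.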

That said, we have a lot of features to implement before we can do this right, and I think we have to continue using it in the meantime, so it may be a while before we can clean it up. That includes Butler serialization interfaces whose improvement I’m confident is already planned, as well as geometry-based mask functionality that we have only barely thought about and might ultimately delegate almost entirely to third-party tools (or might integrate tightly into our Mask class).

It may be that only HSC is likely to use this code in the foreseeable future, so just leaving it only accessible to obs_subaru might be a valid option. On the other hand, if we plan to do any big productions with other datasets and we expect people to do science on them, they’ll probably need these sorts of masks. In any case, we shouldn’t put much effort into cleaning it up, because doing that well is actually a really big project. A possible exception could be dealing with that butler workaround you mentioned; you may want to get in touch with @npease on that, though I suspect you’ll need to describe it more than you have here.

RHL · April 22, 2016, 12:05am

I am fine with moving these files to a sensible location, but I disagree with Jim about where they come from.

These files are intrinsically external in the sense that the knowledge required to generate them is different from that needed to process the pixels and may use information that we don’t otherwise need. So, while I imagine that LSST DM will take responsibility for generating a set of masks, I think that it’ll have to be structured as one package that generates mask files, and then we ingest them.

jbosch · April 22, 2016, 3:38pm

I do agree that much of this mask information will not come from the bitmasks (i.e. the pixels of an afw::image::Mask) we produce in the pixel processing, and that probably means we want to have one or more datasets that store mask information in geometric form. I’m not convinced those mask-generation steps will not use any of our catalog or pixel data, but I agree it’s conceivable they could just use external data.
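To make the distinction between geometric form and bitmask concrete, here is a rough sketch of turning one circular region into set bits in a pixel mask. It uses a plain list of lists rather than any stack class; `apply_circle` and the bit value in `BRIGHT_OBJECT_BIT` are assumptions made up for illustration.

```python
import math
from typing import List

BRIGHT_OBJECT_BIT = 1 << 0  # assumed bit value, for illustration only


def apply_circle(mask: List[List[int]], x0: float, y0: float, radius: float,
                 bit: int = BRIGHT_OBJECT_BIT) -> int:
    """OR `bit` into every pixel whose centre lies within the circle.

    `mask` is indexed as mask[y][x]. Returns the number of pixels touched.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Restrict the loop to the circle's bounding box, clipped to the image.
    x_min = max(0, int(math.floor(x0 - radius)))
    x_max = min(width - 1, int(math.ceil(x0 + radius)))
    y_min = max(0, int(math.floor(y0 - radius)))
    y_max = min(height - 1, int(math.ceil(y0 + radius)))
    count = 0
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            if (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2:
                mask[y][x] |= bit  # existing bits are preserved
                count += 1
    return count
```

Keeping the geometric description as the stored dataset and rasterizing it only when needed means the same region can be applied to any image, whatever its pixel grid.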

I don’t think we want to assume anything about the file format in the future, though, and I got the impression (perhaps erroneously) from @natelust’s description that the persistence might be in a sufficiently ugly state that we don’t want to spread it around to other cameras until we need it for them.

davidciardi · April 26, 2016, 4:53pm

Solely from the point of view of the end-user, I would request that the mask information and files be captured in such a way that a user interacting with the data can understand what was masked and where that information came from.
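One way to picture this request is a region record that carries its own provenance. This is only a sketch under assumed names: `MaskedRegion`, its `source` and `reason` fields, and `explain_pixel` are all hypothetical and do not correspond to anything in the stack.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MaskedRegion:
    """A circular masked region plus where it came from (hypothetical)."""
    x: float
    y: float
    radius: float
    source: str  # assumed field: catalog or file the region was derived from
    reason: str  # assumed field: why it was masked, e.g. "bright star"

    def covers(self, px: float, py: float) -> bool:
        """True if the point (px, py) lies inside this region."""
        return (px - self.x) ** 2 + (py - self.y) ** 2 <= self.radius ** 2


def explain_pixel(regions: List[MaskedRegion], px: float, py: float) -> List[str]:
    """Describe every region that masks the given pixel."""
    return [
        f"{r.reason} (from {r.source})" for r in regions if r.covers(px, py)
    ]
```

With something like this, a user looking at a masked pixel could ask which regions cover it and get back both the reason and the origin of each.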

RHL · April 28, 2016, 12:19pm

I agree that some/most/all masks will come from our data, but I think we shouldn’t make that a part of the design.

As regards file formats, I originally invented a FITS binary table format with the same information as DS9 region files. However, of course, this added a new file format (just saying “fits” isn’t sufficient to use them), so I went with Jean/Nicole’s suggestion of a pre-existing format. The persistence is fine; it only looks ugly because it uses the butler backdoor of duck typing something to look like FITS. That is indeed ugly, but it’s an (old) butler problem, not anything intrinsic to the file format.