Has anyone tried using the stack with tile compressed FITS images? My attempt to tile-compress some calexps for jointcal’s validation data results in jointcal producing errors (for reasons I haven’t fully tracked down, but which may have to do with the handful of headers that change in the file when it is tile compressed).
In general, it would be useful for us to support FITS tile compression if we are going to use FITS files: it still allows one to (mostly) memory-map the data and to read the headers without decompressing the files. Tile compression is lossless for integer data, and lossy with a specifiable level of precision for floating-point data.
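For concreteness, producing a tile-compressed file with astropy.io.fits looks roughly like this (the file names, compression type, and quantization level here are only illustrative):

```python
from astropy.io import fits

# Tile-compress the image HDU of a file. RICE_1 is lossless for
# integer data; for floating-point data, quantize_level sets the
# precision of the (lossy) quantization.
with fits.open('calexp.fits') as hdul:          # placeholder file name
    comp = fits.CompImageHDU(data=hdul[1].data,
                             header=hdul[1].header,
                             compression_type='RICE_1',
                             quantize_level=16.0)
    fits.HDUList([fits.PrimaryHDU(), comp]).writeto('calexp_comp.fits',
                                                    overwrite=True)
```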
Some time ago, I tried compressing the calexps and ran into a lot of trouble. I will try to dig through my notes to retrieve the details, but the result was that it was not usable for calexps.
The stack needs some work to support convenient on-the-fly FITS compression, e.g., we don’t want cfitsio scaling our floating-point data itself (because it doesn’t know about masked pixels). We can probably just adapt the code used in Pan-STARRS, which works fine. This is one of those things that we’ve always planned to do, but have just never gotten around to doing. Now that HSC’s data volume is growing rather rapidly, maybe it’s getting about time to do it?
These all seem like good points. But whether we can read tile-compressed FITS images seems to me like a separate question from how we actually manage compressed data.
I would compare the results of cfitsio fpack (which I believe the stack should read) with those from astropy.io.fits (which I think you’re using) to see what differences there are.
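For example, astropy’s FITSDiff will report the header and data differences between the two (file names are placeholders; CHECKSUM/DATASUM are ignored only because they are expected to differ):

```python
from astropy.io import fits

# Compare an fpack-compressed file against one written by astropy.
diff = fits.FITSDiff('calexp.fits.fz', 'calexp_astropy.fits',
                     ignore_keywords=['CHECKSUM', 'DATASUM'])
print(diff.report())
```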
Ok, I’ll give that a try. Note that I did find and fix the bug in astropy that was causing it to write invalid compressed files before, so the files astropy produces now pass fitsverify.
Following up on this: I ran fpack (no arguments) on the images (without zeroing them out), and got a similar compression ratio to what astropy gave me. jointcal did not crash, but also did not fit the images correctly, suggesting that something is still going wrong in ingesting the data.
For my particular use case (generating small test catalogs for jointcal), I have found a solution: zero out the images, gzip them, and rely on the fact that cfitsio will read the .fits.gz files “automatically” when the butler requests the .fits file. This isn’t an ideal solution (it’s vaguely magical), but I’m only using it for my test data, so it will do for now.
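Roughly, the recipe is something like this sketch (file names are placeholders; this is only an illustration of the idea, not my exact script):

```python
import gzip
import shutil
from astropy.io import fits

# Zero out the image planes so the file gzips down to almost nothing,
# then gzip it; cfitsio will transparently open calexp.fits.gz when
# the butler asks for calexp.fits.
with fits.open('calexp.fits') as hdul:
    for hdu in hdul:
        if isinstance(hdu, fits.ImageHDU) and hdu.data is not None:
            hdu.data[:] = 0
    hdul.writeto('calexp_zeroed.fits', overwrite=True)

with open('calexp_zeroed.fits', 'rb') as src, \
        gzip.open('calexp.fits.gz', 'wb') as dst:
    shutil.copyfileobj(src, dst)
```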
I guess I’ll file a ticket about reading tile compressed images and someone can try to create some tests to see what exactly is failing?
I didn’t realize what your actual problem was until reading @RHL’s other post. If all you really want is the image metadata including the binary-persisted ExposureInfo, you ought to be able to replace the image HDUs with single pixels rather than just replacing the current pixels with zeros. Then you shouldn’t need gzip. This will eventually be solved by persisting the ExposureInfo separately.
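Something along these lines, as an untested sketch (file names are placeholders):

```python
import numpy as np
from astropy.io import fits

# Keep all the metadata HDUs, but shrink each image plane to a single
# pixel so no compression or gzip trickery is needed; astropy updates
# the NAXIS keywords from the new data shape when writing.
with fits.open('calexp.fits') as hdul:
    for hdu in hdul:
        if isinstance(hdu, fits.ImageHDU) and hdu.data is not None:
            hdu.data = np.zeros((1, 1), dtype=hdu.data.dtype)
    hdul.writeto('calexp_tiny.fits', overwrite=True)
```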
A common problem is trying to read NAXIS[12] from the header of a compressed image, but cfitsio doesn’t convert those when you read the image in. The Pan-STARRS FITS code includes some header manipulation to restore the header, which we may want to copy.
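As an illustration of the kind of fix-up I mean (a guess at the idea rather than a copy of the Pan-STARRS code, and the helper name is made up): in a tile-compressed HDU the real image dimensions live in ZNAXIS1/ZNAXIS2 and the real pixel type in ZBITPIX, while NAXIS1/NAXIS2/BITPIX describe the binary table that holds the compressed tiles.

```python
from astropy.io import fits

# Hypothetical helper: copy the Z* keywords back over the table
# keywords so code that reads only the header sees the image geometry.
def restore_image_keywords(header):
    if header.get('ZIMAGE', False):
        header['NAXIS1'] = header['ZNAXIS1']
        header['NAXIS2'] = header['ZNAXIS2']
        header['BITPIX'] = header['ZBITPIX']
    return header

# Read the raw (table) header so the Z* keywords are visible.
with fits.open('calexp.fits.fz', disable_image_compression=True) as hdul:
    hdr = restore_image_keywords(hdul[1].header)
```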
I can’t confirm that NAXIS/NDIM was the actual problem, though I would not bet against it, but neither a single 0 “pixel” nor a 2x2 array of 0 pixels was accepted.