Safer handling of long FITS keywords

While testing a beta of cfitsio 3.38, I discovered that it broke our use of the HIERARCH keyword to handle keywords that are longer than 8 characters or that contain invalid characters (e.g. “.”). HIERARCH is not defined in the FITS standard, but it is a convention originally defined by ESO for chaining keywords into a hierarchy:

That convention implies that using it to define single long keywords may not work (the first value in a HIERARCH is the “namespace”), but it doesn’t explicitly forbid it. On the other hand, the cfitsio documentation suggests using HIERARCH to store long keywords (according to Bill Pence, that support was added in the early 2000s):

https://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/f_user/node28.html

and pyfits recently (within roughly the last year) added similar support for HIERARCH for long keywords:

http://docs.astropy.org/en/stable/io/fits/usage/headers.html#hierarch-cards

As I said at the top, such usage is not defined in the FITS standard, and the ESO HIERARCH convention implies that it shouldn’t be used in that way.

Here’s what Bill Pence suggested, after giving this warning:

However, I cannot guarantee that support for these types of keyword will continue indefinitely, and certainly there is no guarantee that other FITS software will be able to read or write these keywords.

My recommendation on what to do depends on how many of these types of keywords you intend to use. If it is only a handful of keywords, then the simpler path would be to bite the bullet and convert them to standard 8 character keywords. Otherwise, would it be possible to change them so that they conform to the original ESO HIERARCH convention? For example, instead of

HIERARCH reallylongkeyword = 1
HIERARCH A.B.C.D = 1

use

HIERARCH LSST REALLYLONGKEYWORD = 1
HIERARCH LSST A B C D = 1
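
To make the suggested renaming concrete, here is a minimal Python sketch of the mapping from the first form to the second. The function name `to_namespaced_hierarch` and the default `LSST` namespace are assumptions for illustration only; nothing like this exists in cfitsio or in our code as far as this thread shows.

```python
def to_namespaced_hierarch(name, namespace="LSST"):
    """Return an ESO-style HIERARCH keyword: namespace first, then one token per level.

    Hypothetical helper, for illustration only.
    """
    # Treat "." as the level separator and upper-case every token
    tokens = [token for token in name.upper().split(".") if token]
    if not tokens:
        raise ValueError("empty keyword name")
    return " ".join(["HIERARCH", namespace] + tokens)


print(to_namespaced_hierarch("reallylongkeyword"))  # HIERARCH LSST REALLYLONGKEYWORD
print(to_namespaced_hierarch("A.B.C.D"))            # HIERARCH LSST A B C D
```

Note that this mapping is lossy: once the dots become spaces, a reader cannot tell whether the original name contained dots or spaces.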

We could certainly use the latter form (the LSST namespace) for all the places where we use long keywords, but I think we may have some “transparent” handling of long keywords at present that may cause trouble. Here’s one example:

I’m not sure what the best solution here is, but we should probably take care of this soon, before we start producing lots of non-compliant FITS files that may require special handling in the future.

I think we will need to properly define our FITS serialization and metadata standard. That’s definitely something that has to happen. Currently the files seem to be defined in an ad hoc manner, but this can’t be the long-term strategy.

If the ESO definition of HIERARCH is likely to live on, then it would probably be safest to start writing out compliant keywords. We could include backwards compatibility code to read files with the current non-compliant keywords, with the likelihood of removing that at some point.

By the way: I’m pretty sure pyfits has supported HIERARCH for a long time. But the support did change at some point so that it automatically adds HIERARCH for long keywords (with a warning) rather than rejecting the attempt with a message saying “you must use HIERARCH”.

I think we’re pretty safe here. We don’t use HIERARCH explicitly in most places, relying instead on cfitsio. This has been pretty common practice for 10 years, and it isn’t inconsistent with the ESO proposal if we choose the namespace token to be “”, which I think is technically allowed:

The name space token has the value “ESO” for all the hierarchical keywords defined within that organization; a different unique domain name should be defined by any other organizations that uses this convention. (Currently, it appears that ESO is the only organization that uses this convention).

I’d be amazed if cfitsio dropped support. If we care about this (beyond documenting it) we should probably go ahead and either register the null keyword as an LSST extension, or get official acknowledgment that this is legal.

Why wouldn’t we use “LSST” as the namespace token?

We’re currently using HIERARCH to get around the Fortran-66 headers that FITS uses, rather than to provide LSST-specific keywords. If we prefixed all long names with LSST, everyone else in the world would have to learn to ignore the “LSST”, and we wouldn’t be able to use it in places where we really did want a namespace.

Not very convincing, I agree. But I think the probability of getting into trouble over this issue is vanishingly small and doing nothing (equivalent to using “” as our namespace) is almost certainly safe.

Actually, the whole reason this came up is that the cfitsio 3.38 beta did change how it supported non-namespaced HIERARCH keywords, resulting in our long keywords either not being written, or being written without the HIERARCH preface.

Can’t we “just” make a FITS convention that does what we have historically wanted HIERARCH to do, using a different keyword?

There are currently several competing proposals for managing >8 character keywords in FITS headers. They will be under review in the near future, and we could choose to advocate for one of them once the written proposals come out, but I don’t think we should create another until then.

None of the proposals for long keywords made it into the new version of the standard:

Also, cfitsio 3.39 has seemingly reverted the “fix” to the HIERARCH parsing.

For the record: this code is in formatFitsProperties, a function that is only used for one thing in our stack: feeding code that makes a WCS from a FITS header.

It appears that the code that writes our FITS headers for Image.writeFits and Exposure.writeFits calls fits_write_key from cfitsio, and that presumably does something sensible with long keywords. The high-level function is Fits::writeMetadata (whose doc string is mistakenly that for readMetadata).

formatFitsProperties should probably be replaced by Fits::writeMetadata or rewritten to work like it. The only thing wrong with using Fits::writeMetadata “as is” is that it does not output NAXIS[12], and WCS constructors do like to have that if they can.

Following a discussion at ADASS 2017 and a follow-up with Jessica Mink, I think the status is that there’s text:

4.1.2.1.b. Long Keyword name (bytes 1 through n)
The long keyword name shall be a left justified, n-character, ASCII string. Single embedded spaces are allowed. Leading and trailing spaces are not allowed and if present shall be ignored. Multiple consecutive embedded spaces are not allowed and if present shall be collapsed into a single blank. All digits 0 through 9 (decimal ASCII codes 48 to 57, or hexadecimal 30 to 39), upper case Latin alphabetic characters ‘A’ through ‘Z’ (decimal 65 to 90 or hexadecimal 41 to 5A) as well as their lower case equivalent ‘a’ through ‘z’ (decimal 97 to 122 or hexadecimal 61 to 7A) are permitted; lower case characters shall be considered equivalent to upper case ones (i.e. keyword names are case insensitive) and are used only to improve legibility. The underscore (‘_’, decimal 95 or hexadecimal 5F), hyphen (‘-’, decimal 45 or hexadecimal 2D), dollar sign (‘$’, decimal 36 or hexadecimal 24), dot (‘.’, decimal 46 or hexadecimal 2E), and colon (‘:’, decimal 58 or hexadecimal 3A) are also permitted. No other characters are permitted. For indexed keyword names see previous section on short names.
OPTIONAL TBD The presence of long keyword names in a header shall be signalled by a keyword If LONGKYWD = ’1.0’ (or is it LONGKYWD = 1.0 ?).
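
For anyone who wants to see what those rules amount to in practice, here is a minimal Python sketch of a normalizer for the draft text quoted above. The name `normalize_long_keyword` is hypothetical, the draft itself may still change, and indexed keywords and the optional LONGKYWD flag are not handled.

```python
import re

# Characters permitted by the quoted draft: letters, digits, "_", "-", "$", ".", ":"
# and embedded spaces (assumption: a plain ASCII space is the only blank allowed).
_ALLOWED = re.compile(r"[A-Za-z0-9_\-$.: ]+")


def normalize_long_keyword(name):
    """Return the canonical form of a long keyword name, or raise ValueError.

    Hypothetical helper, for illustration only.
    """
    # Leading and trailing spaces are ignored
    cleaned = name.strip(" ")
    # Multiple consecutive embedded spaces collapse into a single blank
    cleaned = re.sub(r" {2,}", " ", cleaned)
    if not cleaned or not _ALLOWED.fullmatch(cleaned):
        raise ValueError(f"invalid long keyword name: {name!r}")
    # Lower case is equivalent to upper case, so compare in upper case
    return cleaned.upper()


print(normalize_long_keyword("  really  long.key_word "))  # REALLY LONG.KEY_WORD
```

Two names refer to the same keyword under this reading exactly when their normalized forms are equal.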

I have asked that this be augmented with the statement

If a long keyword begins with the string “HIERARCH” followed by one or more spaces they must be removed, so
HIERARCH XXXXXXXXXXXX = VVV
is equivalent to
XXXXXXXXXXXX = VVV

as then we could use long keywords now, and all the FITS readers will already support a very useful subset of the new standard.
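
The equivalence I am asking for is easy to state in code. This is only a sketch in Python; `strip_hierarch` is a hypothetical name, and matching “HIERARCH” case-insensitively is my assumption, based on the draft saying keyword names are case insensitive.

```python
import re

# "HIERARCH" followed by one or more spaces at the very start of the keyword
_HIERARCH_PREFIX = re.compile(r"^HIERARCH +", re.IGNORECASE)


def strip_hierarch(keyword):
    """Drop a leading "HIERARCH " so both spellings name the same keyword.

    Hypothetical helper, for illustration only.
    """
    return _HIERARCH_PREFIX.sub("", keyword, count=1)


# Both spellings reduce to the same long keyword name
assert strip_hierarch("HIERARCH XXXXXXXXXXXX") == strip_hierarch("XXXXXXXXXXXX")
```

A name such as `HIERARCHY` is left alone, because the prefix only counts when it is followed by at least one space.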

Jessica replied:

I think that is a way I could convince the Committee to go.