Defects are changing
As of RFC-595 we have been attempting to formulate a system that allows defects (and potentially other user curated calibration products) to be handled in a more consistent manner.
When the word defect is used in this posting it refers specifically to strictly static, rectangular regions of pixels.
Examples are single hot/cold pixels, bad columns, and anomalous pixels at the edges of the sensors.
To date, these have been represented in several different ways: text files, FITS binary tables of bounding boxes, and bad pixel masks.
This work is not intended to extend the types of defects, but it is intended to reduce the number of ways they are encoded.
The target is to merge at the beginning of next week: 17 June 2019.
What is changing?
- We are standardizing on a single text format as the human readable version of defects.
- This human readable version is stored in a package separate from the package that holds configuration overrides and special case code.
- The standard text format files are ingested into calib repositories for use with command line tasks.
The standard format
- The standard is to produce text files in the ecsv standard format.
  This format is easy for a person to read, supports metadata, and is machine readable through astropy.table.
- The standard is to store four columns: x0, y0, width, height.
  These represent the location of the pixel nearest the origin and the extents in the x and y coordinates respectively.
- Three pieces of user curated metadata are required to exist in the file header.
  These metadata must also agree with the layout of the files in the data package as described in the next section.
  When using the writeText method on the Defects object, the other required metadata will be added automatically.
  Other arbitrary metadata may be added.
  Specifically, we suggest adding a DEFECTTYPE key.
  - INSTRUME: The instrument name, e.g. decam.
  - DETECTOR: The index of the detector to which to apply these defects.
  - CALIBDATE: An ISO compatible string that is unique for a particular level (typically detector) across all validity ranges. This cannot generically be assumed to be the valid start time.
  - DEFECTTYPE (optional): A string describing what kind of defects these are.
- Note: As originally written, I had assumed one could use the valid start time as the CALIBDATE. That cannot be enforced in general since different calibration pipelines define the CALIBDATE in different ways. I have updated the description to reflect this.
Here is an example defects file from the obs_test_data repository.
The obs data repository
A primary goal of this work was to provide guidance on how to break calibration-like data out of the obs_*
packages and into separate repositories.
The following guidelines are specifically for this work with defects, but should be extensible to other types of versioned, human curated data.
- The package is the name of the obs_ package appended with _data.
  E.g. obs_test has a corresponding obs_test_data.
- Each instrument in the obs_ package should have a corresponding directory in the obs_*_data package.
  E.g. obs_subaru has both hsc and suprimecam so will have each of those as top level directories in obs_subaru_data.
  These names must correspond to the INSTRUME metadata in the defects files.
- Each type of curated calibration data will have a separate directory for each instrument.
  In this case there will be a defects directory for each instrument in a given obs_*_data package.
- Each sensor in the instrument array will have a directory containing a file for each validity range.
  The directory name is the name of the sensor as given by Detector.getName().lower().
  By convention, these will be lower case to avoid problems with case sensitive vs. case insensitive file systems.
  An example in the obs_test_data repository is: test/defects/0/19700101T000000.ecsv.
  In this case, Detector.getName() returns the string 0 and Detector.getId() returns the integer 0, so the directory is ambiguous.
  Most cameras return a more readable value for the detector name.
  The directory name and the DETECTOR metadata in the file must identify the same detector, by name and by ID respectively.
- The file name for individual validity ranges will consist of an ISO compliant date string corresponding to the beginning of the validity range and the .ecsv extension.
  The string date in the file name must correspond to the CALIBDATE metadata in the file.
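The layout rules above can be sketched as a small path builder. The function name defects_path and its arguments are hypothetical and are not part of any obs_ package. The timestamp pattern is inferred from the single example file name, 19700101T000000.ecsv, and the detector name R22_S11 is invented to show the lower casing.

```python
from datetime import datetime
from pathlib import PurePosixPath


def defects_path(instrument, detector_name, valid_start):
    """Build the relative path of a defects file inside an obs_*_data package.

    Hypothetical helper: <instrument>/defects/<detector name, lower case>/<date>.ecsv
    """
    # Compact ISO 8601 form, matching the example file name in obs_test_data.
    stamp = valid_start.strftime("%Y%m%dT%H%M%S")
    return PurePosixPath(instrument, "defects", detector_name.lower(), stamp + ".ecsv")


# The obs_test_data example, where Detector.getName() returns the string "0".
example = defects_path("test", "0", datetime(1970, 1, 1))
print(example)

# A made-up detector name, to show the lower casing of the directory.
mixed_case = defects_path("test", "R22_S11", datetime(2018, 6, 1, 12, 30, 0))
print(mixed_case)
```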
Using the standardized files
The standard text files are intended to be easy for a person to curate.
Specifically, they should be easy to sort by validity date on the command line with ls.
It should be clear what file goes with which sensor.
Further, the format is meant to be editable for simple changes and easy to generate from code via Defects.writeText.
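As a quick illustration of why the file names sort well with ls: fixed width date strings of this form sort alphabetically in the same order as they sort chronologically. The file names below are invented, and the timestamp pattern is again inferred from the obs_test_data example.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Invented file names for one detector directory.
names = [
    "20180601T000000.ecsv",
    "19700101T000000.ecsv",
    "20150101T120000.ecsv",
]


def start_time(name):
    """Parse the date string that forms the file name stem."""
    return datetime.strptime(PurePosixPath(name).stem, "%Y%m%dT%H%M%S")


alphabetical = sorted(names)                    # the order ls would show
chronological = sorted(names, key=start_time)   # the order by parsed date
print(alphabetical == chronological)
```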
In this form, the files do not constitute a proper calibration repository.
They must be ingested into a calibration repository to be used by the butler.
This is accomplished with the ingestDefects.py command line task.
This task translates the files from their text form to a binary form (FITS) and uses a Butler to put them in the correct location.
A momentary diversion for three implementation details:
- For technical reasons, the files are written to a temporary location and then moved into the appropriate location for the calibration repository.
  This means that the --mode option is not available.
  Specifically, it is not possible to use --mode=link.
  This shouldn’t be an issue for defects as they will be relatively small.
- The representation of the defects inside the calibration repository is in the FITS region format.
  This means they are parsable natively by both astropy and ds9.
- Each defect file has a validity range extending to the next valid defect file. For the last in the sequence the validEnd is set to the end of Unix time.
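The validity range rule in the last item can be sketched as follows. This is not the ingestDefects.py code. The function validity_ranges is hypothetical, and the sketch assumes that "the end of Unix time" means the largest signed 32-bit timestamp (2038-01-19T03:14:07 UTC), which the task may define differently.

```python
from datetime import datetime, timezone

# Assumption: the largest signed 32-bit Unix timestamp stands in for
# "the end of Unix time".
END_OF_UNIX_TIME = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)


def validity_ranges(starts):
    """Pair each start time with the next start; the last range is open ended."""
    ordered = sorted(starts)
    ends = ordered[1:] + [END_OF_UNIX_TIME]
    return list(zip(ordered, ends))


# Invented start dates for three defect files of one detector.
starts = [
    datetime(2015, 1, 1, tzinfo=timezone.utc),
    datetime(1970, 1, 1, tzinfo=timezone.utc),
    datetime(2018, 6, 1, tzinfo=timezone.utc),
]
ranges = validity_ranges(starts)
for valid_start, valid_end in ranges:
    print(valid_start.isoformat(), "to", valid_end.isoformat())
```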
The obs_ packages translated for this activity (obs_decam, obs_lsst, and obs_subaru) include a SConscript that will ingest the defects from the appropriate obs_*_data package into a temporary calibration repository inside the obs_ package.
These are not automatically discovered by the butler.
The defects can be used by other command line tasks via one of two options:
- The calibration repository created at scons time can be used to seed a calibration repository into which other products like flats may be ingested.
- The defects can be ingested directly into an existing calibration repository.
  An example command line for obs_test_data defects is (note that ingestDefects.py lives in pipe_tasks):
  ingestDefects.py path_to_butler_repository $OBS_TEST_DATA_DIR/test/defects --calib=path_to_calib_repo