Following is a first cut at how we might bring the obs packages under control. This is not meant to make the obs packages “right”, just to make them more homogeneous and easier to implement from scratch. I’m hoping for lively conversation that lets us crystallize a new design we can implement in a focused hack week later in the cycle.
What do obs packages currently contain?
Calibration information (Not including calibration images)
- Electronics (gain, read noise, overscan region, serial numbers, etc.)
- Camera geometry
Instrument specific data manipulation tools
- E.g. Native defect format -->
Instrument specific task configuration overrides
Instrument specific task subclasses
- `bypass_*` functions in the `CameraMapper` are documented in the `CameraMapper` class, but not in the subclasses. This leads to cargo culting of possibly incorrect usage.
- Calibration primacy and reproducibility are not obvious. It is not always clear what should be used for calibrations or where the calibration data came from originally. There’s also the question of how to keep code and calibrations up to date with each other.
- Conflation of calibration information with code configuration is a problem because they change on different time scales and because one is a function of the data acquisition and the other is closer to a runtime decision.
- `Mapper` is in limbo in the sense that it doesn’t belong concretely in either the DAX team or SciPi team sphere of responsibility.
- Ad hoc treatment: e.g. each obs package is using a different mechanism to transform calibration information from native format to the format needed by the stack.
- The bi-temporal problem – There is currently no way to specify an arbitrary combination of calibration products and the code that applies them: i.e. “reduce data as if it were 1995” and “rereduce data taken in 1995 with the latest and greatest” are the two extremes.
- Split current obs packages into two git repositories each
- Calibration repository: This will be a git(-lfs) repository containing all calibration data. The repository will also contain code and tests to allow generation of the calibration repository at `scons` time.
- Configuration repository: This will be a git repository of largely configuration information: e.g. dataset definitions, config overrides, `Mapper` subclasses. TBD is where the raw data ingest task overrides live. They could find a home in either repository.
- Provide defined mechanisms for manipulating and ingesting calibration data.
- Document clearly the non-calibration information. We should provide a cookbook for how to generate an obs package. This means clearly documenting which pieces are commonly (or necessarily) overridden.
- all calibration-like data in native format goes into a git repository specifically for holding these data.
- the calibration repository is built at scons time from the data in native format to solve the primacy issue
- discoverability is handled by valid date ranges in the calibration repository
- the calibration repository will be append-only: i.e. all versions of the calibration products will exist in the repo.
- The bi-temporal problem is naturally addressed by this design. At any time, a calibration repository of the entire history of the calibration products can be generated from the native formats. Git tags will need to be used to keep track of changes in how the calibrations are applied by e.g.
- `obs_base` will provide an ABC `Task` that will have the methods necessary for building the calibration repository. This may require coming up with a way to map calibs to valid ranges.
```python
class BuildCalibRepoTask(object):
    def run(self):
        # Each product is first converted from native format, then ingested.
        self.make_defects.run()
        self.ingest_defects.run()
        self.make_linearity.run()
        self.ingest_linearity.run()
        ...
```
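To make the abstract-base-class idea a little more concrete, here is a minimal runnable sketch. It uses Python's standard `abc` module rather than the stack's `Task` machinery, and the names `BuildCalibRepoTaskBase` and `ExampleCamBuildCalibRepoTask`, the method signatures, and the placeholder data are all hypothetical; none of this is an existing `obs_base` API.

```python
from abc import ABC, abstractmethod


class BuildCalibRepoTaskBase(ABC):
    """Hypothetical base class (not an existing obs_base API)."""

    def run(self):
        # Convert each native product to the stack format, then ingest it.
        self.ingest_defects(self.make_defects())
        self.ingest_linearity(self.make_linearity())

    @abstractmethod
    def make_defects(self):
        """Return defects converted from the instrument's native format."""

    @abstractmethod
    def ingest_defects(self, defects):
        """Store converted defects in the calibration repository."""

    @abstractmethod
    def make_linearity(self):
        """Return linearity data converted from the native format."""

    @abstractmethod
    def ingest_linearity(self, linearity):
        """Store converted linearity data in the calibration repository."""


class ExampleCamBuildCalibRepoTask(BuildCalibRepoTaskBase):
    """Toy subclass for an imaginary camera; the data are placeholders."""

    def __init__(self):
        self.repo = {}  # stands in for the generated calibration repository

    def make_defects(self):
        return [(10, 20, 1, 1)]  # placeholder (x, y, width, height) box

    def ingest_defects(self, defects):
        self.repo["defects"] = defects

    def make_linearity(self):
        return {"coefficients": [0.0, 1.0]}  # placeholder values

    def ingest_linearity(self, linearity):
        self.repo["linearity"] = linearity


task = ExampleCamBuildCalibRepoTask()
task.run()
```

Because the conversion and ingest steps are abstract, a new obs package that forgets to implement one of them fails at instantiation time rather than partway through a build.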
Note: We could add the image-like calibration data via multiple parents.
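The combination of valid date ranges and an append-only repository can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `CalibEntry` record, the `find_calib` helper, the paths, and the use of plain dates do not correspond to existing stack code. Each entry carries both the range of observation dates it applies to and the date it entered the repository; putting a cutoff on the second date is what separates "reduce as if it were 1995" from "rereduce with the latest and greatest".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CalibEntry:
    name: str
    valid_start: date  # first observation date the product applies to
    valid_end: date    # last observation date the product applies to
    added: date        # when this version entered the repository
    path: str          # hypothetical location of the product


def find_calib(entries, name, obs_date, as_of=None):
    """Return the newest matching entry known at ``as_of`` (default: no cutoff)."""
    candidates = [
        e for e in entries
        if e.name == name
        and e.valid_start <= obs_date <= e.valid_end
        and (as_of is None or e.added <= as_of)
    ]
    # Append-only: old versions are never deleted, newer ones simply win.
    return max(candidates, key=lambda e: e.added, default=None)


entries = [
    CalibEntry("defects", date(1995, 1, 1), date(1995, 12, 31),
               date(1995, 2, 1), "defects/v1"),
    CalibEntry("defects", date(1995, 1, 1), date(1995, 12, 31),
               date(2016, 6, 1), "defects/v2"),
]

obs = date(1995, 7, 4)
as_in_1995 = find_calib(entries, "defects", obs, as_of=date(1995, 12, 31))
latest = find_calib(entries, "defects", obs)
```

This covers only the calibration-product half of the bi-temporal problem; which version of the code applies the products would still need to be tracked separately, for example with the git tags mentioned above.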
This is mostly documentation.
- Document what the “magic” methods do and how to use them.
- Move as many dataset definitions as possible to `obs_base` and purge those not needed
- Document the process of subclassing the ingest tasks
- Identify common config overrides. Document required config overrides.
- Document required `VisitInfo` attributes. This will involve a bit of policy making, i.e. what to do when a needed piece of `VisitInfo` is missing for a particular algorithm. This policy should be enforced in code where possible.
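As a sketch of what enforcing such a policy in code might look like: the `REQUIRED_VISIT_INFO` table, the algorithm names, the attribute names, the `check_visit_info` helper, and the `SimpleNamespace` stand-in for `VisitInfo` are all hypothetical. The policy shown (fail early with a clear message) is only one of the options that would need to be decided.

```python
import math
from types import SimpleNamespace

# Hypothetical table: algorithm name -> attributes it cannot run without.
REQUIRED_VISIT_INFO = {
    "exampleAlgorithmA": ("exposureTime", "date"),
    "exampleAlgorithmB": ("exposureTime", "boresightAirmass"),
}


def is_missing(value):
    """Treat None and NaN as 'not set'."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_visit_info(visit_info, algorithm):
    """Raise ValueError listing every required attribute that is missing."""
    missing = [
        name for name in REQUIRED_VISIT_INFO[algorithm]
        if is_missing(getattr(visit_info, name, None))
    ]
    if missing:
        raise ValueError(
            f"{algorithm} requires VisitInfo attributes: {', '.join(missing)}"
        )


# Stand-in object; a real check would receive the exposure's VisitInfo.
info = SimpleNamespace(
    exposureTime=30.0,
    date="2016-06-01",
    boresightAirmass=float("nan"),  # deliberately unset
)
check_visit_info(info, "exampleAlgorithmA")  # passes silently
```

Keeping the requirements in one table would also give the cookbook a single place to document which attributes each algorithm needs.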