Reading the Rubin alert packets requires a schema that is not part of the packet but is found elsewhere. The schema can be fetched automatically by the reading code through a “schema registry” (SR option), or fetched by a human from somewhere, perhaps a git repo, and copied into the right place in the consumer code (GIT option). Schemas change from time to time, and all parties should be given adequate notice when they do.
SR means that the registry is a critical resource: if millions of alerts are cached by a consumer, they cannot be read unless the registry is up and running. I note that the current registry at slac.stanford.edu has been down for a week (since July 11), so brokers have been unable to do any software development during that time. I also note that if that software has automated execution of unit tests, nothing can run when the registry is down. Given the recent unreliability of the SR, any SR should at least be mirrored in several places.
A consistency problem arises if the schema is kept in other places – the consumer worries that the versions are different. In particular, I note that the GIT version of the 7.1 alert schema has doc strings, but the schema from the SR does not have these. Notice that the provision of SR mirrors adds to the worries about inconsistency.
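One way to make the doc-string difference less worrying is to compare two copies of a schema with the documentation removed. The following is a minimal sketch, not anything Rubin provides: the helper names `strip_docs` and `same_apart_from_docs` are hypothetical, and the two toy schemas stand in for a GIT copy and an SR copy.

```python
import json


def strip_docs(node):
    """Return a copy of an Avro schema (parsed JSON) without any "doc" keys."""
    if isinstance(node, dict):
        return {k: strip_docs(v) for k, v in node.items() if k != "doc"}
    if isinstance(node, list):
        return [strip_docs(v) for v in node]
    return node


def same_apart_from_docs(schema_a, schema_b):
    """True if the two schemas differ only in their doc strings."""
    def canonical(schema):
        # sort_keys makes the comparison independent of key order in each object;
        # the order of the fields list is preserved, as it matters to Avro.
        return json.dumps(strip_docs(schema), sort_keys=True)
    return canonical(schema_a) == canonical(schema_b)


# Toy stand-ins (assumptions, not the real 7.1 alert schema).
git_copy = {
    "type": "record",
    "name": "toy_alert",
    "doc": "A toy alert record",
    "fields": [{"name": "id", "type": "long", "doc": "unique identifier"}],
}
registry_copy = {
    "type": "record",
    "name": "toy_alert",
    "fields": [{"name": "id", "type": "long"}],
}

print(git_copy == registry_copy)                    # False
print(same_apart_from_docs(git_copy, registry_copy))  # True
```

The same check could be run between an SR and each of its mirrors. One caveat: a default value that happens to be a map with a key called "doc" would also be stripped, so this is a convenience check rather than a proof of equivalence.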
One of the advantages of the SR is that changes can be slipped in and the reading software will magically parse the new packets correctly, without any human in the loop. However, subsequent processing may well fail in spite of this. For example, if the attribute filterName is changed to band in the schema, everything will be great until downstream, when the packet is inserted in a database table or used to make a plot. Whether we use SR or GIT, we humans must be properly informed of upcoming changes, which rather undercuts a main advantage of SR.
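A consumer can at least detect this kind of rename before it reaches the database by comparing the field names of the schema it was built against with the one it has just been given. This is a rough sketch under stated assumptions: `field_names` and `diff_fields` are hypothetical helpers, the two schemas are toy examples, and only top-level fields are compared (nested records would need a recursive walk).

```python
def field_names(schema):
    """Names of the top-level fields of an Avro record schema."""
    return {field["name"] for field in schema.get("fields", [])}


def diff_fields(old_schema, new_schema):
    """Return (removed, added) field names between two schema versions."""
    old_names = field_names(old_schema)
    new_names = field_names(new_schema)
    return sorted(old_names - new_names), sorted(new_names - old_names)


# Toy schemas (assumptions) that mimic the filterName -> band rename.
old_schema = {
    "type": "record",
    "name": "toy_source",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "filterName", "type": "string"},
    ],
}
new_schema = {
    "type": "record",
    "name": "toy_source",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "band", "type": "string"},
    ],
}

removed, added = diff_fields(old_schema, new_schema)
print(removed, added)  # ['filterName'] ['band']
```

A non-empty `removed` list is a reasonable trigger for stopping ingestion and asking a human to look, which is the point: the change still needs a person.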
For myself, I would be happy to forget about confluent_kafka.DeserializingConsumer and go back to fastavro.schemaless_reader.
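For concreteness, here is roughly what the GIT option looks like with fastavro. It is a sketch under assumptions, not a tested Rubin consumer: it assumes each message carries the 5-byte Confluent wire-format header (a zero magic byte followed by a 4-byte big-endian schema ID) ahead of the Avro payload, and that the schema has already been loaded from a local copy as a Python dict. The function names `split_confluent_header` and `read_alert` are hypothetical.

```python
import io
import struct


def split_confluent_header(raw):
    """Split a message into (schema_id, avro_payload).

    Assumes the Confluent wire format: magic byte 0, then a 4-byte
    big-endian schema ID, then the Avro-encoded record.
    """
    if len(raw) < 5:
        raise ValueError("message too short for a 5-byte header")
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte {magic}")
    return schema_id, raw[5:]


def read_alert(raw, schema):
    """Decode one alert with a locally held schema; no registry is contacted."""
    # Imported here so the header helper has no third-party dependency.
    import fastavro

    schema_id, payload = split_confluent_header(raw)
    parsed = fastavro.parse_schema(schema)
    # Third argument is reader_schema; None means decode with the writer schema only.
    record = fastavro.schemaless_reader(io.BytesIO(payload), parsed, None)
    return schema_id, record


# The header helper on its own, with an arbitrary schema ID and dummy payload:
print(split_confluent_header(b"\x00\x00\x00\x02\xbd" + b"payload"))  # (701, b'payload')
```

The schema ID in the header is still useful in this setup: the consumer can compare it with the ID it expects and refuse to decode when they differ, rather than silently reading with the wrong schema.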