AVRO schema wrangling

The Lasair project is starting to get serious about AVRO and its schemas, and we have a few questions:

  • Where can I find the latest schema (.avsc) for the alert packets?
  • How does the schema registry work?
  • How can the community brokers find out that a new schema version is coming?
  • Will the production schema have the “doc” attribute that describes each attribute in human language, so we can pass it on to our users?
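For readers unfamiliar with the "doc" attribute mentioned in the last question: in an Avro schema, a record and each of its fields may carry an optional "doc" string. The sketch below shows how a broker might pull those strings out of an .avsc file to pass on to users. The schema fragment, the record name `exampleSource`, and the helper `field_docs` are all hypothetical and for illustration only; they are not taken from the actual alert schema.

```python
import json

# Hypothetical schema fragment; record and field names are illustrative only.
AVSC_TEXT = """
{
  "type": "record",
  "name": "exampleSource",
  "doc": "An example detection record.",
  "fields": [
    {"name": "ra", "type": "double", "doc": "Right ascension (deg)."},
    {"name": "decl", "type": "double", "doc": "Declination (deg)."},
    {"name": "flags", "type": "long"}
  ]
}
"""


def field_docs(schema):
    """Map each top-level field name to its "doc" string (None if absent)."""
    return {field["name"]: field.get("doc") for field in schema["fields"]}


docs = field_docs(json.loads(AVSC_TEXT))
for name, doc in docs.items():
    print(f"{name}: {doc or '(no doc)'}")
```

Since "doc" is optional, a field without one (like `flags` here) simply comes back as `None`, which is why the question of whether the production schema will include it matters.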

Hi, Roy! I am the forum watcher this week, but I admit I have no background on this particular topic. I see that @ebellm answered you when you asked similar AVRO-based questions back in January, so I’ve tagged him. I will also check around and get back to you a bit later today. Thanks!
Best regards,
Douglas

Roy, I asked on the #desc-alerts-topical-team Slack channel, and @kessler pointed to this link: link. (Many thanks, Rick!) It looks like this may answer your first two questions, which is a good start. I will see if anyone can answer your other two questions.
Thanks!
Best regards,
Douglas

Additional: It looks like @ebellm's message from January may answer your third question:

Also, @rknop notes in the #desc-alerts-topical-team Slack channel:
Note that those [schemas in link] are the schema for ELAsTiCC, which are not the same as the schema for LSST itself. ELAsTiCC is not using a schema registry.

Re: knowing when there are new versions, specifically with regard to ELAsTiCC, the best thing to do is to be in the #elasticc-comms channel. We will also keep the latest schema in that GitHub archive Rick linked, which is also linked from here: The DESC ELAsTiCC Challenge

Again, all of that is ELAsTiCC specific. I’m not sure if Roy was asking about alerts for ELAsTiCC, or for LSST in general.

Hi @roy:

Thank you Douglas and Eric!

Re. a schema registry, I’d suggest it’d be useful for the project to coordinate with the scimma.org team and GCN/TACH so there is one solution with community buy-in, in the same way AVRO itself was adopted. Particularly for projects where the schema itself is unlikely to evolve rapidly, a solution like Confluent seems overly complicated and potentially expensive for smaller projects.

Hi @gnarayan, always happy to talk. Those projects don’t have the bandwidth constraints that Rubin does and have more heterogeneous messages: they almost certainly should just embed their schemas in their messages. I agree that the schema registry adds complexity, maybe unnecessarily, but it doesn’t preclude other means of disseminating schemas. The schema registry is part of the open-source portion of the Confluent platform, though, and in any case it’s just a REST API; there’s no cost for clients.
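To make the bandwidth point concrete, here is a minimal sketch of what a registry client does, assuming messages use the standard Confluent wire format: one zero "magic" byte, a 4-byte big-endian schema ID, then the Avro-encoded body. Only five bytes of overhead travel with each message, and the full schema is fetched once from the registry's REST endpoint `/schemas/ids/{id}`. Whether the production alert stream uses exactly this framing is an assumption here, and the registry hostname and function names are hypothetical.

```python
import struct

MAGIC_BYTE = 0  # first byte of a Confluent-framed message


def split_confluent_frame(message: bytes):
    """Return (schema_id, avro_payload) from a Confluent-framed message."""
    if len(message) < 5:
        raise ValueError("message too short for a 5-byte header")
    # ">bI": big-endian, 1-byte magic followed by a 4-byte unsigned schema ID.
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte {magic}")
    return schema_id, message[5:]


def schema_url(registry_base: str, schema_id: int) -> str:
    """Build the REST URL that returns the schema with this ID."""
    return f"{registry_base.rstrip('/')}/schemas/ids/{schema_id}"


# Hypothetical message: schema ID 42 followed by two bytes of Avro data.
framed = struct.pack(">bI", MAGIC_BYTE, 42) + b"\x02\x04"
schema_id, payload = split_confluent_frame(framed)
url = schema_url("https://registry.example.org/", schema_id)
print(schema_id, payload, url)
```

A client would then issue a plain HTTP GET on that URL and cache the result by ID, so the lookup cost is paid once per schema version rather than once per alert.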