DM Monthly Status Report for April 2020

michellepoland · May 22, 2020, 7:52pm

The DM monthly status report covering April activities has been posted to DocuShare, collection-1022. For convenience, the High-level Summary is pasted below. Direct link to the full report: https://docushare.lsst.org/docushare/dsweb/Get/report-1062/DM%20Monthly%20Progress%20Report%20202004.pdf

High-level Summary

The Data Management team remains productive despite the transition to remote working due to the ongoing Covid-19 pandemic. While this has been challenging for everybody, the team has shown itself to be adaptable and ready to rise to the occasion.

As an example of this, the figure above shows the number of Jira tickets merged to the core Science Pipelines codebase since early 2019 as a function of time. While tickets are not a precise proxy for productivity, it is clear that momentum is being maintained despite the difficult circumstances.

Community Interactions, Meetings and Workshops

Members of the Data Management team supported the Joint Agencies Security Summit on April 6. The team also worked closely with our colleagues in the Rubin Pre-Operations Project to present the plans for Rubin Data Production at the Joint Agencies Review of Operations. Members of the team also attended the (virtual) South American Astronomy Coordination Committee Meeting, where they presented on the status of Rubin construction.

Technical Progress

Several documents were published or updated:

DMTN-108 was updated to reflect the discussion at the Joint Agencies Security Summit.

DMTN-120 was augmented with a substantial discussion of frameworks under consideration for persisting Rubin Observatory data products.

A new Conda and conda-forge-based distribution infrastructure for handling third-party dependencies for the Science Pipelines codebase was deployed. This both reduces the maintenance burden on the Project, and makes it easier for end users to install Science Pipelines code and integrate it with their own systems.

The definition of “visit” was decoupled from individual exposures throughout the Data Management middleware. This enables us to bring the code into line with the definitions used in project documentation, and will enable us to properly support the operational survey strategy.

Select Science Platform services were deployed on the Kubernetes cluster in La Serena to support ComCam activities at the Base Facility.

A number of key algorithmic enhancements were made during this month. In particular, we demonstrated execution of a complete solar system processing system prototype, including linking and initial orbit determination, on simulated data. This is a major step towards validating the ultimate Rubin Observatory solar system processing system.

We also deployed a substantially rewritten “decorrelation afterburner”, which provides markedly better results in image differencing; rolled out a new base class providing a standard interface to calibration products; developed a technique for mitigating amplifier-to-amplifier offsets; completed an analysis of DESC Data Challenge 2 data with the MultiProFit galaxy fitting tool; and upgraded fgcmcal with improved start-up performance and a “checkpointing” capability enabling runs to be restarted.

Precursor data from Hyper Suprime-Cam is now being regularly reprocessed and ingested into the Qserv instance at NCSA. A small slice of Dark Energy Science Collaboration DC2 data was also successfully ingested and subsequently published through the LSST Science Platform. This was all accomplished using newly developed ingest workflows and tooling.

To experiment with how the Rucio scientific data management system could be used within Rubin Data Management, a successful test of data transfer between Fermilab and NCSA was completed. Further, nine-CCD (raft-scale) image data were produced with proper headers from the ComCam systems on the NCSA test stand and then transferred to storage for the Science Platform.

The fiber between Santiago and La Serena suffered two major cuts this month. Because the service provider exceeded the contracted repair time, the associated fines were issued. Fail-over to the secondary occurred with no loss of data.