Users Committee Report (Dec 13, 2024) and Response (Jan 14, 2025)

This report reflects discussions at both the Rubin Community Workshop 2024 and the Users Committee (UC) meeting held on October 28, 2024.

The meetings started with a listening session. Members from the community were invited to share their views, questions, and possible problems that they encountered. This was followed by a public UC discussion and then a private UC discussion.

Findings:

  • In DP0, large tables are indexed by coordinates. However, queries that rely on other columns are slow. If a document exists that describes the database in detail, including which indexes exist, it should be made easy to find.

  • We recommend implementing secondary indexes to enable faster queries (e.g., for magnitude and redshift). We recommend that a system is put in place so that members of the community can request new indexes seamlessly, in a way that makes it straightforward for the Rubin Team to assess feasibility and proceed to implementation.

  • The UC found that more can be done to disseminate the API capabilities of the Rubin Science Platform (RSP), as well as RSP updates. We recommend adding the UC to the recipients of information about relevant updates that should be communicated broadly, so that the UC can be of help in this.

  • The broker page (lsst.org/scientists/alert-brokers) needs updating. We also recommend that the new survey strategy page (https://survey-strategy.lsst.io/) be made easier to find by providing prominent links to it in multiple places. For example, it would be good to have a direct and more visible link to it from lsst.org/scientists. We recommend that this page in particular be kept up to date and include all the information needed for a person joining the Rubin community to get started.

  • We recommend that a person from the Rubin Team is appointed in charge of overseeing the documentation, for example making sure that the survey strategy page is up to date. The UC is willing to work together with the Rubin Team to identify which pages should be prioritized.

  • Some users make extensive use of the Community Forum. We recommend encouraging people who raise issues that the Rubin Team is addressing to start a topic in Community, so that they can receive direct feedback there on the work done.

  • In the future, the UC should let the community know a few weeks in advance when requesting feedback.

  • The UC has discussed the growing concern regarding data storage and scalability of the RSP. Groups are starting to think about the resources they need, but the availability of computational resources (e.g., IDACs) is unclear. The UC was referred to the latest Science Advisory Committee (SAC) report (https://project.lsst.org/groups/sac/sites/lsst.org.groups.sac/files/Rubin%20SAC%20report%20Aug2024.pdf) and made aware that a dedicated Resource Allocation Committee will be established. We stress the importance of transparency on available resources and detailed numbers as soon as they become available.

  • The UC acknowledges the community interest in engaging with the LSST Education and Public Outreach (EPO) team. Specifically, there is a need for places that work as forums for the public to post thoughts and ideas. The UC has identified some existing resources, such as:

Rubin Observatory thanks the UC for their 2024B report and has prepared responses that detail the actions taken (or to be taken soon) as a result.

Finding 1

In DP0, large tables are indexed by coordinates. However, queries that rely on other columns are slow. If a document exists that describes the database in detail, including which indexes exist, it should be made easy to find.

Response

We agree that information about how the DP0 tables are sharded and indexed was not easily discoverable (e.g., did not appear in the schema browser or in the DP0.2 DPDD, nor when browsing a table in the Portal, though it was demonstrated in query instructions and tutorials). The UC is right to point out that this information should be easier to find.

For a start, for DP0, a section about table sharding and indexing has been added to the DP0.2 DPDD page (under Catalogs). There is also a new ticket to clarify column indexing in, e.g., the schema browser (ticket SP-1816).

To summarize the newly-added section referenced above:

The catalog database can be thought of as being divided up by spatial region (shard) and distributed across multiple servers. Queries with spatial constraints that minimize the number of shards searched are much faster than queries with no (or very wide) spatial constraints. There are three “table indices” columns (objectId, diaObjectId, and sourceId) that can be thought of as encoding information about the object’s shard, and queries that include constraints on these columns are also faster.

Queries without spatial constraints must access every shard of the database and will always be significantly slower (e.g., all-sky queries by photometric or time-domain characteristics). More on this in the response to Finding 2, below.
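As a rough illustration of the difference, the sketch below builds three ADQL query strings: one restricted to a small circle on the sky, one constrained on objectId, and one with only a photometric cut. The table name (dp02_dc2_catalogs.Object), the column names (coord_ra, coord_dec, r_cModelFlux), the sky position, and the object ID are assumptions chosen for illustration; please check them against the schema browser before use. The helper function is hypothetical and only assembles text. It does not submit anything to the TAP service.

```python
def cone_search_adql(table, ra_deg, dec_deg, radius_deg, columns, extra_where=None):
    """Build an ADQL query restricted to a circle on the sky.

    Assumes the table stores positions in columns named coord_ra and
    coord_dec (an assumption for this sketch).
    """
    select = ", ".join(columns)
    where = (
        "CONTAINS(POINT('ICRS', coord_ra, coord_dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) = 1"
    )
    if extra_where:
        # Non-spatial cuts are applied on top of the spatial restriction.
        where += f" AND {extra_where}"
    return f"SELECT {select} FROM {table} WHERE {where}"


TABLE = "dp02_dc2_catalogs.Object"  # assumed table name

# Spatially constrained: only the shards overlapping the circle are searched.
fast_spatial = cone_search_adql(
    TABLE, 62.0, -37.0, 0.05,
    ["objectId", "coord_ra", "coord_dec", "r_cModelFlux"],
    extra_where="r_cModelFlux > 1000",
)

# Constrained on an ID column that encodes the shard (hypothetical ID value).
fast_by_id = f"SELECT objectId, r_cModelFlux FROM {TABLE} WHERE objectId = 1234567890"

# No spatial constraint: every shard must be searched, so this is the slow case.
slow_all_sky = f"SELECT objectId, r_cModelFlux FROM {TABLE} WHERE r_cModelFlux > 1000"
```

The first two strings follow the pattern described above (a tight spatial constraint, or a constraint on one of the ID columns), while the third is the kind of all-sky photometric query that has to visit every shard.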

Finding 2

We recommend implementing secondary indexes to enable faster queries (e.g., for magnitude and redshift). We recommend that a system is put in place so that members of the community can request new indexes seamlessly, in a way that makes it straightforward for the Rubin Team to assess feasibility and proceed to implementation.

Response

We appreciate that the UC would like to see users provided with the means for faster queries. However, because the data are distributed spatially across servers (as described above), queries over non-spatial columns will still perform significantly slower than area-restricted queries even if additional indices on measurement columns are provided. The Qserv team will continue to work on optimizing high-selectivity non-spatial queries, though it might not be through indices per se, and we do want to continue to hear about common queries and user expectations. To that end, we encourage the community to continue to report their experiences via the Forum or via the UC.
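A toy model may help show why the spatial distribution dominates. The sketch below is not Qserv's actual partitioning scheme: it assumes, purely for illustration, that the sky is cut into 10-degree strips in right ascension and that each strip is one shard. All names and the shard width are hypothetical.

```python
SHARD_WIDTH_DEG = 10.0  # hypothetical: each shard covers a 10-degree strip in RA
N_SHARDS = int(360.0 / SHARD_WIDTH_DEG)


def shard_of(ra_deg):
    """Index of the toy shard containing a given right ascension."""
    return int((ra_deg % 360.0) // SHARD_WIDTH_DEG)


def shards_for_ra_range(ra_min, ra_max):
    """Shards visited by a query limited to ra_min <= RA <= ra_max.

    Assumes ra_min <= ra_max, i.e. no wrap-around at 360 degrees.
    """
    return set(range(shard_of(ra_min), shard_of(ra_max) + 1))


def shards_for_non_spatial_query():
    """A cut on, e.g., magnitude says nothing about position,
    so every shard has to be visited."""
    return set(range(N_SHARDS))


narrow = shards_for_ra_range(61.9, 62.1)    # touches a single shard
everywhere = shards_for_non_spatial_query() # touches all 36 shards
```

In this picture, an index on a measurement column could speed up the work done inside each shard, but the query would still have to be dispatched to every shard and the results gathered, which is consistent with the point made above that such queries remain slower than area-restricted ones.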

Finding 3

The UC found that more can be done to disseminate the API capabilities of the Rubin Science Platform (RSP), as well as RSP updates. We recommend adding the UC to the recipients of information about relevant updates that should be communicated broadly, so that the UC can be of help in this.

Response

The Community Science Team (CST) already has work plans in place that are relevant to this finding. Regarding “disseminate the API capabilities”, we will be filling in the basic API information at rsp.lsst.io/guides/api (ticket SP-1604). Regarding “RSP updates”, we will be creating a log of RSP updates at rsp.lsst.io (ticket SP-1648), similar to the existing logs that the CST maintains for updates to the tutorials (e.g., the log for DP0.2 tutorials). Major RSP updates would also be advertised in, e.g., the biweekly Rubin Digest posted to the Community Forum. The UC will be able to use these two resources for more frequent RSP updates, but we would also be happy to announce minor updates to UC members via Slack or their mailing list.

Finding 4

The broker page (lsst.org/scientists/alert-brokers) needs updating. We also recommend that the new survey strategy page (survey-strategy.lsst.io) be made easier to find by providing prominent links to it in multiple places. For example, it would be good to have a direct and more visible link to it from lsst.org/scientists. We recommend that this page in particular be kept up to date and include all the information needed for a person joining the Rubin community to get started.

Response

We think the new “For Scientists” webpages at rubinobservatory.org/for-scientists, released by the CST on Mon Dec 16, satisfy this finding. We have more updates planned through early 2025 (ticket SP-1569), but don’t hesitate to reach out with typos, mistakes, comments, etc. The “old” pages, lsst.org/scientists, will remain while we finish the migration. We will then start to add redirects and deprecate the old pages in early 2025.

Finding 5

We recommend that a person from the Rubin Team is appointed in charge of overseeing the documentation, for example making sure that the survey strategy page is up to date. The UC is willing to work together with the Rubin Team to identify which pages should be prioritized.

Response

All technical documentation is already owned by Rubin teams. The survey strategy documentation at survey-strategy.lsst.io is owned by the Survey Scheduling team within the System Performance department. The survey-strategy.lsst.io pages have been updated to reflect the v4.0 baseline, including the updates to the baseline summary page at survey-strategy.lsst.io/baseline/changes.html. As future updates are released and announced on Community, additional information will be added to the changes page.

Finding 6

Some users make extensive use of the Community Forum. We recommend encouraging people who raise issues that the Rubin Team is addressing to start a topic in Community, so that they can receive direct feedback there on the work done.

Response

Asking people to start new topics in the Community Forum is the primary mode of user support for Rubin staff. This was communicated to Rubin staff first in the Interim Model for Community Support (DMTN-122), and the Forum is the core of the Community Science Model (RTN-006). We agree that the UC and all users should be encouraging each other to use the Forum and thank the UC for raising this.

Finding 7

In the future, the UC should let the Community know a few weeks in advance when requesting feedback.

Response

N/A, as this UC finding applies to the UC.

Finding 8

The UC has discussed the growing concern regarding data storage and scalability of the RSP. Groups are starting to think of resources needed, but the availability of computational resources (e.g., IDACs) is unclear. The UC was referred to the latest Science Advisory Committee (SAC) report and made aware that a dedicated Resource Allocation Committee will be established. We stress the importance of transparency on available resources and detailed numbers as soon as they become available.

Response

We agree that the scope of the IDACs and the Resource Allocation Committee (RAC) remains unclear, and that their existence is largely unknown to the broader community at this point. This is largely because they are still in development. In 2025, the CST will be working with NOIRLab to help establish the RAC; will be working with IDACs to understand and help document their resources; and will update the new For Scientists website accordingly. (Some relevant tickets include SP-1635, SP-1636, SP-1638, SP-1758, and SP-1725.)

Finding 9

The UC acknowledges the community interest in engaging with the LSST Education and Public Outreach (EPO) team. Specifically, there is a need for places that work as forums for the public to post thoughts and ideas. The UC has identified some existing resources, such as:

Response

We thank the UC for raising this and additionally point to the Rubin Community Forum’s “EPO” category, which is the primary venue for EPO-related discussions. (In the Forum, tags are used just to add keywords to topics posted in different categories.) We’d also like to take this opportunity to highlight that the Rubin EPO team is collecting the names and emails of scientists interested in being among the first to run Citizen Science programs with LSST data via this form.

The EPO team’s formal education program has also developed a large community of practice, with both a social media (Facebook) group and an email discussion group, as well as teacher training workshops. This community of practice has already demonstrated a strong exchange of thoughts and ideas. The Facebook group can be joined by searching for “Rubin Observatory Educators”, and the mailing list can be joined by emailing education@lsst.org.