DP1 butler access outside RSP

skoposov · July 1, 2025, 11:41am

Hi,

I was trying to understand whether I can access the DP1 Butler from outside the RSP.
I looked over https://dmtn-283.lsst.io/ and, if I understood correctly, it implies that there is or will be a way to do that, but I could not find any examples.
My use case is that I may want to fetch quite large parts of a table, or a whole table, from outside the RSP without TAP (as it introduces too much overhead), and my understanding is that this can be done efficiently through the Butler.

Is there functionality for this? If there is, can you point me to where to look? Alternatively, maybe there is a better way of fetching large chunks of tables.

Thank you,
Sergey

MelissaGraham · July 1, 2025, 3:18pm

Hi @skoposov, thanks for your question. The RSP has been built to enable users to do their scientific analysis without bulk downloads, and the focus has been on building up the analysis tools. While DP1 is relatively small, future datasets will be far too large to download. The ideal solution here would be to help you do your analysis in the RSP, if possible. Can we hear more about your science goals?

(Side note: in case you are an IDAC developer, those data transfers are arranged separately.)

skoposov · July 1, 2025, 4:16pm

Hi Melissa,

I understand that there is the RSP, and the motivation behind it.
I am, however, hosting a semi-private database at Cambridge University with most major astronomical survey catalogs. I am obviously not planning to host the entire LSST data set; I am more interested in the stacked catalogues, and maybe subsets of columns, hence my question. Right now, with DP1, I can easily fetch the catalogs by chunking the TAP queries, but that is not ideal or scalable for the future. I also understand that this is not the main advertised way of accessing the LSST data.

Sergey

MelissaGraham · July 1, 2025, 5:59pm

Hi @skoposov, thanks for clarifying your use case.

I must raise a warning here that including proprietary Rubin data in a database that is only semi-private sounds like it would be in violation of the Rubin data policy. A fully private database that was accessible only by Rubin data rights holders would be OK, but such restricted access poses a technical challenge for database maintainers, of course.

I have a couple of resources that might help here. We call third-party systems hosting Rubin data “Independent Data Access Centers” (IDACs). There are guidelines for IDACs, and ways for IDACs to delegate authentication and authorization of their Rubin data users to the Rubin Science Platform. This is not a trivial process and does require development work.

Hosting only the post-proprietary (>2 years past release) stacks and catalogs is also a potential future use-case for IDACs. The capacity for bulk data transfers is not unlimited, though, and the process here remains TBD.

Does this cover the kind of information you’re looking for?

skoposov · July 1, 2025, 7:26pm

Hi Melissa,

I am in full control of who gets access to the database and to specific schemas, so the LSST data will be accessible only to specific people who have data rights.

I will take a look at the IDAC policies/guidelines.

Thank you