First, I want to thank everyone who participated in the work and writing of the DP1 paper. I found it extremely well written and informative.
We are working on a pipeline to search for optical counterparts to high-energy transients. For this purpose, we have prepared a set of visit images injected with fake sources, initially using data from DP0.2 and now from DP1. We are using these images to test custom detection pipelines that will be fine-tuned for the sources of interest in our research.
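For concreteness, the injection step we have in mind looks roughly like the sketch below, written here against the lsst.source.injection package from recent Science Pipelines versions. This is illustrative rather than exactly our code, and the catalogue bounds, magnitudes, and seed are placeholder values.

```python
# Sketch: inject a handful of fake stars into a processed visit image using
# lsst.source.injection. All numeric values are placeholders.
from lsst.source.injection import (
    VisitInjectConfig,
    VisitInjectTask,
    generate_injection_catalog,
)


def inject_fakes(exposure, ra_lim, dec_lim):
    """Return a copy of `exposure` with fake stars injected, plus the catalogue.

    exposure : lsst.afw.image.ExposureF, e.g. a DP1 visit_image
    ra_lim, dec_lim : [min, max] sky limits in degrees covering the exposure
    """
    injection_catalog = generate_injection_catalog(
        ra_lim=ra_lim,
        dec_lim=dec_lim,
        number=20,                 # how many positions to generate
        seed="1234",               # for reproducible catalogue generation
        source_type="Star",
        mag=[20.0, 23.0],          # magnitudes permuted over the positions
    )
    inject_task = VisitInjectTask(config=VisitInjectConfig())
    result = inject_task.run(
        injection_catalogs=[injection_catalog],
        input_exposure=exposure.clone(),
        psf=exposure.getPsf(),
        photo_calib=exposure.getPhotoCalib(),
        wcs=exposure.getWcs(),
    )
    return result.output_exposure, result.output_catalog
```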
Our approach so far has been to take bits and pieces from the LSST pipeline (e.g., routines from the diffim package) and use them to produce custom catalogues, tables, and data structures to be later searched for interesting transients. This has some unfortunate consequences:
We are writing a lot of code that is probably reinventing the wheel and is neither as efficient nor as battle-tested as what is already in place.
Our data products do not conform to those produced by LSST, which means we need to develop separate analysis pipelines and glue code to handle both our custom data products and the official LSST data products.
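To make the "bits and pieces" point concrete, the kind of direct call involved is sketched below; the task name and run() arguments follow a recent ip_diffim version and may differ in others.

```python
# Sketch: run Alard-Lupton image subtraction directly on in-memory exposures,
# outside the full pipeline. The template is assumed to be already warped onto
# the science image's WCS and bounding box.
from lsst.ip.diffim.subtractImages import AlardLuptonSubtractTask


def difference_pair(science, template, sources):
    """Return the difference image for one science/template pair.

    science, template : lsst.afw.image.ExposureF
    sources : lsst.afw.table.SourceCatalog detected on the science image,
              used to select PSF-matching kernel candidates
    """
    task = AlardLuptonSubtractTask(config=AlardLuptonSubtractTask.ConfigClass())
    result = task.run(template, science, sources)
    return result.difference
```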
Now my question: is it feasible to run the whole DIA pipeline on a (small-sized) set of user-provided visit and template images, using custom configurations, and receive as output fresh (and small-ish) versions of at least the DiaSource, DiaObject, and ForcedSourceOnDiaObject catalogs?
If this is achievable, does anyone have a pointer on where to start looking for a hook?
I have checked the docs, the official DP0.2/DP1 tutorials, and the community forum search, but could not find anything answering this question.
Yes, although you need to start using butler. There are other people investigating using their own data in the Rubin pipelines. See for example the discussion here:
It’s a bit tricky to set up a butler when you haven’t started with raw exposures and want to go straight to processed visits, but it is possible, although we don’t really have any tooling for that (to calculate the detector regions for you from the FITS WCS, for example).
If you look at the software paper (PSTN-019) you will see a discussion of a lot of the issues, including fake source injection.
Thank you Tim. I will look into creating custom datasets and collections with the butler, and into the WCS problem you mention. I will probably come back with a few more questions.
The process to do all this starting from raws does have some documentation. The problem with starting from a visit is that you have to define the visit dimension records manually to allow your files to be ingested. You also need to worry a little bit about how compatible your FITS files are with what the downstream pipelines expect, in terms of calibration extensions that you likely don’t have in your files.
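To sketch what the manual route looks like end to end (the repo path, dataset-type name, collection, and data ID values below are placeholders, and the dimension-record step, which has no tooling, is only indicated in a comment):

```python
# Sketch: create a fresh butler repo and put an already-processed visit image
# into it, without ever ingesting raws. Assumes a recent Science Pipelines
# stack and LSSTComCam-like data; adapt the instrument class as needed.
from lsst.afw.image import ExposureF
from lsst.daf.butler import Butler, DatasetType
from lsst.obs.lsst import LsstComCam

Butler.makeRepo("my_repo")                      # create an empty repository
butler = Butler("my_repo", writeable=True)

# Registering the instrument fills in the instrument, detector, and
# physical_filter dimension records for you.
LsstComCam().register(butler.registry)

# The fiddly part: visit (and related) dimension records must be inserted by
# hand with butler.registry.insertDimensionData() before anything with a visit
# in its data ID can be stored; the required fields depend on the
# dimension-universe version, so they are not spelled out here.

# Declare a dataset type for the processed visit images you already have.
dataset_type = DatasetType(
    "injected_visit_image",
    dimensions=("instrument", "visit", "detector"),
    storageClass="ExposureF",
    universe=butler.registry.dimensions,
)
butler.registry.registerDatasetType(dataset_type)
butler.registry.registerRun("u/my_username/injected_images")

# Read a pipeline-written FITS file (image/mask/variance extensions) and store it.
exposure = ExposureF("my_injected_visit_image.fits")
butler.put(
    exposure,
    "injected_visit_image",
    instrument="LSSTComCam", visit=123, detector=4,   # placeholder data ID
    run="u/my_username/injected_images",
)
```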
I just want to add a few more points to what Tim has said, so that we can potentially mark this Support request as solved.
It sounds like you’re looking to start with processed visit images (not raws, and not your own FITS), inject fake sources, and re-run difference imaging and source detection. As Tim says, yes this is feasible (in part now, and in full in the future), and you’ll need to use the butler.
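For orientation, reading a processed visit image through the butler looks like the sketch below; the repo and collection strings follow the DP1 conventions on the RSP, so check them against the tutorials in your environment.

```python
# Sketch: read a DP1 processed visit image (visit_image) through the butler.
from lsst.daf.butler import Butler

butler = Butler("dp1", collections="LSSTComCam/DP1")

# Discover a few existing visit_image datasets and their data IDs ...
refs = butler.query_datasets("visit_image", limit=5)
print([ref.dataId for ref in refs])

# ... and retrieve one of them as an exposure object.
visit_image = butler.get(refs[0])
```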
The most relevant DP1 tutorials that demonstrate how to do these kinds of things are in the series of tutorial notebooks “105. Image reprocessing”. In particular, 105.6 sets up a custom butler to store pipeline outputs. Executed versions of these tutorial notebooks are available in the DP1 documentation, and executable ipynb files are available in the Notebook Aspect of the RSP.
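For example, once a writeable butler is configured (105.6 shows the recommended setup), one way to drive a pipeline subset over a small data query from Python is with SimplePipelineExecutor; the output collection, pipeline file, and data query below are placeholders.

```python
# Sketch: run a pipeline (or a subset of one) over a small data query from
# Python. prep_butler chains a new output collection with the inputs and
# returns a butler ready for the executor.
from lsst.ctrl.mpexec import SimplePipelineExecutor

butler = SimplePipelineExecutor.prep_butler(
    "dp1",                                    # butler repo (or your own writeable repo from 105.6)
    inputs=["LSSTComCam/DP1"],                # input collection(s)
    output="u/my_username/my_rerun",          # output collection (placeholder)
)
executor = SimplePipelineExecutor.from_pipeline_filename(
    "my_pipeline.yaml",                       # pipeline definition (placeholder)
    where="visit = 123456 AND detector = 4",  # small placeholder data query
    butler=butler,
)
quanta = executor.run(register_dataset_types=True)
```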
However, as you probably noticed, there is not a DP1 tutorial for difference imaging. It is not possible to do this with the DP1 dataset (but will be in the future), for reasons described in the topic “Issues with Image Subtraction”.
I think that wraps up the answers for what is feasible and where to look for demonstrations, so I’m going to mark this post as the solution for this topic. But please do feel free to start a new topic for new questions or issues related to your project, any time.