Questions and impressions from a complete amateur who just completed all 61 DP1 Tutorial Notebooks

JohnStorgion · August 11, 2025, 1:44pm

Hello, I’m a complete novice/enthusiast who is simply interested in astronomy and inspired by what Rubin/LSST is doing. I have a background in CS, Mathematics, and a little bit of Data Analytics, so when I saw I could register with the Rubin Science Platform, I jumped at the opportunity and hurled myself headfirst into the Notebook aspect of the RSP.

I have since spent the last 1.5 months chipping away at every single DP1 tutorial as a side project in my free time, and I finally finished up this morning. I have learned a ton, and have gone from absolutely no experience with astronomical data to at least a baseline understanding of concepts like PSF, flux, redshift, magnitudes, and the general way Rubin actually detects and processes images.

However, this has also absolutely humbled me and made me realize how much I don’t know. As such I thought I’d ask the community some of the things that still aren’t clear to me after 61 tutorials, and ask for advice on next steps/interesting areas to contribute or continue my learning.

Some top questions:

1. What is the difference between a “Source” and an “Object”? I had a hard time understanding if sources had multiple objects or objects had multiple sources… or is it both? The overall hierarchy of the different detections is still pretty hard for me to keep track of.
2. There are many tutorials that deal with “Forced” photometry and PSF, as opposed to… “not forced” I suppose? What is meant by “forced” in this context and how does that affect how the data is handled?
3. I’m curious as to what is meant practically when “errors” and “estimations” come up. How can the telescope tell it made a mistake? Is this to account for atmospheric conditions or lensing or redshift, things that would warp the image? I naively thought the telescope simply took pictures of what it could see and that was that, but clearly there’s a lot more sophistication happening under the hood.
4. Given that fully processed coadd images are still in black and white, how are the fully colored/stitched-together images available to the public (like in skyviewer.app) created? I suppose they are “artist’s renditions” with regard to coloring, but I’m definitely curious to learn more about that final processing step.
5. Where can I go from here? I’d love to learn how to use the API aspect, or start doing actual analysis or science with what I’ve learned, but I have no idea where to begin. Are there other projects to contribute to? Courses to better learn the concepts from the tutorials that assume knowledge I don’t have? Recommended next steps?

Thanks for your time and I’m looking forward to following all the discoveries the LSST community makes!

John S

dtaranu · August 11, 2025, 10:52pm

Hi John,

1. You might be interested in this page on Data Science with LSST, which includes a link to the glossary and excerpts the definition of Object and Source from there.
2. Forced generally means a measurement that has been run with some of the free parameters fixed, i.e. photometry using a fixed centroid/position from another earlier measurement.
3. There are always uncertainties in measurements because the images are noisy (mainly read noise in the detectors and “shot” noise from randomness in the number of photons that are detected in each pixel). There are also many more sources of systematic uncertainties that are harder to estimate and are not all propagated into the errors in measured properties/parameters.
4. The method used in skyviewer is not quite fully documented yet, but Lupton et al. 2004 has an explanation of… essentially its predecessor, which is used quite commonly in astronomy.
5. That’s a good question that I don’t have an answer for.
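On the color-image question, the core idea in Lupton et al. 2004 can be sketched in a few lines. This is only a simplified illustration of an asinh stretch that preserves color, not the method skyviewer uses; the function name `asinh_rgb` and the default values of `stretch` and `Q` are hypothetical choices made for this example. Which band feeds which channel is also a choice made by whoever builds the image.

```python
import math

def asinh_rgb(r, g, b, stretch=0.5, Q=8.0):
    """Map three band fluxes for one pixel to displayable RGB values in 0..1.

    stretch and Q are illustrative tuning knobs (hypothetical defaults).
    """
    # Negative fluxes are just noise; clip them to zero for display.
    r, g, b = max(r, 0.0), max(g, 0.0), max(b, 0.0)
    intensity = (r + g + b) / 3.0
    if intensity <= 0.0:
        return (0.0, 0.0, 0.0)
    # One shared scale factor for all three bands keeps the band ratios
    # (the color) unchanged while compressing the brightness range.
    scale = math.asinh(Q * intensity / stretch) / (Q * intensity)
    rgb = [r * scale, g * scale, b * scale]
    # If any channel would saturate, rescale all three together instead of
    # clipping one channel, which would shift the color toward white.
    peak = max(rgb)
    if peak > 1.0:
        rgb = [c / peak for c in rgb]
    return tuple(rgb)

# A faint pixel and a bright pixel with the same band ratios.
faint_pixel = asinh_rgb(0.1, 0.2, 0.3)
bright_pixel = asinh_rgb(100.0, 200.0, 300.0)
print(faint_pixel, bright_pixel)
```

Both pixels come out with the same red-to-blue ratio, so the colors reflect real differences between the bands rather than an artist's choice; only the brightness scaling is a presentation decision.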

jscargle · August 11, 2025, 11:56pm

John, it is great to hear this story of your involvement. I can try to answer your excellent questions from a general point of view; others can provide input more specific to Rubin.

1. The terms “source” and “object” are often used interchangeably. But roughly, source tends to mean something in data (a blob at such-and-such point in an image is a source) and object tends to mean the reality (the nebula M1 is an object). But this distinction is far from rigid …
3. Ahhhh! This is a big, great question. Simply stated, any observation falls short of reality. The difference between observation and reality is the “error” … a great deal of data processing is devoted to attempting to extract a better representation of the truth, correcting as much as possible for errors in the measurements. Statisticians elevate this process by calling it “estimation” … So I think you have the basic idea.
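To make “error” concrete for the read noise and shot noise mentioned earlier in the thread, here is a minimal sketch of the textbook CCD noise estimate for a flux measured in an aperture. It assumes counts in electrons and ignores dark current and all systematic effects; the function name and the numbers are made up for illustration and are not taken from the Rubin pipelines.

```python
import math

def flux_uncertainty(source_e, sky_e_per_pix, read_noise_e, n_pix):
    """Random (statistical) uncertainty, in electrons, of an aperture flux.

    Hypothetical helper: shot noise follows Poisson statistics, so its
    variance equals the number of detected electrons (source plus sky);
    read noise adds read_noise_e**2 of variance for each pixel read out.
    """
    shot_variance = source_e + n_pix * sky_e_per_pix
    read_variance = n_pix * read_noise_e ** 2
    return math.sqrt(shot_variance + read_variance)

# Illustrative numbers only.
source = 10_000.0  # electrons from the source inside the aperture
sigma = flux_uncertainty(source, sky_e_per_pix=500.0, read_noise_e=5.0, n_pix=50)
snr = source / sigma

# For small errors, the magnitude uncertainty is about (2.5 / ln 10) / SNR.
mag_err = 2.5 / math.log(10.0) / snr
print(f"sigma = {sigma:.1f} e-, SNR = {snr:.1f}, mag error = {mag_err:.3f}")
```

So the telescope does not "know" it made a mistake; the pipeline estimates how much a measurement would scatter if it were repeated, given the known noise sources.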

MBilicki · August 12, 2025, 5:06pm

If I could add a bit more on “forced photometry”. Typically it will mean measuring e.g. fluxes of astronomical sources in a given passband (photometric filter) using some information from another, usually “better”, passband.
For example: we have one passband that has the best seeing and/or signal-to-noise properties among several passbands that our telescope observes in. We start by detecting sources, and measuring their properties such as sky location, flux and size, in that best passband. Then we use the resulting source list with locations, and perhaps size information (for extended sources), to measure fluxes in the other, “worse” passbands. We “enforce” photometric measurements in these other passbands.
“Not forced” (or single-band) photometry would typically mean independent source detections and measurements in several individual passbands as if the others weren’t available.
The forced photometry approach is important if we need multi-band measurements for our sources even when they have very low signal-to-noise, or are undetected altogether, in some of the passbands. A non-detection, or an upper limit on the flux, in some of the bands is often informative too. For instance, distant galaxies have their light “redshifted”, which makes them fainter in the bluer bands than they would be if they were nearby, while they remain bright in the redder bands. Such a “dropout” method is often used to search for high-redshift (i.e. very distant) galaxies.
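The procedure described above can be sketched with a toy example: detect a source in a "good" band, then measure the flux at that same fixed position in a "worse" band where the source would not be detected on its own. Everything here is a simplified assumption (a single Gaussian source, a circular aperture, brightest-pixel detection, made-up fluxes) and none of the function names come from the LSST Science Pipelines.

```python
import math
import random

def make_image(size, x0, y0, flux, sigma_psf, noise, rng):
    """Toy image: one Gaussian-shaped source plus Gaussian pixel noise."""
    norm = 2.0 * math.pi * sigma_psf ** 2
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            r2 = (x - x0) ** 2 + (y - y0) ** 2
            profile = math.exp(-r2 / (2.0 * sigma_psf ** 2)) / norm
            row.append(flux * profile + rng.gauss(0.0, noise))
        image.append(row)
    return image

def detect_peak(image, threshold):
    """Return (x, y) of the brightest pixel at or above threshold, else None."""
    best = None
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value >= threshold and (best is None or value > best[0]):
                best = (value, x, y)
    return None if best is None else (best[1], best[2])

def aperture_flux(image, x0, y0, radius):
    """Sum the pixels whose centers lie within radius of (x0, y0)."""
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2:
                total += value
    return total

rng = random.Random(42)
# Reference band: bright source, easily detected.
ref_band = make_image(31, 15, 15, flux=5000.0, sigma_psf=2.0, noise=1.0, rng=rng)
# Other band: same sky position, but the peak pixel sits below a 5-sigma cut.
faint_band = make_image(31, 15, 15, flux=40.0, sigma_psf=2.0, noise=1.0, rng=rng)

position = detect_peak(ref_band, threshold=5.0)
independent = detect_peak(faint_band, threshold=5.0)
# Forced measurement: position is fixed from the reference band, not refit.
forced = aperture_flux(faint_band, position[0], position[1], radius=5.0)
print("reference position:", position)
print("independent detection in faint band:", independent)
print("forced flux in faint band:", round(forced, 1))
```

The forced flux is noisy, and for a dropout it may be consistent with zero, but it is still a measurement at a known position and can be combined with the other bands.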

JohnStorgion · August 15, 2025, 1:11pm

Thank you Jeff! Are there further resources you’d recommend I could use or take a look at for better learning these concepts? The amount of information out there is definitely a little overwhelming.

JohnStorgion · August 15, 2025, 1:11pm

Thank you for the response Dan! I’ll take a look at other areas and try to figure out other ways I can contribute and learn. Very excited to be on the ground floor of what Rubin is going to discover.

plazas · August 18, 2025, 1:23pm

Dear @JohnStorgion, some of the tutorials have references to technical papers or notes (e.g., https://rtn-095.lsst.io/ and https://pstn-019.lsst.io/ for DP1 and the LSST Science Pipelines). You could also consider joining one of the LSST Science Collaborations, listed on the LSST Discovery Alliance site, to do research with the LSST data. Some of them require LSST Data Rights (see the Rubin Observatory Data Policy) to join, although you don’t necessarily have to join a SC to use LSST data.