Recreating the LSST Science pipeline tutorial (gen 2) only using Generation 3 command line tasks and the pipetasks

This command makes a skymap based on the calibrated exposures it finds in the collection. You have to specify a collection that contains calexp datasets. The tutorial explains some of what a discrete skymap is:

https://pipelines.lsst.io/getting-started/coaddition.html?highlight=discrete#making-a-sky-map

(currently gen2 but the concepts are the same). You can process some raws into calexps, and once you have them (maybe you already do) you can generate a sky map based on them.

You can also define a full skymap, though; we use different ones for HSC and DC2 data.

Geeze, okay, perhaps I need to read more closely, because the html file does state:
Before you can run the coaddition pipe task you have to run the make-discrete-skymap command line task
Emphasis on “…command line task.…”!!!
Now, I’ll have to decide if I ALSO need to run the butler command that follows in the .html file…

“Command line task” is commonly thought of as a gen2 concept and in the link I sent you for skymaps it is the gen2 version (will change once the new tutorial exists). I sent you there to learn about discrete sky maps, not to run any of the commands on that page.

If you are talking about the HTML tutorial from @joshuakitenge then I see that it does use “command line task” when it should be saying “subcommand”. We refer to the butler and pipetask commands as having subcommands that do specific things (much like git log where log is the subcommand for git). So “command line task” is confusing there for people who are coming from gen2 where it has a very concrete meaning.
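To make the "subcommand" terminology concrete, here is a minimal toy sketch using Python's standard argparse module. The program name `demo` and its `list` and `make` subcommands are hypothetical and only illustrate the command/subcommand shape (as in `git log`); this is not how butler or pipetask are actually implemented.

```python
import argparse


def build_parser():
    """Build a toy CLI whose top-level command has subcommands, like `git log`."""
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="subcommand", required=True)

    # Hypothetical subcommand: `demo list REPO`
    list_cmd = subparsers.add_parser("list", help="list things in a repository")
    list_cmd.add_argument("repo")

    # Hypothetical subcommand: `demo make REPO --collections NAME`
    make_cmd = subparsers.add_parser("make", help="make something from a collection")
    make_cmd.add_argument("repo")
    make_cmd.add_argument("--collections", required=True)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["make", "./GEN3_run", "--collections", "processCcdOutputs"]
)
print(args.subcommand, args.repo, args.collections)
```

The top-level program only dispatches; each subcommand owns its own arguments, which is the sense in which butler and pipetask "have subcommands".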

Okay Tim. Thanks for the comments. I was suspicious that the command line option did not apply.
So, I’m still digging around to understand my situation. All steps have run flawlessly up to this point.
After the pipetask processCcd, I can see it generated a batch of calibrated exposures:
/Users/fredklich/Downloads/lsst_stack/GEN3_run/processCcdOutputs/calexp/
./20130617/r/HSC-R/903346/calexp_HSC_r_HSC-R_903346_1_54_processCcdOutputs.fits
./20131102/i/HSC-I/903988/calexp_HSC_i_HSC-I_903988_0_30_processCcdOutputs.fits and 31 more.
Repeating, it does not show up in the query-collections list.
However, I do see it amongst the list of DatasetTypes.

import lsst.daf.butler as dafButler
butler = dafButler.Butler("/Users/fredklich/Downloads/lsst_stack/GEN3_run", collections="calexp")
registry = butler.registry
for x in registry.queryDatasetTypes():
    print(x)
+++++++++++++++++++++++++++++++++++++++++++++++++
DatasetType('ps1_pv3_3pi_20170110', {htm7}, SimpleCatalog)
DatasetType('raw', {band, instrument, detector, physical_filter, exposure}, Exposure)
DatasetType('camera', {instrument}, Camera, isCalibration=True)
DatasetType('defects', {instrument, detector}, Defects, isCalibration=True)
DatasetType('bfKernel', {instrument}, NumpyArray, isCalibration=True)
DatasetType('transmission_optics', {instrument}, TransmissionCurve, isCalibration=True)
DatasetType('transmission_sensor', {instrument, detector}, TransmissionCurve, isCalibration=True)
DatasetType('transmission_filter', {band, instrument, physical_filter}, TransmissionCurve, isCalibration=True)
DatasetType('transmission_atmosphere', {instrument}, TransmissionCurve, isCalibration=True)
DatasetType('icSrc_schema', {}, SourceCatalog)
DatasetType('packages', {}, Packages)
DatasetType('isr_config', {}, Config)
DatasetType('characterizeImage_config', {}, Config)
DatasetType('src_schema', {}, SourceCatalog)
DatasetType('calibrate_config', {}, Config)
DatasetType('postISRCCD', {band, instrument, detector, physical_filter, exposure}, Exposure)
DatasetType('icExp', {band, instrument, detector, physical_filter, visit_system, visit}, ExposureF)
DatasetType('icSrc', {band, instrument, detector, physical_filter, visit_system, visit}, SourceCatalog)
DatasetType('icExpBackground', {band, instrument, detector, physical_filter, visit_system, visit}, Background)
DatasetType('isr_metadata', {band, instrument, detector, physical_filter, exposure}, PropertySet)
DatasetType('isr_log', {band, instrument, detector, physical_filter, exposure}, ButlerLogRecords)
DatasetType('characterizeImage_metadata', {band, instrument, detector, physical_filter, visit_system, visit}, PropertySet)
DatasetType('characterizeImage_log', {band, instrument, detector, physical_filter, visit_system, visit}, ButlerLogRecords)
DatasetType('srcMatch', {band, instrument, detector, physical_filter, visit_system, visit}, Catalog)
DatasetType('srcMatchFull', {band, instrument, detector, physical_filter, visit_system, visit}, Catalog)
DatasetType('src', {band, instrument, detector, physical_filter, visit_system, visit}, SourceCatalog)
DatasetType('calexpBackground', {band, instrument, detector, physical_filter, visit_system, visit}, Background)
DatasetType('calexp', {band, instrument, detector, physical_filter, visit_system, visit}, ExposureF)
DatasetType('calibrate_metadata', {band, instrument, detector, physical_filter, visit_system, visit}, PropertySet)
DatasetType('calibrate_log', {band, instrument, detector, physical_filter, visit_system, visit}, ButlerLogRecords)
+++++++++++++++++++++++++++++++++++++++++++
So somehow I have to make these calibrated exposures visible as a “collection”.
thanks to you or anyone else who might offer a comment.

When you ran the pipetask command you must have had to specify an output collection for the products from that command. That collection is the collection you should use to find the calexp datasets. From the path you are reporting it is possible that you called the output run processCcdOutputs – use butler query-collections to list the collections.

It is unlikely that you called the collection calexp although not impossible. calexp is the dataset type as you later demonstrate. The dataset type describes the dimensions and storage class that are relevant for the dataset and gives it a unique consistent label that pipelines can use to locate products.
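To illustrate the difference between a dataset type (what the data is) and a run collection (where it was written), here is a small sketch that asks a registry which runs hold datasets of a given type. The helper `runs_holding` and the `FakeRegistry`/`FakeRef` stand-ins are hypothetical, included only so the snippet runs without the LSST stack installed. With a real repository you would pass `butler.registry` instead; I am assuming, as I understand the API in the weeklies discussed here, that `queryDatasets` accepts `collections=...` to search all collections and that each returned reference exposes a `.run` attribute.

```python
from collections import namedtuple

# Hypothetical stand-in for a dataset reference (assumed to mirror `.run`).
FakeRef = namedtuple("FakeRef", ["datasetType", "run"])


class FakeRegistry:
    """Hypothetical stand-in so the sketch runs without the LSST stack."""

    def __init__(self, refs):
        self._refs = refs

    def queryDatasets(self, datasetType, collections=None):
        # Ignores `collections`; the fake always searches everything it holds.
        return [r for r in self._refs if r.datasetType == datasetType]


def runs_holding(registry, dataset_type):
    """Return the sorted names of the runs that contain `dataset_type`."""
    # `collections=...` (Ellipsis) is assumed to mean "search every collection".
    refs = registry.queryDatasets(dataset_type, collections=...)
    return sorted({ref.run for ref in refs})


registry = FakeRegistry([
    FakeRef("calexp", "processCcdOutputs"),
    FakeRef("calexp", "processCcdOutputs"),
    FakeRef("raw", "someRawRun"),  # illustrative run name, not from the thread
])
print(runs_holding(registry, "calexp"))
```

The name that comes back is the collection to hand to later commands; `calexp` itself is only the label of the dataset type.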

You will be happy to learn that the official gen3 tutorial came out this morning:

https://pipelines.lsst.io/v/weekly/getting-started/index.html

Okay, Tim, I’ll work a little more on this thread to see what I can learn. Interesting, as I showed my query-collections output earlier I noticed no obvious calexp collections.
My output was specified as --output-run processCcdOutputs, which is where I see a pile of calexp .fits files.
Hold the phone, though. As I was following Joshua’s pipetask command, it did include this set of -c options:
-c isr:doBias=False -c isr:doBrighterFatter=False -c isr:doDark=False -c isr:doFlat=False -c isr:doDefect=False
Blame me for not seeing this earlier. In essence this is explicitly bypassing the Flat, Darks, Bias and Defect, so no wonder we don’t have a calexp? Or maybe I don’t have the proper grasp here…
Okay, Tim, I’ll go do a pre-review of the gen3 tutorial, expecting it will execute nicely.
thanks
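For anyone reading those -c options: each one has the shape label:field=value, where the label names a task in the pipeline (here isr) and the field is one of that task's config settings. The sketch below only splits the strings from the command above into their parts. The helper `parse_override` is hypothetical, not part of pipetask.

```python
def parse_override(text):
    """Split a 'label:field=value' override into (label, field, value)."""
    target, value = text.split("=", 1)
    label, field = target.split(":", 1)
    return label, field, value


# The override strings quoted in the post above.
overrides = [
    "isr:doBias=False",
    "isr:doBrighterFatter=False",
    "isr:doDark=False",
    "isr:doFlat=False",
    "isr:doDefect=False",
]

for item in overrides:
    label, field, value = parse_override(item)
    print(f"task={label} field={field} value={value}")
```

All five overrides target the isr task, so they change how instrument signature removal runs rather than naming any output collection.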

That is the output run where the calexp datasets will be. In previous posts you were seemingly using collections="calexp" which is not going to show anything. Use butler query-datasets REPO calexp to list all the calexp files.

okay, great, below, I do see calibrated exposures genned by processCcd steps…still unclear why these are not reflected as a “collection” .
Is there a step to define these as a collection with a proper collection “name”? In other words, I’m trying to get the collection name correct so my make-discrete-skymap butler command will work…(my original issue).
Thanks for all your patience…I’m learning a lot.
Hope you have a great holiday weekend.
Fred klich

butler query-datasets ./GEN3_run calexp

type run id band instrument detector physical_filter visit_system visit


calexp processCcdOutputs d05e9a56-829d-4c8d-9617-b99928d90357 r HSC 16 HSC-R 0 903334
calexp processCcdOutputs c86d17ed-ea15-412b-bc8c-5c1cc9ec8d58 r HSC 22 HSC-R 0 903334
calexp processCcdOutputs 54bac4e1-62b7-43be-a465-60eba80b6b81 r HSC 23 HSC-R 0 903334
calexp processCcdOutputs 11bbe2b1-9383-48d6-8f7d-4a5b8f03c4c2 r HSC 100 HSC-R 0 903334
calexp processCcdOutputs bf5184c6-58d9-46d5-a26f-4b3a0af1b603 r HSC 17 HSC-R 0 903336
calexp processCcdOutputs 9ac23684-8dcc-4000-8d04-4f21bca5b4ed r HSC 24 HSC-R 0 903336
calexp processCcdOutputs e535a3fc-3816-44c5-8586-87d4e6ba883e r HSC 18 HSC-R 0 903338
calexp processCcdOutputs f275175b-58c5-495e-b0b7-6d65609fe67e r HSC 25 HSC-R 0 903338
calexp processCcdOutputs cc914879-8bc3-4b30-a2d1-e0627d7c8a44 r HSC 4 HSC-R 0 903342
calexp processCcdOutputs 9975b323-cd91-4ae2-bc33-f014fb7f859d r HSC 10 HSC-R 0 903342
calexp processCcdOutputs b2f2d970-7d2a-4d4d-a4e1-b9ed68fc468c r HSC 100 HSC-R 0 903342
calexp processCcdOutputs f55e73a5-bc8e-4e0e-a1b0-3ddc2f1637d3 r HSC 0 HSC-R 0 903344
calexp processCcdOutputs 009406d2-0b4c-455c-a3bf-980188ae4ff3 r HSC 5 HSC-R 0 903344
calexp processCcdOutputs 475aa5a3-9999-4166-b616-220d62dc002d r HSC 11 HSC-R 0 903344
calexp processCcdOutputs e334886d-771e-4f31-9f81-85a414644e6f r HSC 1 HSC-R 0 903346
calexp processCcdOutputs 063c9875-f352-40d3-8b53-ac1b4a951d1e r HSC 6 HSC-R 0 903346
calexp processCcdOutputs 6db19abe-3e24-4d97-b078-7fec8f287171 r HSC 12 HSC-R 0 903346
calexp processCcdOutputs 8fb94e4a-5a95-402e-8e7a-367c63d99a19 i HSC 16 HSC-I 0 903986
calexp processCcdOutputs b010c470-0c25-4699-8e40-ac38f4134876 i HSC 22 HSC-I 0 903986
calexp processCcdOutputs 13df49ac-b069-4a2c-b811-bbedb228491f i HSC 23 HSC-I 0 903986
calexp processCcdOutputs 6fb2e13b-f875-4043-8ec3-cd837d0f18b8 i HSC 100 HSC-I 0 903986
calexp processCcdOutputs 4d81f244-6ee9-4291-9182-bc1da1c631ab i HSC 16 HSC-I 0 903988
calexp processCcdOutputs e16c1f83-4fff-4a09-ac4d-26e9ee9ce1d8 i HSC 17 HSC-I 0 903988
calexp processCcdOutputs 0ed9f825-ae19-4b66-9985-edfb48fc1d5c i HSC 23 HSC-I 0 903988
calexp processCcdOutputs 8daf6c38-53fd-4e6c-805f-69aa63f9101e i HSC 24 HSC-I 0 903988
calexp processCcdOutputs 386414e8-71f3-4e1d-a1fe-4741f90da417 i HSC 18 HSC-I 0 903990
calexp processCcdOutputs 127e39fa-96f7-4b4e-9641-95462010c8e6 i HSC 25 HSC-I 0 903990
calexp processCcdOutputs cb8d8db7-9f75-42e2-822b-daef9ae4634b i HSC 4 HSC-I 0 904010
calexp processCcdOutputs 7f8c8996-0ff1-4439-aa69-5801b198c202 i HSC 10 HSC-I 0 904010
calexp processCcdOutputs a41c4020-88fa-4394-a115-e7c6fce324e4 i HSC 100 HSC-I 0 904010
calexp processCcdOutputs 3091eaa0-91ea-49be-8377-23abd18e220e i HSC 1 HSC-I 0 904014
calexp processCcdOutputs 3c3018fd-15d7-41d6-ae51-25b6e15b6261 i HSC 6 HSC-I 0 904014
calexp processCcdOutputs 30b71142-fc90-4a95-a932-39492f8a61e9 i HSC 12 HSC-I 0 904014

The calexp datasets were written into an output run of your choosing that was called processCcdOutputs. When you run make-discrete-skymap you would specify that collection.

$ butler make-discrete-skymap ./GEN3_run HSC --collections processCcdOutputs

or something like that.

Hi Tim, thanks a lot for sharing the gen3 tutorial.

I wanted to ask whether the reference catalog is ingested by default when you construct the butler in the rc2_subset?
Thanks for your patience as well.

The butler repository used in the tutorial is already prefilled with all necessary datasets.

okay, I’ll give that a test. thanks, and kudos on the advent of the official Gen3 tutorial.

appears to have worked, although it was not very apparent to me in past days. I’ll continue on a few more steps then circle back to use the latest/greatest Gen3 tutorial.
thanks; my console below…
$ butler make-discrete-skymap ./GEN3_run HSC --collections processCcdOutputs
makeDiscreteSkyMap INFO: Extracting bounding boxes of 33 images
makeDiscreteSkyMap INFO: Computing spherical convex hull
makeDiscreteSkyMap INFO: tract 0 has corners (321.154, -0.596), (320.594, -0.596), (320.594, -0.036), (321.154, -0.036) (RA, Dec deg) and 3 x 3 patches
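As a quick sanity check on that log line, the tract's coordinate extent can be worked out from the four corners it reports. This is plain arithmetic on the printed values; the per-patch figure is only a rough division by the 3 x 3 grid and ignores any patch borders or overlap.

```python
# Corners (RA, Dec) in degrees, copied from the makeDiscreteSkyMap log line.
corners = [
    (321.154, -0.596),
    (320.594, -0.596),
    (320.594, -0.036),
    (321.154, -0.036),
]

ras = [ra for ra, _ in corners]
decs = [dec for _, dec in corners]

# Coordinate extents; near Dec = 0 the RA extent is close to the on-sky width.
ra_extent = max(ras) - min(ras)
dec_extent = max(decs) - min(decs)

patches_per_side = 3  # from "3 x 3 patches" in the log
print(f"RA extent:  {ra_extent:.3f} deg")
print(f"Dec extent: {dec_extent:.3f} deg")
print(f"approx. patch size: {dec_extent / patches_per_side:.3f} deg")
```

So the discrete skymap built from the 33 images is a single tract a bit over half a degree on a side.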

My coadd pipetask apparently errored with the message below. I think I understand the message…
pipetask run -b GEN3_run/ --input processCcdOutputs --input skymaps --register-dataset-types -p "${PIPE_TASKS_DIR}/pipelines/DRP.yaml"#coaddition --instrument lsst.obs.subaru.HyperSuprimeCam --output-run coadd -c makeWarp:doApplySkyCorr=False -c makeWarp:doApplyExternalSkyWcs=False -c makeWarp:doApplyExternalPhotoCalib=False -c assembleCoadd:doMaskBrightObjects=False
Resulted in
RuntimeError: Error finding datasets of type visitSummary in collections [processCcdOutputs, skymaps]; it is impossible for any such datasets to be found in any of those collections, most likely because the dataset type is not registered. This error may become a successful query that returns no results in the future, because queries with no results are not usually considered an error.
If the options are not clear, pls don’t expend too much effort. I will transition to a restart of Gen3 with the latest tutorial steps announced by Tim…
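The error says the dataset type is most likely not registered, which can be checked against the same `queryDatasetTypes()` listing shown earlier in the thread. Below is a minimal sketch of that check. The helper `missing_dataset_types` and the `FakeTypeRegistry` stand-in are hypothetical, there only so the snippet runs without the stack; with a real repository you would pass `butler.registry`, assuming each returned dataset type exposes a `.name` attribute.

```python
from collections import namedtuple

# Hypothetical stand-in for a DatasetType (assumed to expose `.name`).
FakeDatasetType = namedtuple("FakeDatasetType", ["name"])


class FakeTypeRegistry:
    """Hypothetical stand-in so the sketch runs without the LSST stack."""

    def __init__(self, names):
        self._types = [FakeDatasetType(n) for n in names]

    def queryDatasetTypes(self):
        return list(self._types)


def missing_dataset_types(registry, required):
    """Return the names in `required` that the registry does not know about."""
    registered = {dt.name for dt in registry.queryDatasetTypes()}
    return [name for name in required if name not in registered]


# A few of the dataset types from the listing earlier in the thread.
registry = FakeTypeRegistry(["raw", "calexp", "src", "calexpBackground"])
print(missing_dataset_types(registry, ["calexp", "visitSummary"]))
```

In the listing posted earlier there is no visitSummary entry, which is consistent with what the RuntimeError reports.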

@parejkoj has given some guidance for this on a different post.

Which version of the software are you using? It looks like visitSummary dataset types are expected to be created as part of single frame processing but that’s really out of my area of expertise.

My latest run-through used v34:
eups distrib install -t w_2021_34 lsst_distrib

I’m repeating gen3 with the latest release per Tim’s recent announcement:
https://pipelines.lsst.io/v/weekly/getting-started/index.html
These tutorial steps reference v22…pls confirm that v22 is what we use. I would have thought gen3 would have a higher version number… So, I guess v22 now includes one of the later/greater weekly updates (i.e. 34 or 35?).
Please confirm.
thanks

…errrr maybe v22 is just referring to the level of the lsst_distrib EUPS package, not the core gen3 content.
thanks

Apologies! I am the one who did the updates to the tutorials and there is a reference to v22 that I missed. An updated version will fix this shortly. The tutorials were developed and tested using the w_2021_33 version of the science pipelines. I’m sorry for the confusion. I’ll post here when the corrected pages are up.

and should be compatible with w34 or w35 if that’s what you have lying around.