I’ve been successful with the six-step LSST Science Pipelines exercises provided through https://pipelines.lsst.io . All went well; in fact, I went through them twice so I could dig into the Butler repository mechanics. As a DB specialist, I’ve always gotta research the details.
Okay, now I intend to attempt the same exercises through the StackClub, using its Jupyter notebooks from GitHub.
Since I’m just getting started on this StackClub approach, can someone please advise whether this path makes sense? Will I find an organized sequence for running the pipeline here?
Also, in my reading I picked up that there is a StackClub Slack workspace. Anything you can share about this would be appreciated.
Thanks
Yes, as @ktl says, this is the old “Butler” system. The new one is a complete rewrite, whilst retaining many of the core concepts. We now support PostgreSQL and object stores, which “gen2” did not, and the new system allows much more dynamic creation of workflows.
butler.get and butler.put are more or less the same API, but everything else is different.
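For illustration, a minimal sketch of that shared read/write API in the new system might look like the following; the repository path, collection names, and data ID values here are placeholders rather than anything from a real repo:

```python
# Sketch of Gen3 Butler reads and writes. The repo path, collection
# names, and data ID values are placeholders, not from a real repo.
from lsst.daf.butler import Butler

# Reading: point the butler at a repository and the collections to search.
butler = Butler("/path/to/repo", collections="HSC/defaults")
calexp = butler.get("calexp", instrument="HSC", visit=903334, detector=16)

# Writing: outputs go into a "run" collection; constructing the butler
# with run= gives it somewhere to put new datasets.
out = Butler("/path/to/repo", run="u/fred/scratch")
out.put(calexp, "calexp", instrument="HSC", visit=903334, detector=16)
```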
You can search on Community for some hints. See, for example, here.
Okay, Tim, thanks. I’ve cloned the StackClub Jupyter notebooks and went through the Get Started and Basics sections.
Note that I am trying to run these stand-alone on my Unix machine before arranging any access via VPN, etc.
I progressed to the afw_table_guided_tour notebook. This notebook presumes I have a login and access to the /datasets/ and /project/ directories at the NCSA facility, which I do not have at this time; again, I’m trying to run through the Jupyter notebooks stand-alone before taking that step.
So, here is my question.
Tim, do you know if I can download a zip or tarball that contains the test data/directories for use with these notebooks? Although I’ve cleverly substituted elements from Steps 1–6 of https://pipelines.lsst.io , I’d really like to get the actual data referenced by the Jupyter notebooks so I get the proper benefit from my stand-alone exercises.
Fred Klich
Dallas
I’m not involved in Stack Club, so I can’t really answer that. Maybe @MelissaGraham has some pointers.
For the tutorial data in testdata_ci_hsc, we have gen3 code that does the equivalent processing. If you look at ci_hsc_gen3/SConstruct at master · lsst/ci_hsc_gen3 · GitHub, you can see the gen3 commands in that file (butler create, butler register-instrument, etc.) used to set up a repository with the test data. (If you clone ci_hsc_gen3 with the lsst_distrib and testdata_ci_hsc packages set up, you can then run python $(command -v scons) -j 4 to set up a repo and run a pipeline.) Pipeline execution is shown in the bin/pipelines.sh file in that repo.
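As a rough sketch of what that SConstruct automates, the initial repo-setup commands look something like this when run by hand; the repo name and raw-data path are placeholders, and the SConstruct itself is the authoritative sequence:

```bash
# Sketch of the Gen3 repo setup that ci_hsc_gen3 automates. The repo
# name and raw-data location are placeholders; TESTDATA_CI_HSC_DIR
# assumes the testdata_ci_hsc package has been set up with EUPS.
butler create ci_hsc_repo
butler register-instrument ci_hsc_repo lsst.obs.subaru.HyperSuprimeCam
butler ingest-raws ci_hsc_repo "$TESTDATA_CI_HSC_DIR"/raw/*.fits
```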
Fingers crossed Melissa may have additional information.
Okay, Tim, I’m saving this message for downline reference. I figure that after I make my best effort at the StackClub Jupyter pipeline, I will return to exercise the Gen3 code you’re describing. Looking forward to it.
Appreciate your quick response.
Hello to @MelissaGraham
Yes, I plan to attend the repeat session tomorrow at 6pm CST.
I’m also replaying the six recorded StackClubCourse sessions at this link: GitHub - LSSTScienceCollaborations/StackClubCourse: Repository for course offered by the Stack Club.
I’m still holding on to my prior question. In reviewing the above Jupyter notebooks, I see a minimal number of external references to files, folders, Butler repositories, and other artifacts needed just to accommodate a test run-through of the notebooks. My question is whether there is a zip of these supporting files that I can use to run my own local set of these notebooks on my Unix machine. After I finish the above six sessions, I may reach out to Simon Krughoff to see if my question is logical…hoping not to make a pest of myself.
Thanks again, Tim; hope you’re someplace “not so hot”.