Postprocessing pipeline tasks for obs package

I’m wondering what the currently recommended pipeline task sequence is for bridging the gap between calibrateImage and coadd assembly.

I’m developing a standard pipeline for the obs_nickel package (1-m telescope, single CCD), which I currently have running through ISR + calibrateImage, with initial_pvi and initial_stars_detector as outputs. In the current version of the stack, what are the next steps (tasks) toward building coadds?

Most of the documentation I can find on this is based on outdated versions of calibrateImage centered around calexps. I’ve referenced other obs package DRP configs, but haven’t been successful in adapting these to obs_nickel.

Is there a current “standard” pipeline task sequence compatible with the outputs of the current calibrateImage? Also, are initial_pvi and initial_stars_detector temporary dataset types, or more long-term? It seems like they were recently renamed to preliminary_visit_image, preliminary_visit_image_background, and single_visit_star_unstandardized in the PSTN-019 doc.

I’ve included my ProcessCcd.yaml, Postprocessing.yaml, and run script below for some additional context.

ProcessCcd.yaml

# pipelines/ProcessCcd.yaml
description: ISR + Image Calibration
tasks:
  isr:
    class: lsst.ip.isr.IsrTask
    config:
      doOverscan: True
      doBias: True
      doFlat: True
      doDark: False
      doDefect: True
      doFringe: False
      doSuspect: True
      doWrite: True
      doVignette: True
      fluxMag0T1:
        B: 2e8
        V: 2e8
        R: 2e8
        I: 2e8

  calibrateImage:
    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask
    config:
      
      # --- initial PSF used by first-pass detection ---
      install_simple_psf.fwhm: 7.0
      install_simple_psf.width: 35

      # ---- Source Detection Config Parameters ----
      psf_detection.thresholdType: "stdev"
      psf_detection.thresholdValue: 6.5     # default is 5; try 6–7σ to cull faint junk
      psf_detection.minPixels: 7            # require a larger footprint
      psf_detection.includeThresholdMultiplier: 1.0
      psf_detection.reEstimateBackground: true
      psf_detection.doTempLocalBackground: true
      psf_detection.doTempWideBackground: false
      psf_detection.excludeMaskPlanes: ["EDGE","SAT","CR","BAD","NO_DATA","SUSPECT"]
      psf_detection.statsMask: ["BAD","SAT","EDGE","NO_DATA","SUSPECT"]
      psf_detection.isotropicGrow: true
      psf_detection.nSigmaToGrow: 2.0
      psf_detection.combinedGrow: true
          
      # ---- Astrometry Config Parameters ----
      connections.astrometry_ref_cat: gaia_dr3_20250728
      astrometry.forceKnownWcs: False

      # match search radius: 60″ ≈ 300 pix @ 0.2″/pix
      astrometry.matcher.maxOffsetPix: 300
      astrometry.matcher.maxRotationDeg: 5.0
      astrometry.matcher.matcherIterations: 12

      # accept a sloppy initial fit up to 60″ mean, then refine
      astrometry.maxMeanDistanceArcsec: 60.0
      astrometry.matchDistanceSigma: 8.0
      astrometry.maxIter: 12
      
      # --- Photometry Config Parameters ---
      connections.photometry_ref_cat: panstarrs1_dr2_20250730

      photometry_ref_loader.filterMap: {"b": "gMeanPSFMag", "v": "gMeanPSFMag", "r": "rMeanPSFMag", "i": "iMeanPSFMag", "u": "gMeanPSFMag"}
      photometry.match.matchRadius: 1.5

      photometry.applyColorTerms: false

subsets:
  processCcd:
    - isr
    - calibrateImage

steps:
  - label: processCcd
    dimensions: [instrument, exposure, detector]

Postprocessing.yaml

# pipelines/Postprocessing.yaml

description: "Nickel postprocess: consolidate visit summaries + make visit table"

tasks:
  consolidateVisitSummary:
    class: lsst.pipe.tasks.postprocess.ConsolidateVisitSummaryTask
    config:
      # Point the task at your Gen3 calibrated exposure dataset
      connections.calexp: "initial_pvi"
      
  makeVisitTable:
    class: lsst.pipe.tasks.postprocess.MakeVisitTableTask
    config:
      connections.visitSummaries: "visitSummary"

Run script

#!/usr/bin/env bash
# Nickel reduction pipeline v2

# bad exposures - exclude:
# BAD="1032,1033,1034,1043,1046,1047,1048,1049,1050,1051,1052,1056,1058,1059,1060"
BAD="1032,1051,1052"

########## ABSOLUTE PATHS (edit if needed) ##########
REPO="/Users/dangause/Desktop/lick/lsst/data/nickel/062424"
RAWDIR="/Users/dangause/Desktop/lick/data/062424/raw"
OBS_NICKEL="/Users/dangause/Desktop/lick/lsst/lsst_stack/stack/obs_nickel"
REFCAT_REPO="/Users/dangause/Desktop/lick/lsst/lsst_stack/stack/refcats"
OVR="/Users/dangause/Desktop/lick/lsst/data/nickel/062424/tuning_runs/trials/t022/calib_overrides_t022.py"
STACK_DIR="/Users/dangause/Desktop/lick/lsst/lsst_stack"

########## BASIC CONFIG ##########
INSTRUMENT="lsst.obs.nickel.Nickel"
RUN="Nickel/raw/all"
TS="$(date -u +%Y%m%dT%H%M%SZ)"

echo "=== Nickel pipeline starting @ $TS ==="

cd "$STACK_DIR"
source loadLSST.zsh
setup lsst_distrib; setup obs_nickel; setup testdata_nickel

cd "$OBS_NICKEL"

########## CREATE & REGISTER ##########
if [ ! -f "$REPO/butler.yaml" ]; then
  butler create "$REPO"
fi
butler register-instrument "$REPO" "$INSTRUMENT" || true
butler ingest-raws "$REPO" "$RAWDIR" --transfer symlink --output-run "$RUN"
butler define-visits "$REPO" Nickel

########## CURATED ##########
CURATED="Nickel/run/curated/$TS"
butler write-curated-calibrations "$REPO" Nickel --collection "$CURATED"

########## BIAS ##########
CP_RUN_BIAS="Nickel/run/cp_bias/$TS"
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN" \
  -o "$CP_RUN_BIAS" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpBias.yaml" \
  -d "instrument='Nickel' AND exposure.observation_type='bias'" \
  --register-dataset-types

# certify bias broadly
butler certify-calibrations "$REPO" "$CP_RUN_BIAS" "$CURATED" bias \
  --begin-date 2020-01-01 --end-date 2030-01-01

########## FLATS ##########
CP_RUN_FLAT="Nickel/run/cp_flat/$TS"
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS" \
  -o "$CP_RUN_FLAT" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpFlat.yaml" \
  -c cpFlatIsr:doDark=False \
  -c cpFlatIsr:doOverscan=True \
  -d "instrument='Nickel' AND exposure.observation_type='flat'" \
  --register-dataset-types


########## DEFECTS (from flats; updated for Y inversion + unified out dir) ##########
if butler query-datasets "$REPO" flat --collections "$CP_RUN_FLAT" | grep -q '^flat'; then
  DEF_TS="$(date -u +%Y%m%dT%H%M%SZ)"
  DEFECTS_RUN="Nickel/calib/defects/$DEF_TS"
  DEF_DIR="$OBS_NICKEL/scripts/defects/defects_$DEF_TS"

  echo "[defects] Building from flats in $CP_RUN_FLAT -> $DEFECTS_RUN"
  python "$OBS_NICKEL"/scripts/defects/make_defects_from_flats.py \
    --repo "$REPO" \
    --collection "$CP_RUN_FLAT" \
    --invert-manual-y \
    --manual-box 255 0 2 1024 \
    --manual-box 783 0 2 977 \
    --manual-box 1000 0 25 1024 \
    --manual-box 45 120 6 9 \
    --manual-box 980 200 12 8 \
    --register \
    --ingest \
    --defects-run "$DEFECTS_RUN" \
    --plot

  # Only relink 'current' if the run exists.
  if butler query-collections "$REPO" | awk '{print $1}' | grep -qx "$DEFECTS_RUN"; then
    echo "Using defects run: $DEFECTS_RUN"
    butler collection-chain "$REPO" Nickel/calib/defects/current "$DEFECTS_RUN" --mode redefine
  else
    echo "[defects] Expected run $DEFECTS_RUN not found; skipping Nickel/calib/defects/current relink."
  fi
else
  echo "[defects] No 'flat' datasets found in $CP_RUN_FLAT; skipping defects build/ingest."
fi

########## UNIFIED CALIB CHAIN ##########
CALIB_CHAIN="Nickel/calib/current"
butler collection-chain "$REPO" "$CALIB_CHAIN" \
  "$CURATED" "$CP_RUN_BIAS" "$CP_RUN_FLAT" Nickel/calib/defects/current \
  --mode redefine

########## REFCATS (run from refcat repo; original commands) ##########
cd "$REFCAT_REPO"

# Gaia DR3
convertReferenceCatalog \
  data/gaia-refcat/ \
  scripts/gaia_dr3_config.py \
  ./data/gaia_dr3_all_cones/gaia_dr3_all_cones.csv \
  &> convert-gaia.log

butler register-dataset-type "$REPO" gaia_dr3_20250728 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  gaia_dr3_20250728 \
  refcats/gaia_dr3_20250728 \
  data/gaia-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/gaia_dr3_20250728

# PS1 DR2
convertReferenceCatalog \
  data/ps1-refcat/ \
  scripts/ps1_config.py \
  ./data/ps1_all_cones/merged_ps1_cones.csv \
  &> convert-ps1.log

butler register-dataset-type "$REPO" panstarrs1_dr2_20250730 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  panstarrs1_dr2_20250730 \
  refcats/panstarrs1_dr2_20250730 \
  data/ps1-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/panstarrs1_dr2_20250730

########## SCIENCE PROCESSING ##########
cd "$OBS_NICKEL"
PIPE="$OBS_NICKEL/pipelines/ProcessCcd.yaml"
PROCESS_CCD_RUN="Nickel/run/processCcd/$(date +%Y%m%dT%H%M%S)"

# quick sanity
butler query-collections "$REPO" | grep -E 'Nickel/calib/(current|defects/current)' || true


pipetask run \
  -b "$REPO" \
  -i "$RUN","$CALIB_CHAIN","refcats" \
  -o "$PROCESS_CCD_RUN" \
  -p "$PIPE#processCcd" \
  -C calibrateImage:configs/calibrateImage/tuned_configs/best_calib_t071.py \
  -d "instrument='Nickel' AND exposure.observation_type='science' AND NOT (exposure IN (${BAD}))" \
  -j 8 --register-dataset-types \
  2>&1 | tee "logs/processCcd_${TS}.log"
  # -C calibrateImage:configs/apply_colorterms.py \
  # --debug \
  # -d "instrument='Nickel' AND exposure.observation_type='science' AND exposure IN (1042)" \
  # -d "instrument='Nickel' AND exposure.observation_type='science'" \


pipetask run \
  -b "$REPO" \
  -i "$PROCESS_CCD_RUN","$CALIB_CHAIN","refcats" \
  -o Nickel/run/postproc/visits/$TS \
  -p ./pipelines/Postprocessing.yaml \
  --register-dataset-types \
  -d "instrument='Nickel' AND exposure.observation_type='science' AND NOT (exposure IN (${BAD}))" \
  -j 8 \
  2>&1 | tee "logs/postproc_visits_${TS}.log"



# Build discrete skymap config from initial_pvi footprints
SKY_CFG="configs/makeSkyMap_discrete_auto.py"
python scripts/build_discrete_skymap_config.py \
  --repo "$REPO" \
  --collections "$PROCESS_CCD_RUN" \
  --dataset-type initial_pvi \
  --skymap-id nickel_discrete \
  --border-deg 0.05 \
  --out "$SKY_CFG"

# Register it
butler register-skymap "$REPO" -C "$SKY_CFG"

# (Optional) sanity
butler query-datasets "$REPO" skyMap --where "skymap='nickel_discrete'"


echo "=== Done ==="
echo "Curated:     $CURATED"
echo "CP Bias:     $CP_RUN_BIAS"
echo "CP Flat:     $CP_RUN_FLAT"
echo "Defects run: ${DEFECTS_RUN:-<none>}"
echo "Calib chain: $CALIB_CHAIN"
echo "Science run: $PROCESS_CCD_RUN"


COADD_RUN="Nickel/run/coadd/$(date -u +%Y%m%dT%H%M%SZ)"

pipetask run \
  -b "$REPO" \
  -i "$PROCESS_CCD_RUN","$CALIB_CHAIN" \
  -o "$COADD_RUN" \
  -p pipelines/CoaddOnly.yaml#makeWarp,assembleCoadd \
  -c makeWarp:connections.inputExposure="initial_pvi" \
  -d "skymap='nickel_discrete' AND tract=0 AND patch='46,46'" \
  -j 4 --register-dataset-types \
  2>&1 | tee "logs/coadd_${TS}.log"

It might be better if you were using v29.2 for your work, since that matches DP1 and the naming conventions we are currently using. Are you still using v28?

I believe that I am using v29.2, unless I’m incorrectly setting up the most recent weekly release. I’m using lsst-scipipe-10.1.0, which to my understanding corresponds to v29.2.0. Here are some commands to show which versions I’m using:

~/De/l/l/lsst_stack ❯ pwd                                      base 12:25:13
/Users/dangause/Desktop/lick/lsst/lsst_stack
~/De/l/l/lsst_stack ❯ ls                                       base 12:25:18
loadLSST.ash  loadLSST.sh   lsstinstall
loadLSST.bash loadLSST.zsh  stack
~/De/l/l/lsst_stack ❯ source loadLSST.zsh                      base 12:25:19
~/De/l/l/lsst_stack ❯ setup lsst_distrib; setup obs_nickel; setup testdata_nickel
~/Desktop/lick/lsst/lsst_stack ❯ eups list -s lsst_distrib                                                  lsst-scipipe-10.1.0 12:25:30
   gdfb3db0272+933869b8d1 	current w_2025_34 w_latest setup
~/Desktop/lick/lsst/lsst_stack ❯ eups list -t w_latest lsst_distrib                                         lsst-scipipe-10.1.0 12:25:42
   gdfb3db0272+933869b8d1 	current w_2025_34 w_latest setup
~/Desktop/lick/lsst/lsst_stack ❯ conda list rubin-env | grep rubin-env                                      lsst-scipipe-10.1.0 12:25:54
rubin-env                 10.1.0          py312h39879ed_1    conda-forge
rubin-env-nosysroot       10.1.0          py312h911ba04_1    conda-forge

There’s quite a bit between calibrateImage and building coadds in our pipelines, but a lot of that is about polishing up the single-visit source catalogs, feeding analysis tasks for quality control prior to coaddition, and performing multi-visit astrometric and photometric fitting; I could imagine you not wanting all of that.

I’d recommend starting with $DRP_PIPE_DIR/pipelines/LSSTCam/DRP.yaml as a template: follow its example in importing the $DRP_PIPE_DIR/pipelines/_ingredients/base-v2.yaml pipeline, then use exclude directives to drop what you don’t want. You’ll then probably have to reconfigure some connections to reconnect tasks that were separated by an exclusion; e.g., if you drop the multi-visit astrometry, you’ll need to tell updateVisitSummary to use the preliminary astrometry.
Almost every other concrete DRP pipeline in that package now has some historical cruft (which we probably won’t have a chance to remove/fix anytime soon), and you’ll see that those “V2” pipelines do have essentially everything renamed from the pre-DP1 pipelines.
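
Untested sketch of what that might look like (the excluded labels and the updateVisitSummary settings below are illustrative; check base-v2.yaml for the actual task labels and config fields):

# pipelines/NickelDRP.yaml (sketch; labels and fields are illustrative)
description: Nickel DRP, trimmed down from the base-v2 ingredient
instrument: lsst.obs.nickel.Nickel
imports:
  - location: $DRP_PIPE_DIR/pipelines/_ingredients/base-v2.yaml
    exclude:
      # e.g. drop the multi-visit astrometric/photometric fitting
      - gbdesAstrometricFit
      - fgcmBuildFromIsolatedStars
tasks:
  updateVisitSummary:
    class: lsst.drp.tasks.update_visit_summary.UpdateVisitSummaryTask
    config:
      # with the multi-visit fits excluded, fall back to the
      # single-visit (preliminary) calibrations
      wcs_provider: "input_summary"
      photo_calib_provider: "input_summary"

You can then run it with pipetask run the same way as your ProcessCcd.yaml.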

While you’re working, I strongly recommend using:

pipetask build -p <pipeline-yaml> --show pipeline-graph --select-tasks <expression>

to see what the pipeline looks like, where the expression can be something like '<=deep_coadd' to select just the tasks needed to build the deep_coadd data product. You can find more information on that expression language at Working with Pipeline Graphs — LSST Science Pipelines.
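
For example, to preview just the subgraph feeding the deep coadds in that LSSTCam pipeline:

pipetask build \
  -p "$DRP_PIPE_DIR/pipelines/LSSTCam/DRP.yaml" \
  --show pipeline-graph \
  --select-tasks '<=deep_coadd'

This only builds and prints the graph (nothing is run), so it’s cheap to iterate on your exclusions.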


Hi Jim, thanks for the guidance here – these pipeline YAMLs are just what I was looking for. You’re correct in assuming that I’ll likely be dropping a fair number of the analysis and multi-visit astrometry/photometry tasks, at least for this first pass. The base-v2.yaml pipeline sequences will get me well on my way.

And I’ll start implementing pipeline graphs in my workflow. Thanks for the suggestion.