Custom "patchy" RefCats for CalibrateImage

Hi all,

I’m developing an obs package for UCO’s 1m Nickel Telescope, with the immediate goal of single-frame processing (with intentions to extend to difference imaging, etc.). I’m using a small set of Nickel exposures (~50 science and ~20 bias/flat frames) as an initial test set.

My obs_nickel package is successfully ingesting my test set, and I’ve got a custom ProcessCCD pipeline working up through an ISR task. I initially had it working through an additional CharacterizeImage step (and had started the Calibrate step), but decided to switch to CalibrateImage following @parejkoj’s comment in this forum discussion.

My question has two parts:

1. Is there any reason why I can’t use a custom mini reference catalog made up of small circular areas centered on each exposure’s RA/Dec in my small test set? I’m developing on my personal machine, and don’t really want a 500GB+ refcat sitting on here for this initial round of development. I took 1° conical subsets of the Gaia DR3 catalog corresponding to each exposure’s RA/Dec, resulting in a dataset of ~65k sources.

2. How do I specify a custom refcat for a CalibrateImageTask? It looks like it’s defaulting to ps1_pv3_3pi_20170110, which I don’t have downloaded. I was able to specify my custom refcat for the CalibrateTask by just listing the chained refcat collection as a pipetask input, but that doesn’t seem to be working for CalibrateImageTask, as I’m getting the error:

    raise MissingDatasetTypeError(
lsst.daf.butler._exceptions.MissingDatasetTypeError: "DatasetType 'ps1_pv3_3pi_20170110' referenced by 'calibrateImage' uses 'skypix' as a dimension placeholder, but has not been registered with the data repository.  Note that reference catalog names are now used as the dataset type name instead of 'ref_cat'."
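
For concreteness, a conical subset as in point 1 can be sketched as follows. This is an illustration rather than the script used to build the catalog: the function names, the sample rows, and the pointings are all hypothetical, and it assumes the Gaia rows are already available as dictionaries with `ra` and `dec` in degrees.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    # Clamp guards against rounding slightly above 1 for antipodal points.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def select_cones(sources, pointings, radius_deg=1.0):
    """Keep each source lying within radius_deg of at least one pointing."""
    return [
        src
        for src in sources
        if any(
            angular_separation_deg(src["ra"], src["dec"], ra0, dec0) <= radius_deg
            for ra0, dec0 in pointings
        )
    ]


# Hypothetical catalog rows and exposure centers, for illustration only.
sources = [
    {"source_id": 1, "ra": 150.2, "dec": 2.1},
    {"source_id": 2, "ra": 153.0, "dec": 2.0},
    {"source_id": 3, "ra": 359.9, "dec": -0.2},
]
pointings = [(150.0, 2.0), (0.1, 0.0)]

subset = select_cones(sources, pointings)
print([src["source_id"] for src in subset])
```

Because each source is tested against all pointings and kept at most once, overlapping cones do not produce duplicate rows in the merged catalog.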

I’m using lsst-scipipe-9.0.0, and here are some of my custom files for context:

gaia_dr3_config.py

# gaia_dr3_config.py

# Name of the output reference catalog dataset
config.dataset_config.ref_dataset_name = "gaia_dr3"

# Use the Gaia-specific conversion logic
from lsst.meas.algorithms import convertRefcatManager
config.manager.retarget(convertRefcatManager.ConvertGaiaManager)

# Tune parallelism as needed
config.n_processes = 4

# Gaia DR3 column mappings
config.id_name = "source_id"
config.ra_name = "ra"
config.dec_name = "dec"
config.ra_err_name = "ra_error"
config.dec_err_name = "dec_error"

config.parallax_name = "parallax"
config.parallax_err_name = "parallax_error"
config.coord_err_unit = "milliarcsecond"

config.pm_ra_name = "pmra"
config.pm_ra_err_name = "pmra_error"
config.pm_dec_name = "pmdec"
config.pm_dec_err_name = "pmdec_error"

config.epoch_name = "ref_epoch"
config.epoch_format = "jyear"      # Same as used in Gaia DR2
config.epoch_scale = "tcb"

# List of Gaia DR3 photometric magnitude columns
config.mag_column_list = ["phot_g_mean", "phot_bp_mean", "phot_rp_mean"]

# Optional extra columns to carry along
config.extra_col_names = []

Create RefCat

convertReferenceCatalog \
  data/gaia-refcat/ \
  scripts/gaia_dr3_config.py \
  ./data/gaia_dr3_all_cones/gaia_dr3_all_cones.csv \
  &> convert-gaia.log

butler register-dataset-type "$REPO" gaia_dr3_20250728 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  gaia_dr3_20250728 \
  refcats/gaia_dr3_20250728 \
  data/gaia-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/gaia_dr3_20250728

ProcessCcd.yaml

# pipelines/ProcessCcd.yaml
description: ISR + Image Calibration
tasks:
  isr:
    class: lsst.ip.isr.IsrTask
    config:
      doOverscan: True
      doBias: True
      doFlat: True
      doDark: False
      doDefect: False
      doSuspect: True
      doWrite: True
      doTrimToMatchCalib: True
      doVignette: True

  calibrateImage:
    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask

subsets:
  processCcd:
    - isr
    - calibrateImage

steps:
  - label: processCcd
    sharding_dimensions: visit,detector

Pipeline Run Code

#!/bin/bash

# === Setup ===
export REPO=~/Desktop/lick/lsst/data/nickel/062424
export RAWDIR=~/Desktop/lick/data/062424/raw
export RUN=Nickel/raw/all
export INSTRUMENT=lsst.obs.nickel.Nickel
export TS=$(date +%Y%m%dT%H%M%SZ)

# === Create and Register ===
butler create "$REPO"
butler register-instrument "$REPO" "$INSTRUMENT"
butler ingest-raws "$REPO" "$RAWDIR" --transfer symlink --output-run "$RUN"
butler define-visits "$REPO" Nickel

# === Curated Collection ===
export CURATED=Nickel/run/curated/$TS
butler write-curated-calibrations "$REPO" Nickel "$RUN" --collection "$CURATED"

# === Bias Calibration ===
export CP_RUN_BIAS=Nickel/run/cp_bias/$TS
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN" \
  -o "$CP_RUN_BIAS" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpBias.yaml" \
  -d "instrument='Nickel' AND exposure.observation_type='bias'" \
  --register-dataset-types

# === Certify Bias ===
butler certify-calibrations "$REPO" "$CP_RUN_BIAS" "$CURATED" bias \
  --begin-date 2020-01-01 \
  --end-date 2030-01-01

# === Flat Calibration ===
export CP_RUN_FLAT=Nickel/run/cp_flat/$TS
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS" \
  -o "$CP_RUN_FLAT" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpFlat.yaml" \
  -c cpFlatIsr:doDark=False \
  -d "instrument='Nickel' AND exposure.observation_type='flat'" \
  --register-dataset-types

# === Science Processing: ISR + calibrateImage (with refcats) ===
export PROCESS_CCD_RUN=Nickel/run/processCcd/$TS
export PIPE=./pipelines/ProcessCcd.yaml
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS","$CP_RUN_FLAT","refcats" \
  -o "$PROCESS_CCD_RUN" \
  -p "$PIPE#processCcd" \
  -d "instrument='Nickel' AND exposure.observation_type='science'" \
  --register-dataset-types

Thanks for the help,
Dan

  1. Yes, you can use a small reference catalog. ci_hsc and ci_imsim both use small suites of test data from testdata_ci_hsc and testdata_ci_imsim, respectively, and you can do the same for testing.

  2. You’ll need to override the connections for the reference catalogs in CalibrateImageTask. See calibrateImage.py for what those are (this will be in $PIPE_TASKS_DIR/python/lsst/pipe/tasks/calibrateImage.py locally). You’ll need something like:

    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask
    config:
      connections.astrometry_ref_cat: my_dataset_type
      connections.photometry_ref_cat: my_dataset_type
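
The same overrides can also be kept in a Python config file, in the same style as gaia_dr3_config.py above. The following is a minimal sketch, assuming the dataset type registered earlier in this thread (gaia_dr3_20250728); the file name is hypothetical. As with any config override file, `config` is supplied by pipetask when the file is loaded, so this is not a standalone script.

```python
# config/calibrateImage_refcat.py (hypothetical file name)
# Loaded by pipetask, which provides `config`; not runnable on its own.

# Point both reference-catalog connections at the registered dataset type.
config.connections.astrometry_ref_cat = "gaia_dr3_20250728"
config.connections.photometry_ref_cat = "gaia_dr3_20250728"
```

It would then be applied by adding something like `-C calibrateImage:config/calibrateImage_refcat.py` to the `pipetask run` command.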

@dtaranu Thanks for your answer – I was missing the connections specification in my config; now I’ve got my pipeline working with my piecemeal refcat.