Custom "patchy" RefCats for CalibrateImage

dangause · July 28, 2025, 11:42pm

Hi all,

I’m developing an obs package for UCO’s 1m Nickel Telescope, with the immediate goal of single frame processing (with intentions to extend to diff imaging, etc). I’m using a small set of Nickel exposures (~50 science and ~20 bias/flat frames) as an initial test set.

My obs_nickel package is successfully ingesting my test set, and I’ve got a custom ProcessCCD pipeline working up through an ISR task. I initially had it working through an additional CharacterizeImage step (and had started the Calibrate step), but decided to switch to CalibrateImage following @parejkoj’s comment in this forum discussion.

My question has two parts:

1. Is there any reason why I can’t use a custom mini reference catalog composed of small circular areas centered on each exposure’s RA/Dec in my small test set? I’m developing on my personal machine, and don’t really want a 500GB+ refcat sitting on here for this initial round of development. I took 1° conical subsets of the Gaia DR3 catalog corresponding to each exposure’s RA/Dec, resulting in a dataset of ~65k sources.

2. How do I specify a custom refcat for a CalibrateImageTask? It looks like it’s defaulting to ps1_pv3_3pi_20170110, which I don’t have downloaded. I was able to specify my custom refcat for the CalibrateTask by just listing the chained refcat collection as a pipetask input, but that doesn’t seem to be working for CalibrateImageTask, as I’m getting the error:

    raise MissingDatasetTypeError(
lsst.daf.butler._exceptions.MissingDatasetTypeError: "DatasetType 'ps1_pv3_3pi_20170110' referenced by 'calibrateImage' uses 'skypix' as a dimension placeholder, but has not been registered with the data repository.  Note that reference catalog names are now used as the dataset type name instead of 'ref_cat'."
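For context on part 1, the cone selection amounts to keeping every source that lies within 1° of at least one exposure center. The sketch below is a minimal illustration of that idea using only the standard library. It is not the script actually used to build the Gaia DR3 subset, and the function names and sample coordinates are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def select_in_cones(sources, centers, radius_deg=1.0):
    """Keep each source that lies within radius_deg of at least one center.

    sources: iterable of dicts with "ra" and "dec" keys, in degrees
    centers: iterable of (ra, dec) tuples, in degrees
    """
    centers = list(centers)
    return [
        src for src in sources
        if any(angular_separation_deg(src["ra"], src["dec"], ra, dec) <= radius_deg
               for ra, dec in centers)
    ]


# Hypothetical values, for illustration only.
centers = [(150.0, 2.0), (10.0, 41.0)]
sources = [
    {"source_id": 1, "ra": 150.3, "dec": 2.1},  # inside the first cone
    {"source_id": 2, "ra": 152.0, "dec": 2.0},  # about 2 degrees from the first center
    {"source_id": 3, "ra": 10.5, "dec": 41.5},  # inside the second cone
]
kept = select_in_cones(sources, centers)
print([src["source_id"] for src in kept])
```

A source that falls in two overlapping cones is kept only once here, which matters when the merged result feeds a single CSV for conversion.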

I’m using lsst-scipipe-9.0.0, and here are some of my custom files for context:

gaia_dr3_config.py

# gaia_dr3_config.py

# Name of the output reference catalog dataset
config.dataset_config.ref_dataset_name = "gaia_dr3"

# Use the Gaia-specific conversion logic
from lsst.meas.algorithms import convertRefcatManager
config.manager.retarget(convertRefcatManager.ConvertGaiaManager)

# Tune parallelism as needed
config.n_processes = 4

# Gaia DR3 column mappings
config.id_name = "source_id"
config.ra_name = "ra"
config.dec_name = "dec"
config.ra_err_name = "ra_error"
config.dec_err_name = "dec_error"

config.parallax_name = "parallax"
config.parallax_err_name = "parallax_error"
config.coord_err_unit = "milliarcsecond"

config.pm_ra_name = "pmra"
config.pm_ra_err_name = "pmra_error"
config.pm_dec_name = "pmdec"
config.pm_dec_err_name = "pmdec_error"

config.epoch_name = "ref_epoch"
config.epoch_format = "jyear"      # Same as used in Gaia DR2
config.epoch_scale = "tcb"

# List of Gaia DR3 photometric magnitude columns
config.mag_column_list = ["phot_g_mean", "phot_bp_mean", "phot_rp_mean"]

# Optional extra columns to carry along
config.extra_col_names = []

Create RefCat

convertReferenceCatalog \
  data/gaia-refcat/ \
  scripts/gaia_dr3_config.py \
  ./data/gaia_dr3_all_cones/gaia_dr3_all_cones.csv \
  &> convert-gaia.log

butler register-dataset-type "$REPO" gaia_dr3_20250728 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  gaia_dr3_20250728 \
  refcats/gaia_dr3_20250728 \
  data/gaia-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/gaia_dr3_20250728

ProcessCcd.yaml

# pipelines/ProcessCcd.yaml
description: ISR + Image Calibration
tasks:
  isr:
    class: lsst.ip.isr.IsrTask
    config:
      doOverscan: True
      doBias: True
      doFlat: True
      doDark: False
      doDefect: False
      doSuspect: True
      doWrite: True
      doTrimToMatchCalib: True
      doVignette: True

  calibrateImage:
    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask

subsets:
  processCcd:
    - isr
    - calibrateImage

steps:
  - label: processCcd
    sharding_dimensions: visit,detector

Pipeline Run Code

#!/bin/bash

# === Setup ===
export REPO=~/Desktop/lick/lsst/data/nickel/062424
export RAWDIR=~/Desktop/lick/data/062424/raw
export RUN=Nickel/raw/all
export INSTRUMENT=lsst.obs.nickel.Nickel
export TS=$(date +%Y%m%dT%H%M%SZ)

# === Create and Register ===
butler create "$REPO"
butler register-instrument "$REPO" "$INSTRUMENT"
butler ingest-raws "$REPO" "$RAWDIR" --transfer symlink --output-run "$RUN"
butler define-visits "$REPO" Nickel

# === Curated Collection ===
export CURATED=Nickel/run/curated/$TS
butler write-curated-calibrations "$REPO" Nickel "$RUN" --collection "$CURATED"

# === Bias Calibration ===
export CP_RUN_BIAS=Nickel/run/cp_bias/$TS
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN" \
  -o "$CP_RUN_BIAS" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpBias.yaml" \
  -d "instrument='Nickel' AND exposure.observation_type='bias'" \
  --register-dataset-types

# === Certify Bias ===
butler certify-calibrations "$REPO" "$CP_RUN_BIAS" "$CURATED" bias \
  --begin-date 2020-01-01 \
  --end-date 2030-01-01

# === Flat Calibration ===
export CP_RUN_FLAT=Nickel/run/cp_flat/$TS
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS" \
  -o "$CP_RUN_FLAT" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpFlat.yaml" \
  -c cpFlatIsr:doDark=False \
  -d "instrument='Nickel' AND exposure.observation_type='flat'" \
  --register-dataset-types

# === Science Processing: ISR + calibrateImage (with refcats) ===
export PROCESS_CCD_RUN=Nickel/run/processCcd/$TS
export PIPE=./pipelines/ProcessCcd.yaml
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS","$CP_RUN_FLAT","refcats" \
  -o "$PROCESS_CCD_RUN" \
  -p "$PIPE#processCcd" \
  -d "instrument='Nickel' AND exposure.observation_type='science'" \
  --register-dataset-types

Thanks for the help,
Dan

dtaranu · July 29, 2025, 3:02am

Yes, you can use a small reference catalog. ci_hsc and ci_imsim both use small suites of test data from testdata_ci_hsc and testdata_ci_imsim, respectively, and you can do the same for testing.
You’ll need to override the connections for the reference catalogs in CalibrateImageTask. See calibrateImage.py for what those are (this will be in $PIPE_TASKS_DIR/python/lsst/pipe/tasks/calibrateImage.py locally). You’ll need something like:

    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask
    config:
      connections.astrometry_ref_cat: my_dataset_type
      connections.photometry_ref_cat: my_dataset_type
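In this thread, `my_dataset_type` would be the dataset type registered with the butler earlier, `gaia_dr3_20250728`. Because connections are set through the task config, the same two overrides can presumably also be passed to `pipetask run` with `-c`. The sketch below only assembles those arguments as strings; the helper name is hypothetical, and the `-c label:connections.name=value` form is an assumption worth checking against your stack version.

```python
import shlex

REFCAT = "gaia_dr3_20250728"   # dataset type registered earlier in this thread
TASK_LABEL = "calibrateImage"  # task label used in ProcessCcd.yaml


def refcat_overrides(label, refcat):
    """Build pipetask -c arguments pointing both refcat connections at one dataset type."""
    args = []
    for connection in ("astrometry_ref_cat", "photometry_ref_cat"):
        args += ["-c", f"{label}:connections.{connection}={refcat}"]
    return args


# Print the arguments in a form that can be pasted into the pipetask run command.
print(shlex.join(refcat_overrides(TASK_LABEL, REFCAT)))
```

Keeping the overrides in the pipeline YAML, as above, is probably the better default, since the refcat choice then travels with the pipeline definition instead of with each command line.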

dangause · July 29, 2025, 5:33pm

@dtaranu Thanks for your answer – I was missing the connections specification in my config. My pipeline is now working with my piecemeal refcat.