Overscan region trimming for obs package

dangause · August 20, 2025, 6:35pm

I’m in the process of debugging my obs package and custom processCcd pipeline for the Nickel telescope, and I’ve been running into a persistent issue where the overscan region isn’t trimmed in downstream datasets.

Currently I have my obs package running with a basic single frame pipeline consisting of ISR + CalibrateImage steps. I’m working on some finer tuning aspects (defect mask, color terms, etc), but it’s successfully detecting sources and running astrometry / photometry.

It is my understanding that post-ISR datasets should have the overscan region fully trimmed out of the image. Yet despite specifying the overscan region in my camera YAML, I’m still seeing the overscan region in postISRCCD and initial_pvi dataset exposures (in addition to flat, bias, etc). The overscan region appears to be masked as BAD pixels, but it’s definitely still there.

Is this expected behavior? Or am I dealing with overscan incorrectly?

For context, the Nickel CCD has an overscan region in the rightmost 32 columns. I’m also masking ~30 data columns at the rightmost edge of the frame (to the left of the overscan region) to deal with what I’ve determined to be rolloff.

I’m including some visualizations and obs_nickel code snippets below that may be helpful.

camera/nickel.yaml

name: Nickel

# --- Amplifier Template ---
AMP: &AMP
  perAmpData: true
  dataExtent: [1056, 1024]            # full raw frame
  readCorner: LL
  rawBBox: [[0, 0], [1056, 1024]]
  rawDataBBox: [[0, 0], [1025, 1024]]       # Everything left of overscan is signal
  rawSerialPrescanBBox: [[0, 0], [0, 0]]     # No prescan
  rawSerialOverscanBBox: [[1025, 0], [31, 1024]]  # Right-side overscan
  rawParallelPrescanBBox: [[0, 0], [0, 0]]    # No parallel prescan
  rawParallelOverscanBBox: [[0, 0], [0, 0]]   # No parallel overscan
  gain: 1.8
  readNoise: 10.7
  saturation: 65535
  linearityType: PROPORTIONAL
  linearityThreshold: 0
  linearityMax: 65535
  linearityCoeffs: [0.0, 1.0]

# --- CCD Template ---
CCD: &CCD
  detectorType: 0                      # SCIENCE
  refpos: [528.0, 512.0]               # center of 1056 x 1024
  offset: [0.0, 0.0]
  bbox: [[0, 0], [1056, 1024]]
  pixelSize: [0.015, 0.015]            # 15 micron pixels
  transformDict:
    nativeSys: 'Pixels'
    transforms: {}
  transposeDetector: false
  pitch: 0.0
  yaw: 0.0
  roll: 0.0
  amplifiers:
    A00:
      <<: *AMP
      hdu: 0
      ixy: [0, 0]
      flipXY: [false, false]

# --- Global Focal Plane Scale ---
plateScale: 24.7   # arcsec/mm = 0.37 arcsec/pixel ÷ 0.015 mm

# --- Optical Distortion Approximation ---
transforms:
  nativeSys: 'FocalPlane'
  FieldAngle:
    transformType: radial
    coeffs: [0.0, 1.0, 0.0]

# --- Detector Table ---
CCDs:
  CCD0:
    <<: *CCD
    id: 0
    name: CCD0
    serial: Nickel-1
    physicalType: SCIENCE
    refpos: [528.0, 512.0]
    offset: [0.0, 0.0, 0.0]
    amplifiers:
      A00:
        <<: *AMP
        gain: 1.8
        readNoise: 10.7
        hdu: 0
        ixy: [0, 0]
        flipXY: [false, false]

Pipeline Shell Script

#!/usr/bin/env bash

# bad exposures - exclude:
BAD="1032,1033,1043,1046,1047,1048,1049,1050,1051,1052,1056"

########## ABSOLUTE PATHS (edit if needed) ##########
REPO="/Users/dangause/Desktop/lick/lsst/data/nickel/062424"
RAWDIR="/Users/dangause/Desktop/lick/data/062424/raw"
OBS_NICKEL="/Users/dangause/Desktop/lick/lsst/lsst_stack/stack/obs_nickel"
REFCAT_REPO="/Users/dangause/Desktop/lick/lsst/lsst_stack/stack/refcats"

########## BASIC CONFIG ##########
INSTRUMENT="lsst.obs.nickel.Nickel"
RUN="Nickel/raw/all"
TS="$(date -u +%Y%m%dT%H%M%SZ)"

echo "=== Nickel pipeline starting @ $TS ==="

########## CREATE & REGISTER ##########
if [ ! -f "$REPO/butler.yaml" ]; then
  butler create "$REPO"
fi
butler register-instrument "$REPO" "$INSTRUMENT" || true
butler ingest-raws "$REPO" "$RAWDIR" --transfer symlink --output-run "$RUN"
butler define-visits "$REPO" Nickel

########## CURATED ##########
CURATED="Nickel/run/curated/$TS"
butler write-curated-calibrations "$REPO" Nickel "$RUN" --collection "$CURATED"

########## BIAS ##########
CP_RUN_BIAS="Nickel/run/cp_bias/$TS"
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN" \
  -o "$CP_RUN_BIAS" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpBias.yaml" \
  -d "instrument='Nickel' AND exposure.observation_type='bias'" \
  --register-dataset-types

# certify bias broadly
butler certify-calibrations "$REPO" "$CP_RUN_BIAS" "$CURATED" bias \
  --begin-date 2020-01-01 --end-date 2030-01-01

########## FLATS ##########
CP_RUN_FLAT="Nickel/run/cp_flat/$TS"
pipetask run \
  -b "$REPO" \
  -i "$CURATED","$RUN","$CP_RUN_BIAS" \
  -o "$CP_RUN_FLAT" \
  -p "$CP_PIPE_DIR/pipelines/_ingredients/cpFlat.yaml" \
  -c cpFlatIsr:doDark=False \
  -d "instrument='Nickel' AND exposure.observation_type='flat'" \
  --register-dataset-types

########## DEFECTS (from flats; module has no detector args) ##########
DEF_TS="$(date -u +%Y%m%dT%H%M%SZ)"
DEFECTS_RUN="Nickel/calib/defects/$DEF_TS"
QA_DIR="$OBS_NICKEL/scripts/defects/qa_$DEF_TS"

python "$OBS_NICKEL"/scripts/defects/make_defects_from_flats.py \
  --repo "$REPO" \
  --collection "$CP_RUN_FLAT" \
  --register \
  --ingest \
  --defects-run "$DEFECTS_RUN" \
  --plot \
  --qa-dir "$QA_DIR"

# === AUTO-PICK LATEST DEFECTS RUN ===
DEFECTS_RUN="$(butler query-collections "$REPO" | awk '/^Nickel\/calib\/defects\//{print $1}' | tail -n1)"
echo "Using latest defects run: $DEFECTS_RUN"

# point current -> latest defects
butler collection-chain "$REPO" Nickel/calib/defects/current "$DEFECTS_RUN" --mode redefine

########## UNIFIED CALIB CHAIN ##########
CALIB_CHAIN="Nickel/calib/current"
butler collection-chain "$REPO" "$CALIB_CHAIN" \
  "$CURATED" "$CP_RUN_BIAS" "$CP_RUN_FLAT" Nickel/calib/defects/current \
  --mode redefine

########## REFCATS (run from refcat repo; original commands) ##########
cd "$REFCAT_REPO"

# Gaia DR3
convertReferenceCatalog \
  data/gaia-refcat/ \
  scripts/gaia_dr3_config.py \
  ./data/gaia_dr3_all_cones/gaia_dr3_all_cones.csv \
  &> convert-gaia.log

butler register-dataset-type "$REPO" gaia_dr3_20250728 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  gaia_dr3_20250728 \
  refcats/gaia_dr3_20250728 \
  data/gaia-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/gaia_dr3_20250728

# PS1 DR2
convertReferenceCatalog \
  data/ps1-refcat/ \
  scripts/ps1_config.py \
  ./data/ps1_all_cones/merged_ps1_cones.csv \
  &> convert-ps1.log

butler register-dataset-type "$REPO" panstarrs1_dr2_20250730 SimpleCatalog htm7

butler ingest-files \
  -t direct \
  "$REPO" \
  panstarrs1_dr2_20250730 \
  refcats/panstarrs1_dr2_20250730 \
  data/ps1-refcat/filename_to_htm.ecsv

butler collection-chain \
  "$REPO" \
  --mode extend \
  refcats \
  refcats/panstarrs1_dr2_20250730

########## SCIENCE PROCESSING ##########
cd "$OBS_NICKEL"
PIPE="$OBS_NICKEL/pipelines/ProcessCcd.yaml"
PROCESS_CCD_RUN="Nickel/run/processCcd/$(date +%Y%m%dT%H%M%S)"

# quick sanity
butler query-collections "$REPO" | grep -E 'Nickel/calib/(current|defects/current)' || true

pipetask run \
  -b "$REPO" \
  -i "$RUN","$CALIB_CHAIN","refcats" \
  -o "$PROCESS_CCD_RUN" \
  -p "$PIPE#processCcd" \
  -d "instrument='Nickel' AND exposure.observation_type='science' AND NOT (exposure IN (${BAD}))" \
  --register-dataset-types \
  2>&1 | tee "logs/processCcd_$TS.log"
# alternative selection without the bad-exposure filter:
#   -d "instrument='Nickel' AND exposure.observation_type='science'" \

echo "=== Done ==="
echo "Curated:     $CURATED"
echo "CP Bias:     $CP_RUN_BIAS"
echo "CP Flat:     $CP_RUN_FLAT"
echo "Defects run: $DEFECTS_RUN"
echo "Calib chain: $CALIB_CHAIN"
echo "Science run: $PROCESS_CCD_RUN"

pipelines/ProcessCcd.yaml

# pipelines/ProcessCcd.yaml
description: ISR + Image Calibration
tasks:
  isr:
    class: lsst.ip.isr.IsrTask
    config:
      doOverscan: True
      doBias: True
      doFlat: True
      doDark: False
      doDefect: True
      doFringe: False
      doSuspect: True
      doWrite: True
      doTrimToMatchCalib: True
      doVignette: True
      fluxMag0T1:
        B: 2e8
        V: 2e8
        R: 2e8
        I: 2e8

  calibrateImage:
    class: lsst.pipe.tasks.calibrateImage.CalibrateImageTask
    config:
      connections.astrometry_ref_cat: gaia_dr3_20250728
      connections.photometry_ref_cat: panstarrs1_dr2_20250730

      astrometry.forceKnownWcs: False

      # match search‐radius: 60″ ≃ 300 pix @ 0.2″/pix
      astrometry.matcher.maxOffsetPix: 300
      astrometry.matcher.maxRotationDeg: 1.0
      astrometry.matcher.matcherIterations: 12

      # accept a sloppy initial fit up to 60″ mean, then refine
      astrometry.maxMeanDistanceArcsec: 60.0
      astrometry.matchDistanceSigma: 8.0
      astrometry.maxIter: 12
      
      # astrometry_ref_loader.anyFilterMapsToThis: None
      # astrometry_ref_loader.filterMap: {"b": "gMeanPSFMag", "v": "gMeanPSFMag", "r": "rMeanPSFMag", "i": "iMeanPSFMag", "u": "gMeanPSFMag"}
      photometry_ref_loader.filterMap: {"b": "gMeanPSFMag", "v": "gMeanPSFMag", "r": "rMeanPSFMag", "i": "iMeanPSFMag", "u": "gMeanPSFMag"}
      # photometry_ref_loader.filterMap: {"b": "g", "v": "g", "r": "r", "i": "i", "u": "g"}
      photometry.match.matchRadius: 60.0

subsets:
  processCcd:
    - isr
    - calibrateImage

steps:
  - label: processCcd
    sharding_dimensions: visit,detector
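
The pixel and arcsecond figures quoted in the comments above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes the 0.37 arcsec/pixel value from the plateScale comment in camera/nickel.yaml is the right one for this data, whereas the matcher comment in the pipeline quotes 0.2″/pix. The helper names are made up for illustration.

```python
# Values copied from the comments in camera/nickel.yaml (assumed correct here).
PIXEL_SCALE_ARCSEC = 0.37  # arcsec per pixel
PIXEL_SIZE_MM = 0.015      # mm per pixel (15 micron pixels)


def pixels_to_arcsec(pixels, scale=PIXEL_SCALE_ARCSEC):
    """Convert a length in pixels to arcseconds."""
    return pixels * scale


def arcsec_to_pixels(arcsec, scale=PIXEL_SCALE_ARCSEC):
    """Convert an angle in arcseconds to pixels."""
    return arcsec / scale


# plateScale in arcsec/mm, as in the camera YAML comment.
plate_scale = PIXEL_SCALE_ARCSEC / PIXEL_SIZE_MM

# astrometry.matcher.maxOffsetPix: 300, expressed in arcseconds.
max_offset_arcsec = pixels_to_arcsec(300)

# A 60 arcsec search radius, expressed in pixels.
sixty_arcsec_in_pixels = arcsec_to_pixels(60.0)

print(f"plate scale: {plate_scale:.1f} arcsec/mm")
print(f"300 pix = {max_offset_arcsec:.0f} arcsec")
print(f"60 arcsec = {sixty_arcsec_in_pixels:.0f} pix")
```

Under that assumption, maxOffsetPix: 300 corresponds to roughly 111″ rather than 60″, so the comment and the value may be worth reconciling.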

czw · August 20, 2025, 7:35pm

If you can post the output log from IsrTask, that may help sort out what’s going on. The camera definition appears correct to me, and the option that does the trimming (doAssembleCcd) is True in the default IsrTaskConfig. You shouldn’t need the doTrimToMatchCalib option; this was added because some of the precursor data during development had externally supplied bias/dark/flat that were smaller than the detector BBox, and so the science frames needed to be truncated to match that smaller BBox.
Checking the Amplifier definition more closely: I think you want the dataExtent to match the extent of the rawDataBBox, so that value should be [1025, 1024]. I don’t think this is likely to be the problem, as I think we largely ignore that field in favor of directly using the rawDataBBox, but fixing that will at least remove that as a possible source of the problem.
Is your obs_nickel publicly available (on github or something similar)?
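
The dataExtent point can be turned into a small consistency check on the amplifier boxes. The sketch below is plain Python, not part of the LSST stack. It assumes the amplifier entries are written as [[x0, y0], [width, height]], which is how the values in the posted YAML read. The function name check_amp_geometry is hypothetical.

```python
# Amplifier boxes copied from the posted camera YAML, read as
# [[x0, y0], [width, height]] (an assumption about the convention).
amp = {
    "dataExtent": [1056, 1024],
    "rawBBox": [[0, 0], [1056, 1024]],
    "rawDataBBox": [[0, 0], [1025, 1024]],
    "rawSerialOverscanBBox": [[1025, 0], [31, 1024]],
}


def check_amp_geometry(amp):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    (_, _), (raw_w, raw_h) = amp["rawBBox"]
    (data_x0, _), (data_w, data_h) = amp["rawDataBBox"]
    (over_x0, _), (over_w, over_h) = amp["rawSerialOverscanBBox"]

    if list(amp["dataExtent"]) != [data_w, data_h]:
        problems.append(
            f"dataExtent {amp['dataExtent']} != rawDataBBox extent {[data_w, data_h]}"
        )
    if over_x0 != data_x0 + data_w:
        problems.append("serial overscan does not start where the data region ends")
    if data_w + over_w != raw_w:
        problems.append("data and overscan widths do not add up to the raw width")
    if not (data_h == over_h == raw_h):
        problems.append("data, overscan, and raw heights differ")
    return problems


problems = check_amp_geometry(amp)
fixed_problems = check_amp_geometry(dict(amp, dataExtent=[1025, 1024]))
print(problems)
print(fixed_problems)
```

With the posted values only the dataExtent mismatch is reported, and it disappears once dataExtent is set to [1025, 1024].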

dangause · August 20, 2025, 8:12pm

Hi Christopher,

Thanks for the reply.

That’s good info on the doTrimToMatchCalib option – I’d included it just to make sure I wasn’t missing anything, but I’ll remove it for future runs. And nice catch on the dataExtent, I’ve adapted it to match the extent of rawDataBBox.

I’ve rerun my pipeline after making your suggested changes, and am still seeing the same untrimmed overscan region in my post-ISR datasets. I’ve attached the corresponding log output here:
processCcd_20250820T200223Z.log (200.1 KB)

And yes, obs_nickel is public – you can access it on my personal github here.

czw · August 20, 2025, 9:59pm

I realized halfway through that I’d need one of your images to test this myself (which makes it more difficult). First, let’s confirm that overscan is doing the right thing. I think it must be, because there are log messages that didn’t trigger. Still, please check whether there is an OVERSCAN keyword in the postISRCCD header; it should have a value of Overscan corrected.
If that’s the case, then the problem must be in the AssembleCcdTask step. If you can check out and rerun the isr step with the u/czw/assemble_debug.20250820 branch of ip_isr, that will enable a set of info log messages. I expect the assembleInput line to say something like <lsst.afw.image._exposure.ExposureF at 0x7f958e2b93b0>; the outbox line to have Box2I(corner=Point2I(0, 0), dimensions=Extent2I(1025, 1024)) and True; assemble to have A00, another bbox matching [1056, 1024], and a second box matching the outbox line; and the ccdBoxes all agreeing on the smaller bbox sizes. If the outbox line shows False, then somehow the IsrTaskConfig.assembleCcd.doTrim default is being overridden with the wrong value.
I’m reasonably certain the overscan is being masked as BAD because you’ve created some defects from flat, and the overscan region is low relative to the flat illumination, so it’s being detected as having low quantum efficiency. This happens after overscan and trimming should have been done, so I don’t think it impacts your processing (although you’ll need new flats and defects after we fix this error).
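
As a toy illustration of why untrimmed overscan columns could end up flagged, the sketch below thresholds a synthetic flat against a fraction of its median. It is not the make_defects_from_flats.py script from obs_nickel. The function name, the 0.5 threshold, and the pixel values are all invented for the example.

```python
from statistics import median


def flag_low_pixels(flat, fraction=0.5):
    """Flag pixels whose value is below `fraction` of the image median."""
    values = [value for row in flat for value in row]
    threshold = fraction * median(values)
    return [[value < threshold for value in row] for row in flat]


# Synthetic flat: 1025 illuminated columns near 1.0, followed by
# 31 overscan columns that see no light (made-up values).
n_rows, n_data, n_overscan = 4, 1025, 31
flat = [[1.0] * n_data + [0.02] * n_overscan for _ in range(n_rows)]

mask = flag_low_pixels(flat)
flagged_per_row = [sum(row) for row in mask]
first_flagged_column = mask[0].index(True)

print(flagged_per_row)
print(first_flagged_column)
```

Every row flags exactly the 31 overscan columns, starting at column 1025, which is the pattern described above for a flat that still contains its overscan.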

dangause · August 21, 2025, 3:15am

I’d be happy to share the set of Nickel exposures I’m using for testing if that would help with debugging – they currently live in a google drive folder, and I could grant you access if you see fit.

In the meantime, I just checked the PostISRCCD header, and sure enough it read OVERSCAN = Overscan corrected.

I also cloned your debug branch and reran the isr step for a single exposure (had to fix a typo in python/lsst/ip/isr/issr.cc line 40, changing py::classh<... to py::class_<... in order for the local setup + scons to work). Here’s the log output:

lsst.pipe.base.quantum_graph_builder INFO: Processing pipeline subgraph 1 of 1 with 1 task(s).
lsst.pipe.base.quantum_graph_builder INFO: Iterating over query results to associate quanta with datasets.
lsst.pipe.base.quantum_graph_builder INFO: Initial bipartite graph has 1 quanta, 6 dataset nodes, and 4 edges from 1 query row(s).
lsst.pipe.base.quantum_graph_builder INFO: Generated 1 quantum for task isr.
lsst.ctrl.mpexec.cmdLineFwk INFO: QuantumGraph contains 1 quantum for 1 task, graph ID: '1755745548.125302-91074'
Quanta Tasks
------ -----
     1   isr
lsst.ctrl.mpexec.singleQuantumExecutor INFO: Preparing execution of quantum for label=isr dataId={instrument: 'Nickel', detector: 0, exposure: 1040, band: 'i', day_obs: 20240625, group: '1040', physical_filter: 'I'}.
lsst.ctrl.mpexec.singleQuantumExecutor INFO: Constructing task and executing quantum for label=isr dataId={instrument: 'Nickel', detector: 0, exposure: 1040, band: 'i', day_obs: 20240625, group: '1040', physical_filter: 'I'}.
lsst.isr INFO: Constructing linearizer from cameraGeom information.
lsst.isr INFO: Converting exposure to floating point values.
lsst.isr INFO: Det: CCD0 - Noise provenance: amp, Gain provenance: amp
lsst.isr INFO: Assembling CCD from amplifiers.
lsst.isr.assembleCcd INFO: CZW: assembleInput <lsst.afw.image._exposure.ExposureF object at 0x1678f8070>
lsst.isr.assembleCcd INFO: CZW: outbox (minimum=(0, 0), maximum=(1056, 1024)) doTrim True
lsst.isr.assembleCcd INFO: CZW: assemble A00 (minimum=(0, 0), maximum=(1055, 1023)) (minimum=(0, 0), maximum=(1056, 1024))
lsst.isr.assembleCcd INFO: CZW: ccdBoxes {ccd.getBBox()} {ccd[0].getBBox()} {ccd[0].getRawDataBBox()}
lsst.isr INFO: Applying bias correction.
lsst.isr INFO: Applying linearizer.
lsst.isr INFO: Masking defects.
lsst.isr INFO: Masking non-finite (NAN, inf) value pixels.
lsst.isr INFO: Widening saturation trails.
lsst.isr INFO: Applying flat correction.
lsst.isr INFO: Constructing, attaching, and masking vignette polygon.
lsst.isr INFO: Exposure is fully illuminated? True
lsst.isr INFO: Set 63517 BAD pixels to 224.951729.
lsst.isr INFO: Interpolating masked pixels.
lsst.isr INFO: Setting rough magnitude zero point for filter I: 25.197953
lsst.ctrl.mpexec.singleQuantumExecutor INFO: Execution of task 'isr' on quantum {instrument: 'Nickel', detector: 0, exposure: 1040, band: 'i', day_obs: 20240625, group: '1040', physical_filter: 'I'} took 0.615 seconds
lsst.ctrl.mpexec.mpGraphExecutor INFO: Executed 1 quanta successfully, 0 failed and 0 remain out of total 1 quanta.

So it seems like the outbox line is showing True. Where do you think that gets us?

Also it’s probably worth noting that I was getting the same behavior of no trimming before I even supplied a defects mask / set the defect option to true.
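
One way to read the outbox line is to parse it and compare the box against the data region. The sketch below does that for the line quoted in the log above. The regular expression and helper name are made up for this example, and it only compares the printed maximum x coordinate with the 1025-column data width from rawDataBBox.

```python
import re

# The outbox line exactly as it appears in the log output above.
LOG_LINE = (
    "lsst.isr.assembleCcd INFO: CZW: outbox "
    "(minimum=(0, 0), maximum=(1056, 1024)) doTrim True"
)

OUTBOX_RE = re.compile(
    r"outbox \(minimum=\((-?\d+), (-?\d+)\), "
    r"maximum=\((-?\d+), (-?\d+)\)\) doTrim (True|False)"
)


def parse_outbox(line):
    """Extract the printed corners and the doTrim flag, or return None."""
    match = OUTBOX_RE.search(line)
    if match is None:
        return None
    x0, y0, x1, y1 = (int(group) for group in match.groups()[:4])
    return {"min": (x0, y0), "max": (x1, y1), "do_trim": match.group(5) == "True"}


DATA_WIDTH = 1025  # width of rawDataBBox in the camera YAML

info = parse_outbox(LOG_LINE)
# Data columns run from 0 to DATA_WIDTH - 1, so any maximum x at or beyond
# DATA_WIDTH means the output box is wider than the data region.
extends_past_data = info["max"][0] >= DATA_WIDTH

print(info)
print(extends_past_data)
```

So doTrim is True, yet the output box still reaches past the data columns, which suggests the box size itself rather than the doTrim setting is what needs attention.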

czw · August 21, 2025, 3:42am

In camera/nickel.yaml, change the bbox for the CCD to bbox: [[0, 0], [1025, 1024]]. The outbox line in your log shows that the condition in ip_isr/python/lsst/ip/isr/assembleCcdTask.py (main branch of lsst/ip_isr) is not True, so it’s taking the second branch and pulling the ccd.getBBox() size. Checking obs_lsst, obs_lsst/policy/lsstCamSim.yaml (main branch of lsst/obs_lsst) shows that this box must be the trimmed size, as two E2V amplifiers (line 41 of that file) would be 2*2048, which is larger than the listed extent of 4003 (line 95).
Sorry I didn’t catch this earlier. This has been something that’s “Just Worked” for us for many years, and I’ve forgotten a lot of the requirements for the YAML definition.
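
The branch described above can be modelled in a few lines. This is a deliberately simplified stand-in, not the ip_isr code: the function name is hypothetical, the untrimmed case returning the raw width is an assumption, and the widths are the nominal numbers quoted in this thread.

```python
def output_width(do_trim, ccd_bbox_width, raw_width):
    """Toy model of how the assembled image width gets chosen.

    When trimming is requested the output takes the detector bbox size,
    so the detector bbox itself has to be the trimmed size. The untrimmed
    branch returning the raw width is an assumption made for this sketch.
    """
    if not do_trim:
        return raw_width
    return ccd_bbox_width


RAW_WIDTH = 1056   # data plus serial overscan
DATA_WIDTH = 1025  # rawDataBBox width

# Detector bbox left at the raw size: trimming has nothing to remove.
before = output_width(True, ccd_bbox_width=1056, raw_width=RAW_WIDTH)
# Detector bbox set to the data size: the overscan columns drop out.
after = output_width(True, ccd_bbox_width=1025, raw_width=RAW_WIDTH)

print(before - DATA_WIDTH)
print(after - DATA_WIDTH)
```

In this model the 31 extra columns survive whenever the detector bbox is left at the raw frame size, regardless of doTrim being True.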

timj · August 21, 2025, 8:00am

What version of the software are you using? Chris is using the current weekly and so the code he sent you is not going to be directly compatible with older versions of the software. In particular we switched to supporting pybind11 3 recently (to fix a serious memory leak) which will require you to use rubin-env 10.1 and a recent weekly.

Given that you are using what looks like old dataset type names, I worry that you are using a fairly old version. Maybe v28?

dangause · August 21, 2025, 2:10pm

Aha, changing the CCD bbox to [[0, 0], [1025, 1024]] fixed it. Overscan is being trimmed as expected now – thanks @czw for the keen eye!

dangause · August 21, 2025, 2:42pm

I’m currently using rubin-env 9.0.0 and lsst_distrib v28_0_1. I’ll upgrade to 10.1 and start using weekly releases for obs_nickel.