Verification Datasets Meeting 2016-02-10 Minutes

Attendees: David, Colin, Mario, MWV, Hsin-Fang, Simon, Ciardi, Gruendl, Darko, Swinbank
Notetaker: David

Discussion of ideas on how to improve this call

  • David’s goal for this call is to provide a forum for people working on the verification datasets to present their work in progress and to discuss issues as a team; he is open to suggestions on how to improve it
  • MWV suggested people send out their plots the day before so others have time to digest them; we decided to send out the call reminder as a community post, and people can reply to it with links to their work
  • Simon suggested using DM tech notes for a more composed presentation, though they can still be works in progress; Colin and Angelo are already using them, and MWV said he will use them for his future work
  • we discussed holding the call every other week, but people felt there is enough progress and material to cover every week; we will keep the one-week cadence
  • please send future comments, suggestions, critiques on the verification datasets call directly to David


  • DMTN-006 (on false-positive rates) is about 80% complete
    • deals with the variance plane in CP
    • getting the variance plane in difference images correct
    • covariance is important
  • Paul Price’s paper on correlated noise (Price & Magnier 2010, unpublished)
  • Colin performs forced photometry in the individual images for objects detected in the
    difference image, and throws out anything that isn’t detected at 5 sigma in the individual images
  • RHL: “sky objects” (objects extracted from background pixels) were used in SDSS and were very useful
  • false positives: a couple hundred sources per sq. deg., roughly consistent with the number
    expected from noise, though maybe still a bit higher; dipoles are rejected explicitly
  • RHL/MWV: is the slope of the number of false positives vs. detection threshold (sigma)
    consistent with expectations?
  • Mario: the goal is to push below 1000 false positives per sq. deg. to be able to detect
    asteroids, and there is enough data to make the plot; the next step is to run MOPS on the
    detected sources
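As a rough sanity check on the threshold question, the expected number of pure-noise peaks per square degree can be sketched from Gaussian tail statistics. The independent-pixel assumption and the 0.26"/pix scale below are illustrative choices, not numbers from the discussion:

```python
# Sketch: expected number of pure-noise detections per square degree at a
# given sigma threshold, assuming independent Gaussian pixels and an assumed
# 0.26"/pix scale (both are simplifications for illustration).
from math import erfc, sqrt

def false_positives_per_sq_deg(nu, pixel_scale_arcsec=0.26):
    """Upper-tail Gaussian probability times the number of pixels per sq. deg."""
    tail = 0.5 * erfc(nu / sqrt(2.0))            # P(x > nu) for a unit Gaussian
    pixels = (3600.0 / pixel_scale_arcsec) ** 2  # pixels in one square degree
    return tail * pixels

for nu in (4.0, 4.5, 5.0, 5.5):
    print(f"{nu:.1f} sigma: {false_positives_per_sq_deg(nu):.0f} per sq. deg")
```

At 5 sigma this gives a few tens of sources per square degree, the same order as the measured couple hundred; the steep dependence on the threshold is what the slope comparison would test.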


  • MOPS update
  • Joachim has MOPS running, but it is still not clear why it crashes on Linux
  • he has spent time developing visualization tools to help debug MOPS failures
  • the fraction of asteroids linked, out of all those that could have been linked, is currently ~75%; aiming for 95%
  • trying to make sure the definition is the same as what Mario/Lynne used for MAF
  • will check on the Linux problem; probably a memory issue
  • will also check where MOPS has “blind spots” in parameter space
  • the whole chain should be ready in another 2 months
  • needs to be done for a NASA report by June (?)
  • linkages aren’t done in real time, but the next day
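The linking-efficiency number above (~75% of linkable asteroids actually linked) is a simple completeness metric; a minimal sketch follows. The "detected on at least 3 nights" criterion for linkability is an assumption here, and the real definition should ultimately match the one Mario/Lynne used for MAF:

```python
# Sketch of a linking-completeness metric: the fraction of "linkable" objects
# that were actually linked. The min_nights=3 linkability criterion is an
# assumption for illustration, not the agreed definition.

def linking_completeness(nights_per_object, linked_ids, min_nights=3):
    linkable = {oid for oid, nights in nights_per_object.items()
                if len(set(nights)) >= min_nights}
    if not linkable:
        return 0.0
    return len(linkable & set(linked_ids)) / len(linkable)

# Toy example: 4 linkable objects ("e" has only one night), 3 of them linked
nights = {"a": [1, 2, 3], "b": [1, 3, 5], "c": [2, 4, 6],
          "d": [1, 2, 4], "e": [1]}
print(linking_completeness(nights, ["a", "b", "c"]))
```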


  • no updates on Twinkles; trying to get everything ready for PhoSim
  • a group at UW wants to do image stacking via SQL and cloud computing and needs a dataset; CFHT-LS or DECam COSMOS was suggested


  • will write a tech note on the astrometric and photometric validation work; plans to present it at JTM
  • will try to turn the code into a task that can be run on any arbitrary dataset/repository
  • will try to run it on HSC data in a month or so


  • Pan-STARRS data: thinking about how to bring it into our environment
  • need to think about this more DM-wide
  • Mario thinks the data is already spinning at STScI
  • that data will probably mainly help Qserv and the SUI
  • just getting the data will take a while
  • Ciardi will start outlining what would make that data useful, and run it past Mario/K-T/Jeff
  • Mario will start an email to STScI about getting a copy of the data
  • Colin: it would be great to also use these databases/tools with the data we are generating internally for the verification datasets


  • testing Russell’s new processCcd; it’s working well
  • the DECam camera geometry object, with distortion already included, looks good
  • Simon: Russell is ready to merge unless there are any objections


  • bulge dataset processing
    • still not sure what’s wrong
    • the problem is thought to be with the astrometry from the CP processing; maybe redoing ISR with the stack could help
    • even after getting past the matcher, there is a threshold in the next step (the RMS of matched sources) that it fails on
  • the verification datasets are a great use case for developing QA tests and tools; we want to preserve the knowledge developed by this group and make it easy to reuse the code and reproduce the results
  • we had pipeQA in the past; we can do better now by building a more extensible prototype, which this group could adopt, maintain, and refine
  • the first step for this is the database model; feedback on it would be appreciated
  • the sample queries illustrate what could be done with such a framework
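To illustrate the kind of sample query meant here, a minimal sketch against an invented single-table schema follows; the `measurements` table and the `psf_fwhm` metric name are made up for illustration and are not the proposed database model:

```python
# Hypothetical sketch of a QA-database query, e.g. "average PSF FWHM per
# visit". The schema and values are invented for illustration only.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE measurements (visit INT, ccd INT, metric TEXT, value REAL)")
rows = [(100, 1, "psf_fwhm", 0.9), (100, 2, "psf_fwhm", 1.1),
        (101, 1, "psf_fwhm", 1.3), (101, 2, "psf_fwhm", 1.5)]
con.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?)", rows)

# Aggregate a per-CCD metric up to a per-visit summary
for visit, avg in con.execute(
        "SELECT visit, AVG(value) FROM measurements "
        "WHERE metric = 'psf_fwhm' GROUP BY visit"):
    print(visit, avg)
```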


  • has spent most of the time recently on a script that calculates QA metrics, using the stack, on processCcd output data products
  • it can run on an entire data repository and will output a summary file of all the metrics; eventually it could be used to load the QA database directly
  • recreated the ellipticity vs. FWHM plot, and it now looks very normal; there does seem to be a problem with the CP ellipticity values, but the FWHM values actually aren’t so bad; will try plotting vs. background/sec
  • looked at the cosmic-ray issue some more: plotting the number of pixels flagged as “CR” per second still shows a big band dependence; plotting against average sky background shows a strong correlation with the sky even for exposures in the same band
  • looking at individual exposures, there are many “streak” or “worm” CRs that the CR routine is not picking up, so there is no clear sign yet of why the higher-background exposures have more CRs than the low-background exposures
  • will investigate some more and take a closer look at what the CR code is doing
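The correlation check described above (CR-flagged pixels per second vs. mean sky background) can be sketched as follows; the per-exposure numbers are invented for illustration and would in practice come from the processCcd outputs:

```python
# Sketch: correlate the rate of pixels flagged "CR" per second of exposure
# with the mean sky background. The exposure tuples below are made-up values.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# (cr_pixels, exptime_s, mean_sky) per exposure -- illustrative values only
exposures = [(1200, 100, 250.0), (4100, 100, 900.0), (2300, 200, 300.0),
             (9800, 200, 1100.0), (1500, 100, 280.0)]

rates = [cr / t for cr, t, _ in exposures]
skies = [sky for _, _, sky in exposures]
print(pearson(rates, skies))
```

A coefficient near 1 on real data would confirm the sky-background dependence even within a single band.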

David, I have edited the notes to better reflect my intention in writing the technote SQR-008, and just to mention that there will be an RFD about it. Thanks.