Verification Datasets Meeting 2016-02-10 Minutes

nidever · February 10, 2016, 10:34pm

Attendees: David, Colin, Mario, MWV, Hsin-Fang, Simon, Ciardi, Gruendl, Darko, Swinbank
Notetaker: David

Discussion of ideas on how to improve this call

David’s goal for this call was a forum for people working on the verification datasets to present their work in progress and have a discussion on issues as a team; but open to suggestions on how to improve it
MWV suggested people send out their plots the day before so people have time to digest them; we decided to send out call reminder as community post and people can reply with links to their work.
Simon suggested using DM tech notes, which present results in a more composed fashion but can still be work in progress; Colin and Angelo are already using them, and MWV said he will use them for his work in the future.
the possibility of having the call every other week was raised; people felt there was enough progress and material to cover to have it every week, so we will keep the weekly cadence.
please send future comments, suggestions, critiques on the verification datasets call directly to David

Colin:

DMTN-006, 80% of the way there about the false-positive rates
http://dmtn-006.lsst.io/en/latest/
- deals with variance plane in CP
- getting variance plane in difference images correct
- covariance is important
Paul Price’s paper on correlated noise (Price & Magnier 2010, unpublished)
Colin performs forced photometry in the individual images for objects detected in the difference image, and throws out anything that isn’t detected at 5 sigma in the individual images
RHL, used “sky objects” in SDSS, objects extracted from background pixels, very useful
false positives: a couple hundred sources per sq. deg., roughly consistent with the number expected from noise alone (i.e. Poisson noise), maybe still a bit higher
dipoles rejected explicitly
RHL/MWV: is the slope of the number of false positives vs. detection threshold (sigma) consistent with expectations?
Mario: goal is to push below 1000 false positives per sq. deg. to be able to detect asteroids, but there is enough data there to make the plot. Next is to run MOPS on the detected sources
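
As a rough illustration of the slope question above, the sketch below estimates how the number of pure-noise detections should fall as the detection threshold rises. It assumes uncorrelated Gaussian noise and treats each seeing-sized patch as one independent trial; the function names, the 1 arcsec seeing and the trial-counting rule are illustrative assumptions, not taken from DMTN-006. Correlated noise (the covariance point above) would change the absolute numbers, so only the relative trend with threshold is meaningful here.

```python
import math


def gaussian_tail(nu):
    """One-sided probability that a unit Gaussian deviate exceeds nu sigma."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


def independent_elements_per_sq_deg(fwhm_arcsec):
    """Assumption: one independent trial per FWHM x FWHM patch of sky."""
    return (3600.0 / fwhm_arcsec) ** 2


def expected_noise_detections(nu, n_independent):
    """Expected count of noise peaks above nu sigma among n_independent trials."""
    return n_independent * gaussian_tail(nu)


if __name__ == "__main__":
    # Hypothetical seeing of 1 arcsec, for illustration only.
    n_indep = independent_elements_per_sq_deg(1.0)
    for nu in (4.0, 4.5, 5.0, 5.5, 6.0):
        count = expected_noise_detections(nu, n_indep)
        print(f"{nu:.1f} sigma: {count:.3g} expected noise detections per sq. deg.")
```

Plotting the measured false-positive counts against this curve on a log scale would show whether the measured slope follows the Gaussian expectation or flattens out, which would point to a non-noise population.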

Mario:

MOPS update
Joachim has MOPS running, still not sure why it’s crashing on linux
spent time on developing visualization tools to help with MOPS failures and debugging
fraction of asteroids linked, relative to all those that could have been linked, is currently ~75%; aiming for 95%
trying to make sure the definition is same as what Mario/Lynne used for MAF
will check on linux problem, probably memory issue
also check where MOPS has “blind spots” in parameter space
whole chain should be ready in another 2 months
needs to be done for a NASA report by June (?)
linkages aren’t done in real time; they are done the next day
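
The linking-efficiency figure quoted above can be written down as a small sketch. This is not MOPS code: the function name and the ID lists are hypothetical, and what counts as "linkable" is exactly the definition that still needs to be reconciled with the one used for MAF.

```python
def linking_efficiency(linked_ids, linkable_ids):
    """Fraction of linkable objects that were actually linked.

    linked_ids   -- IDs of objects the linker recovered (hypothetical input)
    linkable_ids -- IDs of objects that could have been linked under the
                    chosen definition (hypothetical input)
    """
    linkable = set(linkable_ids)
    if not linkable:
        raise ValueError("no linkable objects; efficiency is undefined")
    # Objects that were linked but are not in the linkable set are ignored,
    # so the result always lies between 0 and 1.
    recovered = linkable & set(linked_ids)
    return len(recovered) / len(linkable)
```

For example, `linking_efficiency(["a", "b", "c"], ["a", "b", "c", "d"])` gives 0.75. Evaluating the same function in bins of a parameter such as apparent motion or magnitude is one way to look for the "blind spots" mentioned above.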

Simon:

no updates on Twinkles, trying to get everything ready for phosim
group at UW wants to do image stacking via SQL, cloud computing, need dataset; CFHT LS or DECam COSMOS was suggested.

MWV:

will write tech note on astrometric and photometric validation work, plans to present at JTM
try to turn code into a task that can be run on any arbitrary dataset/repository
will try to run on HSC data in a month's time or so

Ciardi:

PanSTARRS data, thinking about how to put it in our environment
need to think about this more DM-wide
Mario thinks the data are already spinning at STScI
that data will probably mainly help Qserv and SUI
just getting the data will take a while
Ciardi will start outlining what would make that data useful, and run it past Mario/K-T/Jeff
Mario will start an email to STScI about getting a copy of the data
Colin: would be great to also use these database/tools with the data we are generating internally with verification datasets

Hsin-Fang:

testing Russell’s new processCcd, it’s working well
DECam geometry object, with distortion already included, looks good
Simon: Russell is ready to merge unless there are any objections

Angelo:

bulge dataset processing
- still not sure what’s wrong
- thinks it is a problem with the astrometry from CP processing, maybe redoing ISR with the stack could help
- even if get past matcher there’s a threshold in next step (RMS of matched sources) that it fails on
verification datasets are a great use case for developing QA tests and tools; want to preserve the knowledge developed by this group, and make it easy to reuse the code and reproduce the results
we had pipeQA in the past, can do better now building a more extensible prototype which this group could adopt, maintain and refine
The first step for this is the database model - would appreciate feedback on this
http://sqr-008.lsst.io/en/latest/
The sample queries illustrate what could be done with such a framework

David:

spending most time recently on script to calculate QA metrics using the stack on processCcd output data products
can run on an entire data repository and will output a summary file of all the metrics; eventually it will be used to load the QA database directly
https://confluence.lsstcorp.org/display/SQRE/COSMOS+DECam+data+reduction
recreated the ellipticity vs. FWHM plot and now it looks very normal. there does seem to be a problem with the CP ellipticity values, the FWHM values actually aren’t so bad
try vs. background/sec
look at the cosmic ray issue some more. plotting number of pixels flagged as “CR” per second still shows a big band dependence. I also plotted with average sky background and there is a strong correlation with the sky even for exposures with the same band
looked at individual exposures and there are many “streak” or “worm” CRs that the CR routine is not picking up, so clear sign of why the higher background exposures have more CRs than the low background exposures.
will investigate some more and take a closer look at what the CR code is doing
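
The check described above (CR-flagged pixels per second against sky background, within a single band) could be sketched as follows. The dictionary keys `band`, `exptime`, `sky` and `n_cr_pixels` are hypothetical stand-ins for whatever the QA summary file provides; this is a plain-Python illustration, not the stack-based script itself.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def cr_rate_vs_sky_by_band(exposures, min_exposures=3):
    """Correlate CR-flagged pixels per second with sky level, band by band.

    exposures -- iterable of dicts with hypothetical keys
                 'band', 'exptime' (s), 'sky', 'n_cr_pixels'
    """
    groups = defaultdict(lambda: ([], []))
    for exp in exposures:
        sky, rate = groups[exp["band"]]
        sky.append(exp["sky"])
        rate.append(exp["n_cr_pixels"] / exp["exptime"])
    # Skip bands with too few exposures to give a meaningful coefficient.
    return {
        band: pearson_r(sky, rate)
        for band, (sky, rate) in groups.items()
        if len(sky) >= min_exposures
    }
```

Grouping by band first separates the band dependence from the background dependence, so a coefficient that stays high within one band points at the sky level rather than the filter.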

afausti · February 11, 2016, 4:04am

David, I have edited the notes to better reflect my intention in writing the technote sqr-008 and just to mention that there will be an RFD about it. Thanks.