singleFrameMeasurement source ID issues

akilgall · February 17, 2016, 4:17pm

Hello,

When I try to use the singleFrameMeasurement task to do analysis on an input image, the output catalog gives an ID for each source that is just an ascending integer. For instance, the input image contains a ‘db_id’ for each source, ranging from the thousands (stars) to 10-digit numbers (galaxies + tile info): [2214322509, 2213221225, 2213832765, …, 5284, 5286, 5288]

However, the output catalog contains a vector of IDs that are merely integers ranging from 1 to the number of sources:
id_measured: [ 1 2 3 …, 12650 12651 12652]

I’ve been trying to look through the DM stack code to figure out where I can modify the code so that the parent ID is propagated through to the output catalog, but I’ve had no luck with that. My current “solution” to this involves a double loop over the input and output sources in order to match them, but it takes a long time to run. I was hoping for a better and quicker solution than that.

Here’s my use of the singleFrameMeasurement task:

exposure = loadData(infile)
schema = afwTable.SourceTable.makeMinimalSchema()

config = SourceDetectionTask.ConfigClass()
config.thresholdPolarity = "both"
config.background.isNanSafe = True
config.thresholdValue = 0.5
detectionTask = SourceDetectionTask(config=config, schema=schema)
print config

config = SingleFrameMeasurementTask.ConfigClass()
config.plugins.names.clear()
for plugin in ["base_SdssCentroid", "base_SdssShape", "base_CircularApertureFlux", "base_GaussianFlux", "base_PsfFlux", "base_ClassificationExtendedness"]:
    config.plugins.names.add(plugin)

print "fluxRatio: ", config.algorithms['base_ClassificationExtendedness'].fluxRatio
config.algorithms['base_ClassificationExtendedness'].fluxRatio = 0.9375

measureTask = SingleFrameMeasurementTask(schema, config=config)

tab = afwTable.SourceTable.make(schema)
result = detectionTask.run(tab, exposure)
sources = result.sources

for i in range(len(sources)):
    record = sources[i]
    idparent = record.getParent()
    print "idparent: ", idparent

measureTask.run(sources, exposure)
sources.writeFits(outfile)

Could someone point out where the stack code must be modified? I’ve had no luck searching through the code or the documentation and was hoping that someone would be able to quickly answer my question.

Thanks,

Aaron

RHL · February 17, 2016, 5:23pm

I don’t think I understand quite what you are asking. Where are these IDs coming from – an external catalogue?

akilgall · February 17, 2016, 5:45pm

The 10-digit IDs come from the OneSqDeg.fits file extracted using the WeakLensingDeblending package (http://weaklensingdeblending.readthedocs.org/en/latest/catalog.html).

The star IDs are from a separate star catalog. These were merged together using the WeakLensingDeblending script, simulate_star.py. This produced the catalog that I’m using as an input, as I’m trying to analyze star/galaxy separation in the stack.

jbosch · February 17, 2016, 5:54pm

So the catalog with the IDs was used as an input to the simulator that made the images, but otherwise those catalogs aren’t being used as an input to the script that’s processing the images, correct?

If that’s the case, you’d need to have some way to match an object in your input catalog with an object in the measurement catalog; there’s no way for the pipeline to get those from the image, because the IDs aren’t in it. If both catalogs have positions, you could probably do a spatial match using something like lsst.afw.table.matchXy, then loop over the results to set the IDs in the measurement catalogs - but be prepared for there to be some matches that aren’t 1-to-1 unless all of your objects are isolated.
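The match-then-assign idea described here can be sketched without the stack at all. The snippet below is a minimal, plain-Python illustration, not the lsst.afw.table.matchXy API: the function name `match_xy`, the tuple layouts, and the pixel radius are all assumptions made for the example. It bins the input objects into a grid so that each measured source only checks nearby cells, and it returns every candidate within the radius so that matches that are not 1-to-1 stay visible.

```python
from collections import defaultdict
from math import floor, hypot


def match_xy(inputs, measured, radius):
    """Match measured sources to input objects within `radius` pixels.

    inputs:   list of (db_id, x, y) from the simulation catalog (assumed layout)
    measured: list of (meas_id, x, y) from the measurement catalog (assumed layout)
    Returns {meas_id: [db_id, ...]} with candidates sorted nearest first.
    """
    # Bin the input objects into square cells of side `radius`, so each
    # measured source only has to look at its own cell and the 8 neighbours.
    cells = defaultdict(list)
    for db_id, x, y in inputs:
        cells[(floor(x / radius), floor(y / radius))].append((db_id, x, y))

    matches = {}
    for meas_id, mx, my in measured:
        cx, cy = floor(mx / radius), floor(my / radius)
        found = []
        for i in (cx - 1, cx, cx + 1):
            for j in (cy - 1, cy, cy + 1):
                for db_id, x, y in cells.get((i, j), ()):
                    d = hypot(x - mx, y - my)
                    if d <= radius:
                        found.append((d, db_id))
        matches[meas_id] = [db_id for _, db_id in sorted(found)]
    return matches


# Made-up positions: source 1 has two candidates, source 3 has none.
inputs = [(2214322509, 10.0, 10.0), (5284, 10.8, 10.2), (5286, 50.0, 50.0)]
measured = [(1, 10.1, 10.0), (2, 49.7, 50.2), (3, 90.0, 90.0)]
matches = match_xy(inputs, measured, radius=1.0)
```

A measured source whose list has exactly one entry can take that db_id directly. An empty list or a list with several entries is one of the cases that is not 1-to-1 and needs a decision (nearest candidate, flag as blended, or drop).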

akilgall · February 22, 2016, 2:33am

Yes, I incorrectly assumed that this ID would have been propagated through for testing purposes.

I wasn’t able to use lsst.afw.table.matchXy, as there was no documentation for how to convert a FITS binary table into the required input class, but I was able to implement a tree-matching algorithm using a Python module (which vastly reduced the run time).
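The post does not say which module was used, so the following is only a sketch of the general technique: a small 2-D k-d tree written with the standard library, which replaces the inner loop of a double loop with a tree lookup. The names `build_kdtree` and `nearest` and the `(x, y, db_id)` tuple layout are assumptions for illustration. In practice a library implementation such as scipy.spatial.cKDTree would likely be the more convenient choice.

```python
def build_kdtree(points, depth=0):
    """Build a 2-D k-d tree from (x, y, db_id) tuples; returns nested tuples."""
    if not points:
        return None
    axis = depth % 2  # alternate between splitting on x and on y
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, x, y, depth=0, best=None):
    """Return (squared_distance, db_id) of the tree point closest to (x, y)."""
    if node is None:
        return best
    point, left, right = node
    d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2
    if best is None or d2 < best[0]:
        best = (d2, point[2])
    axis = depth % 2
    diff = (x, y)[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, x, y, depth + 1, best)
    # Only search the far side if the splitting line is closer than the best hit.
    if diff ** 2 < best[0]:
        best = nearest(far, x, y, depth + 1, best)
    return best


# Made-up input catalog positions and one measured source position.
catalog = [(10.0, 10.0, 2214322509), (10.8, 10.2, 5284), (50.0, 50.0, 5286)]
tree = build_kdtree(catalog)
d2, db_id = nearest(tree, 49.7, 50.2)
```

Building the tree once from the input catalog and querying it for each measured source avoids comparing every pair. The returned squared distance can be checked against a tolerance before the db_id is accepted, since the nearest input object is not necessarily the right one in crowded regions.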

Thanks for the help!