I am running into a frustrating issue while stacking archival Hyper Suprime-Cam images of a patch of sky I am studying.
For a small fraction of my coadd runs I get a bad termination error: the process segfaults with exit code 139. In most cases, these stacks have four input visits, one of which covers more of the tile than the others (one visit covers nearly the whole tile, and the other three cover smaller fractions). If I remove the larger visit from the coadd inputs, the run completes without problems.
I initially thought this might be a memory-related issue, but I have run plenty of deeper stacks with more input visits and larger coverage fractions (stacks with tens of input visits, several of which cover the full sky tile) on the same machine, with the same setup, without issue.
Is there any way to get more information out of the runs, or to run with higher verbosity or debugging enabled, to get a better handle on why this is happening? I have visually inspected the problematic image and found nothing obviously strange, so I am running out of things to try.
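One generic option I know of, independent of the pipeline itself, is Python's standard `faulthandler` module, which dumps the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is only an illustration: the helper name `enable_crash_traceback` and the log path are hypothetical, and I am assuming it can be called early in whatever script launches the coadd.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write Python tracebacks for all threads to log_path on a fatal signal.

    Hypothetical helper, not part of any pipeline API. The returned file
    object must stay open for as long as the handler should remain active.
    """
    log_file = open(log_path, "w")
    # Installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Setting the environment variable `PYTHONFAULTHANDLER=1` before launching the job enables the same handler (writing to stderr) without touching any code. Either way, this only shows where in the Python call stack the crash occurred, not the underlying C++ frame.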
Relevant log snippet:
coaddDriver.assembleCoadd.detectTemplate INFO: Detected 26588 positive peaks in 11404 footprints to 5 sigma
coaddDriver.assembleCoadd.detectTemplate INFO: Detected 26588 positive peaks in 11404 footprints to 5 sigma
coaddDriver.assembleCoadd.scaleWarpVariance INFO: Renormalizing variance by 0.988779
coaddDriver.assembleCoadd.scaleWarpVariance INFO: Renormalizing variance by 0.988779
coaddDriver.assembleCoadd.detect INFO: Detected 7686 positive peaks in 946 footprints and 3301 negative peaks in 794 footprints to 5 sigma
coaddDriver.assembleCoadd.detect INFO: Detected 7686 positive peaks in 946 footprints and 3301 negative peaks in 794 footprints to 5 sigma
coaddDriver.assembleCoadd.scaleWarpVariance INFO: Renormalizing variance by 1.119721
coaddDriver.assembleCoadd.scaleWarpVariance INFO: Renormalizing variance by 1.119721
coaddDriver.assembleCoadd.detect INFO: Detected 7124 positive peaks in 1671 footprints and 3707 negative peaks in 1122 footprints to 5 sigma
coaddDriver.assembleCoadd.detect INFO: Detected 7124 positive peaks in 1671 footprints and 3707 negative peaks in 1122 footprints to 5 sigma
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 23716 RUNNING AT ippc134
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
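For reference, the exit code in the log above is consistent with the reported signal: by the usual shell convention, a process killed by signal N exits with status 128 + N, so 139 corresponds to signal 11. A minimal sketch of that arithmetic (the function name `decode_exit_code` is just for illustration):

```python
import signal


def decode_exit_code(code):
    """Return (signal number, signal name) for a shell-style exit code.

    Exit codes above 128 conventionally mean "killed by signal (code - 128)".
    Returns (None, None) for codes that do not follow that convention.
    """
    if code <= 128:
        return None, None
    signum = code - 128
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Signal number not known on this platform.
        name = None
    return signum, name


signum, name = decode_exit_code(139)
print(signum, name)
```

So the exit code carries no information beyond "segmentation fault"; it does not by itself distinguish a memory limit from a bad pointer dereference.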