Memory leaking while retrieving many images using butler.get()

Hi,
I suspect there is a serious memory leak when I retrieve images using the butler. Here is a minimal example.

bands = ['g', 'r', 'i', 'z', 'y']

def retrieve_images1(ra, dec):
    for band in bands:
        query = """band.name = '{}' AND patch.region OVERLAPS POINT({}, {})""".format(band, ra, dec)
        try:
            dataset_refs = butler.query_datasets("deep_coadd", where=query, with_dimension_records=True)
        except Exception as e:
            print(f"band={band}: query failed with error -> {e}")
            continue
        for dsref in dataset_refs:
            coadd = butler.get(dsref)
            del coadd
        del dataset_refs

Now if I call this function with a bunch of (ra, dec) values, memory consumption grows steadily, soon reaching my 16 GB quota and eventually killing or freezing my kernel. I am using
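To quantify the growth, here is a minimal, hedged sketch of how one could track peak resident memory after each call using only the standard library. The `workload` function below is a hypothetical stand-in for `retrieve_images1(ra, dec)`; in the notebook you would call the real function instead.

```python
import resource

def peak_rss_mib():
    """Peak resident set size of this process, in MiB (on Linux, ru_maxrss is in KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

def workload(i):
    # Hypothetical stand-in allocation; replace with retrieve_images1(ra, dec).
    return [0] * 100_000

for i in range(3):
    workload(i)
    print(f"iteration {i}: peak RSS = {peak_rss_mib():.1f} MiB")
```

If the printed peak keeps climbing across hundreds of iterations even though every object is deleted, that is consistent with objects being retained somewhere below the user code.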

  • RSP
  • the latest kernel
  • jupyter notebook

tracemalloc shows /opt/lsst/software/stack/conda/envs/lsst-scipipe-10.0.0/share/eups/Linux64/meas_extensions_piff/g36ff55ed5b+4036fd6440/python/lsst/meas/extensions/piff/piffPsf.py:69: size=430 MiB in every run, but I am not sure how reliable this diagnostic is.
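For anyone who wants to cross-check that diagnostic, here is a small self-contained sketch of the snapshot-diff approach with tracemalloc. The `leaky_call` function is a deliberately leaky stand-in (it is not part of the pipelines); in practice you would put `retrieve_images1(ra, dec)` between the two snapshots.

```python
import tracemalloc

tracemalloc.start(10)  # keep up to 10 frames per allocation for fuller tracebacks

leaky_cache = []
def leaky_call():
    # Hypothetical stand-in: retains ~1 MB per call, like a hidden cache would.
    leaky_cache.append(bytearray(10**6))

snap1 = tracemalloc.take_snapshot()
for _ in range(5):
    leaky_call()
snap2 = tracemalloc.take_snapshot()

# Source lines whose retained size grew the most between the two snapshots.
for stat in snap2.compare_to(snap1, "lineno")[:3]:
    print(stat)
```

If the piffPsf.py line keeps topping this diff with a growing size across runs, the 430 MiB figure is probably real retained memory rather than a measurement artifact.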

Q1. I am surprised this has not been reported before. Am I the only one facing this?
Q2. Could somebody reproduce the issue? Just run the above function for ~500 different (ra, dec) values, of course within the survey footprint!

Thanks in advance!

Please use the v29.2.0 release of the pipelines rather than the recommended release (you will see it on the container selection screen when you log in). The memory leak is fixed there. We plan to change the recommended release soon.
