Python script memory management on RSP

Hi there, I am trying to run my data analysis code on the RSP on my downloaded images, but I have run into a memory issue.

My code runs through a list of objects in a loop, something like:

for obj in obj_list:
    analysis(obj)

However, the RAM usage keeps accumulating as the loop runs through the objects. I have added these lines to clean up memory at the end of each iteration:

    plt.close()
    gc.collect()
    clear_output(wait=True)
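One thing worth checking in that cleanup block: `plt.close()` with no argument only closes the *current* figure, while library code may create figures you never see. A minimal sketch with the headless Agg backend (the same kind of backend a notebook server uses):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as on a notebook server
import matplotlib.pyplot as plt

# pyplot keeps every figure in a global registry until explicitly closed;
# plt.close() with no argument only closes the *current* figure, so
# figures created inside library code can outlive it
for _ in range(3):
    plt.figure()

plt.close()                     # closes only the current figure
print(len(plt.get_fignums()))   # 2 figures still registered

plt.close("all")                # closes everything pyplot tracks
print(len(plt.get_fignums()))   # 0
```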

I have also tried deleting the referrers:

import gc

def deep_delete(obj):
    # remove references to obj held by any dicts or lists that refer to it
    for ref in gc.get_referrers(obj):
        if isinstance(ref, dict):
            for k, v in list(ref.items()):
                if v is obj:
                    del ref[k]
        elif isinstance(ref, list):
            while obj in ref:
                ref.remove(obj)
    del obj  # note: this only unbinds the local name, not the object itself
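(For locating where the growth actually comes from, the stdlib `tracemalloc` module can diff snapshots between iterations. A sketch, where `analysis_step` and `leaked` are hypothetical stand-ins for one pass of the real loop and for whatever container is holding on to results:)

```python
import tracemalloc

tracemalloc.start()

def analysis_step():
    # stand-in for one loop iteration; real code would call analysis(obj)
    return [0.0] * 100_000

leaked = []  # stand-in for a container that keeps growing

snap1 = tracemalloc.take_snapshot()
leaked.append(analysis_step())
snap2 = tracemalloc.take_snapshot()

# compare snapshots to see which source lines grew the most
for stat in snap2.compare_to(snap1, "lineno")[:5]:
    print(stat)
```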

But the usage keeps accumulating. Checking psutil.virtual_memory() before and after running on a single object, it increases from:

svmem(total=33651441664, available=23263485952, percent=30.9, used=9868546048, free=18948648960, active=2638352384, inactive=10409881600, buffers=1091153920, cached=3743092736, shared=29171712, slab=1425485824)
Total RAM: 31.34 GB
Available RAM: 21.67 GB
Used RAM: 9.19 GB
Memory Usage: 30.9%
Current Python process is using 6.62 GB of RAM

to:

svmem(total=33651441664, available=22799486976, percent=32.2, used=10332549120, free=18456993792, active=2639040512, inactive=10881220608, buffers=1091153920, cached=3770744832, shared=29171712, slab=1425707008)
Total RAM: 31.34 GB
Available RAM: 21.23 GB
Used RAM: 9.62 GB
Memory Usage: 32.2%
Current Python process is using 7.10 GB of RAM
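(As a cross-check on the psutil numbers, the stdlib `resource` module can report this process's peak resident set size without any extra dependencies. A sketch; note that `ru_maxrss` units are kilobytes on Linux but bytes on macOS:)

```python
import resource

def peak_rss_mb():
    # peak resident set size of this process; kilobytes on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = peak_rss_mb()
data = [0.0] * 5_000_000   # allocate ~40 MB so the peak moves
after = peak_rss_mb()
print(f"peak RSS grew by {after - before:.1f} MB")
```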

I am not accessing the calexp images, butlers, or data catalogs of LSST in this code; I am just running a loop of analysis. I thought that since all the class objects and variables are rewritten on each iteration, the RAM usage should stay stable as the loop runs. I am not sure what causes the memory leak here. I’d appreciate your thoughts!
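(On the "variables are rewritten each iteration" point: rebinding a name only frees the old object if nothing else still references it. A hypothetical sketch, where `cache` stands in for any global container a library might keep — pyplot's figure registry behaves exactly this way:)

```python
cache = []  # hidden module-level container, e.g. inside a plotting library

def analysis(obj):
    result = [float(x) for x in range(1000)]
    cache.append(result)   # a reference escapes the loop iteration
    return len(result)

for obj in range(5):
    analysis(obj)          # `obj` is rebound each pass, but results pile up

print(len(cache))  # 5 — every iteration's result is still alive
```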


What is obj in your loop?

What are you doing in the loop?

If you comment out the matplotlib code does it still leak?

At the bottom of your browser window you get real-time reporting of your memory usage.

Hi Tim, thanks for the input! I have turned off all the image-display functions, but it still leaks.

By watching the real-time memory readout, I have figured out when it happens: when I call some plot functions from a package, save the figure with savefig, then plt.close(). So without displaying anything, just making and saving the figure seems to cause the leak.

The plot function structure is something like:

def plot_func():
    take data from a class object
    do some calculations
    make some plots from the data
    savefig
    plt.close()

Does this help: Optimizing Matplotlib Performance: Handling Memory Leaks Efficiently - DEV Community

i.e. add plt.clf() before plt.close().

matplotlib.pyplot is notoriously leaky since it uses global state. We have migrated away from using it.

You may want to consider switching to lsst.utils.plotting.make_figure.
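For reference, the pattern used by make_figure can be sketched as follows (an approximation, not the actual LSST source): build a Figure with an explicit Agg canvas so it never enters pyplot's global registry and is garbage-collected like any other object.

```python
import os
import tempfile
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

def make_figure(**kwargs):
    # sketch of the make_figure approach: attach an explicit Agg canvas,
    # bypassing pyplot's global figure registry entirely
    fig = Figure(**kwargs)
    FigureCanvasAgg(fig)
    return fig

fig = make_figure(figsize=(4, 3))
ax = fig.add_subplot(111)
ax.plot([0, 1, 2], [0, 1, 4])
out = os.path.join(tempfile.gettempdir(), "demo_plot.png")
fig.savefig(out)  # no plt.close() needed; no global reference to fig exists
```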


The function I am using is hard-coded in a package (lenstronomy.Plots.model_plot.ModelPlot), so it might not be easy to switch everything over to LSST plotting…

I have tried putting plt.clf() in the specific function I am using, but the leak persists. Every time I call it, it accumulates ~0.05 GB of memory.

Maybe to make this issue more general: if we want to use packages with hard-coded matplotlib functions on the RSP, would it be better to download the data we need and work on our local HPC?

lsst-utils is on PyPI, but the point is that the code in there is only a few lines, and it would be interesting to know whether it fixes the leaks. matplotlib.pyplot is notoriously bad for use in production code because of the leaks and the global state. It’s great for quick plots, but we realized long ago that we couldn’t use it in library code. If you can show that the leak goes away when you switch to the scheme used in make_figure, that might be a good reason to ask the upstream package you are using to move away from plt.

I don’t think your problem is an RSP issue as such, but a more general problem with memory leaks, made noticeable because the RSP notebook servers have a fixed amount of memory and no swap.

Hi Tim, I have dug into the code a bit more, and it turns out the issue is caused by plt.savefig(). When I save a figure in PDF format, it causes the memory leak. Once I change the figure format to .png, it is fixed (sorry I didn’t mention the figure format above).

It seems that plt.savefig(".pdf") goes through a different backend to render the PDF file (possibly a C library such as Cairo). So the leak is likely in that interface, and gc.collect() cannot reach objects held in C-level memory.
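(One way to test whether the PDF leak is tied to pyplot's global state is to save PDFs through matplotlib's object-oriented interface, which never registers figures with pyplot. A sketch using the stock matplotlib PDF backend; whether the RSP swaps in a Cairo-based backend is an assumption that would need checking:)

```python
import gc
import os
import tempfile
from matplotlib.figure import Figure
from matplotlib.backends.backend_pdf import FigureCanvasPdf

# save several PDFs via the OO interface; since pyplot is never involved,
# each Figure should be freed once it goes out of scope and gc runs
outdir = tempfile.mkdtemp()
for i in range(3):
    fig = Figure()
    FigureCanvasPdf(fig)
    ax = fig.add_subplot(111)
    ax.plot([0, 1], [0, 1])
    fig.savefig(os.path.join(outdir, f"fig_{i}.pdf"))
gc.collect()
```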

I have tested the same code on my laptop in Spyder, and the .pdf savefig version is fine there too. So I think this might be an issue with the PDF writer on the RSP.

All I can suggest is that you compare matplotlib and cairo versions. Running conda list in an RSP terminal will list all the versions.

If you switch from the recommended container to the current weekly, you will get a fully updated conda environment (with Python 3.13).