Implicit threading intervention

Following RFC-142, I have just merged DM-4714, which intervenes when the user is about to get into trouble because of implicit threading.

Some of our low-level math packages (OpenBLAS and MKL) support threading, and by default use as many threads as there are CPUs on the machine. When parallelising at a higher level, as we generally do with processCcd.py and the -j flag, this can result in a net decrease in processing speed, because every process starts its own full set of threads and they all contend for the same CPUs. We think that the decision to use threading should be explicit rather than implicit; hence DM-4714.
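To put rough numbers on the contention, here is a small sketch with made-up values. It assumes each worker process starts one thread per CPU, as described above; the function name is hypothetical and not part of any LSST package.

```python
def total_threads(num_processes, threads_per_process):
    """Threads competing for the CPUs when each process runs its own thread pool."""
    return num_processes * threads_per_process


cpus = 16  # illustrative machine size, not a measurement

# -j 16 with the math library defaulting to one thread per CPU
implicit = total_threads(num_processes=16, threads_per_process=cpus)

# -j 16 with the math library limited to one thread per process
explicit = total_threads(num_processes=16, threads_per_process=1)

print(implicit, explicit)  # 256 16
```

On a 16-CPU machine the implicit case asks the scheduler to juggle 256 threads, which is the situation the intervention is meant to avoid.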

Now when you use the -j flag, we check to see if you’re about to get in trouble and intervene. If you haven’t explicitly told your math package to use a certain number of threads through an environment variable, then you will see a warning (on stderr) telling you that we have disabled threading:

WARNING: You are using OpenBLAS with multiple threads (16), but have not
specified the number of threads using one of the OpenBLAS environment variables:
OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS, OMP_NUM_THREADS.
This may indicate that you are unintentionally using multiple threads, which may
cause problems. WE HAVE THEREFORE DISABLED OpenBLAS THREADING. If you know
what you are doing and want threads enabled implicitly, set the environment
variable LSST_ALLOW_IMPLICIT_THREADS.

To avoid the warning, you can do one of two things:

  1. Set one of the environment variables controlling the number of threads to use:
     • OpenBLAS: OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS, OMP_NUM_THREADS
     • MKL: MKL_NUM_THREADS, MKL_DOMAIN_NUM_THREADS, OMP_NUM_THREADS
     • I suggest using OMP_NUM_THREADS, since it’s recognised by both.
  2. Set the environment variable LSST_ALLOW_IMPLICIT_THREADS, which says “I know what I’m doing and I don’t want you fiddling with anything”.
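If you launch the processing from a Python wrapper, the first option could look something like the sketch below. The helper name is hypothetical; only the variable name OMP_NUM_THREADS comes from the list above. It builds an environment for a child process rather than touching the current one.

```python
import os


def explicit_thread_env(num_threads=1, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set explicitly."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment values are always strings.
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


# A tiny stand-in environment keeps the example independent of the real one.
env = explicit_thread_env(1, base_env={"PATH": "/usr/bin"})
print(env["OMP_NUM_THREADS"])  # 1
```

The result can be passed as the env argument of subprocess.run when starting the command. Setting the variable in the shell before launching works just as well; in either case it generally needs to be in place before the math library is loaded.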

The function that does the intervention is lsst.base.disableImplicitThreading. Presently, it’s only called before running a CmdLineTask when we’re going to use multiprocessing (i.e., the -j command-line flag), but there’s no reason why you can’t call it in other circumstances if you find you’re getting hurt by threads. There has been a suggestion that we should always call it by default; while I think that’s a good idea (because you can still get in trouble by calling multiple CmdLineTasks in parallel), it hasn’t been formally proposed and agreed upon yet.
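The check described in this post can be modelled roughly as follows. This is a sketch of the behaviour as described here, not the actual code of lsst.base.disableImplicitThreading; the function name is hypothetical, and it assumes a variable counts as "set" whenever it is present in the environment, whatever its value.

```python
OPENBLAS_VARS = ("OPENBLAS_NUM_THREADS", "GOTO_NUM_THREADS", "OMP_NUM_THREADS")
MKL_VARS = ("MKL_NUM_THREADS", "MKL_DOMAIN_NUM_THREADS", "OMP_NUM_THREADS")


def would_intervene(env, num_threads, thread_vars):
    """Model of the check described in the post; not the lsst.base implementation."""
    if "LSST_ALLOW_IMPLICIT_THREADS" in env:
        return False  # the user has opted out of the intervention
    if any(var in env for var in thread_vars):
        return False  # the thread count was chosen explicitly
    return num_threads > 1  # multiple threads nobody asked for


print(would_intervene({}, 16, OPENBLAS_VARS))  # True
print(would_intervene({"OMP_NUM_THREADS": "16"}, 16, OPENBLAS_VARS))  # False
print(would_intervene({"LSST_ALLOW_IMPLICIT_THREADS": "1"}, 16, MKL_VARS))  # False
```

Note that asking for 16 threads explicitly is left alone: the intervention is about implicit threading, not about threading as such.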

There are also some functions to get and set the number of threads in lsst.base (getNumThreads and setNumThreads) in case you want them.
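One way these two functions might be used is to limit threading around a single block of code and then restore the previous value. The sketch below takes the getter and setter as arguments so that it runs without the LSST stack; it assumes getNumThreads takes no arguments and returns an integer, and that setNumThreads accepts an integer, which should be checked against lsst.base before relying on it. The name thread_limit and the stand-in functions are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def thread_limit(num_threads, get_threads, set_threads):
    """Temporarily set the thread count, restoring the previous value on exit."""
    previous = get_threads()
    set_threads(num_threads)
    try:
        yield
    finally:
        # Restore even if the body raised an exception.
        set_threads(previous)


# Stand-ins for lsst.base.getNumThreads / setNumThreads so the sketch runs anywhere.
state = {"threads": 16}


def fake_get():
    return state["threads"]


def fake_set(n):
    state["threads"] = n


with thread_limit(1, fake_get, fake_set):
    inside = fake_get()
after = fake_get()
print(inside, after)  # 1 16
```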

Please let me know if this new feature causes any problems.


Just a random thought: TMV and Eigen both have the option of using multiple threads as well. I’m fairly certain Eigen doesn’t use multiple threads unless we explicitly ask it to in the code, but I’m not sure about TMV; that may depend on some compilation flags. Does anyone know if our current build will cause it to pay attention to this intervention (and the environment variables) in the desired way?

Looking at an old version of TMV I had lying around on my laptop, it looks for OpenMP and, by default, uses it if found. In that sense, it should behave like MKL, but it wasn’t on my radar when I wrote this.

Update: the intervention is now applied in the pipe_drivers scripts as well as the pipe_tasks scripts.