Link time outliers on the pybind11 branch

I’ve discovered that the pybind11 branch of afw builds extremely slowly on my machine (Arch Linux, Core i7-5600U @ 2.6GHz, gcc 6.3): the total time to build just the afw Python libraries (with the main C++ library already built) is about 60 minutes with -j4, and most of that goes to linking. That’s true whether I use the default GNU linker or gold (though it’s possible that setting -fuse-ld=gold isn’t doing what I think it’s doing). I’m not running out of memory: usage (~6GB) is nowhere near my total (16GB), and I’m not seeing any swap usage.
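For what it’s worth, the easiest way I know to check whether -fuse-ld=gold is actually taking effect is to have the link step identify itself; gold can also report its own resource usage. A minimal sketch (the check.c file name is just illustrative):

```
# Ask the linker to identify itself: -Wl,--version makes it print
# its version and exit instead of performing the link.
echo 'int main(void){return 0;}' > check.c
gcc -fuse-ld=gold -Wl,--version check.c | head -n1
# "GNU gold ..." means gold is in use; "GNU ld ..." means it is not.

# If gold is in use, --stats prints the linker's own time and memory
# usage, which helps confirm that linking really is the bottleneck.
gcc -fuse-ld=gold -Wl,--stats check.c -o check
```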

But this does not happen with clang (3.9, on the same system): building the Python libraries takes only 10 minutes, and clang on OS X was about the same. On another Linux system (Ubuntu 14.04, Core i5-2500 @ 3.3GHz), gcc 4.8 builds the Python libraries in only 5 minutes.

So either I’ve configured my system badly in some way, or there’s been a massive regression in gcc linking performance between 4.8 and 6.3 that (as far as I can tell) the internet is unaware of.

Given that most development happens on machines with clang and most production and CI work happens with gcc 4.x, I’m not completely terrified that this will cause a lot of disruption when we all switch to pybind11. But I’d really like to narrow down the problem cases, and of course ideally we’d find a way to make the affected systems faster.

I’d love to have some help timing `lsstsw rebuild -r tickets/DM-8467 afw` on other systems. Would it be possible to spin up some one-off VMs with different OS/compiler combinations for this, via Nebula or something? Does anyone have a regular development box that differs interestingly from what I’ve tried, and the ability to run this test without spending too much human time on it?
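If anyone does try this, something like the following should produce a comparable number (a minimal sketch, assuming a working lsstsw install; only the rebuild command itself comes from above, the rest just records the environment):

```
# Record the environment so timings can be compared across systems.
uname -a
gcc --version | head -n1

# Time the full rebuild of afw on the pybind11 ticket branch.
time lsstsw rebuild -r tickets/DM-8467 afw
```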

And please watch DM-9121 if you’re interested in following this further.

I have a fast Ubuntu machine with gcc 5.4.0, if it would help. I can give you an account.