Trouble installing DMStack w.2016.20 at NERSC

I’m trying to install both lsst_apps and lsst_sims at NERSC on Cori for Twinkles using lsstsw. When executing “rebuild -r w.2016.20 lsst_apps”, the build ultimately fails on mariadbclient. The traceback is below. I hit this problem a few weeks ago as well, when trying a build using newinstall.sh with an earlier version of the code. I heard that others may have had trouble with mariadb as well? Any suggestions for how I might get around this? I have contacted the helpdesk at NERSC; thus far they’ve asked about our use of mariadb and whether it is truly client-only or perhaps also provides a mariadb server. I suspect it’s a client library, given the name :slight_smile:

 mariadbclient: 10.1.11.lsst2 .....................................ERROR (101 sec).                                                                           
*** error building product mariadbclient.                                         
*** exit code = 1
*** log is in /project/projectdirs/lsst/lsstDM/Cori/temp/lsstsw/build/mariadbclient/_build.log
*** last few lines:
:::::  [2016-05-14T04:40:25.506485Z] -- Looking for sd_notify
:::::  [2016-05-14T04:40:25.796143Z] -- Looking for sd_notify - not found
:::::  [2016-05-14T04:40:25.796790Z] -- Looking for sd_notifyf
:::::  [2016-05-14T04:40:26.020048Z] -- Looking for sd_notifyf - not found
:::::  [2016-05-14T04:40:26.020596Z] -- Systemd features not enabled
:::::  [2016-05-14T04:40:26.023387Z] -- Performing Test HAVE_C__Wvla
:::::  [2016-05-14T04:40:26.382371Z] -- Performing Test HAVE_C__Wvla - Success
:::::  [2016-05-14T04:40:29.078157Z] -- Configuring incomplete, errors occurred!
:::::  [2016-05-14T04:40:29.078229Z] See also "/project/projectdirs/lsst/lsstDM/Cori/temp/lsstsw/build/mariadbclient/CMakeFiles/CMakeOutput.log".
:::::  [2016-05-14T04:40:29.078274Z] See also "/project/projectdirs/lsst/lsstDM/Cori/temp/lsstsw/build/mariadbclient/CMakeFiles/CMakeError.log".

Could you please post the full log, from /project/projectdirs/lsst/lsstDM/Cori/temp/lsstsw/build/mariadbclient/_build.log?

Do you have cmake installed?

My understanding is that the mariadbclient build fails on NERSC Cori because its SLES 11 operating system has glibc 2.11, whereas glibc 2.14 is required. I do not know whether this can be solved directly.
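A quick way to check the glibc claim on a given node is sketched below. It assumes a glibc-based system where `getconf GNU_LIBC_VERSION` is available and a GNU `sort` that supports `-V`; the 2.14 threshold is simply the figure quoted above, not something verified independently. The helper names `version_ge` and `glibc_version` are made up for this example.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed if dotted version A >= B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the glibc version, e.g. "2.11".
# getconf prints something like "glibc 2.11" on glibc systems.
glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

required="2.14"   # figure quoted in this thread, not independently verified
found="$(glibc_version || true)"

if [ -n "$found" ] && version_ge "$found" "$required"; then
    echo "glibc $found satisfies >= $required"
else
    echo "glibc ${found:-unknown} is older than $required (or could not be detected)"
fi
```

Running this on a login node and on a compute node separately may be worthwhile, since the two are not guaranteed to have the same system libraries.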

On the login nodes of NERSC Cori /usr/bin/cmake is present (cmake version 2.6-patch 2).

An interesting possibility unique to NERSC Cori is to use a Docker image built elsewhere, download it to Cori, and run it with Shifter through the Slurm scheduler. If this is of interest, I may have a Docker image that could be used for initial tests, and SQRE would in time have the official image to make this a sustainable approach.

I purposely loaded the cmake module (it might have been available by default)… but given that I loaded it, as well as gcc 4.9.3, my PATH is pointing to:
heatherk@cori08:/project/projectdirs/lsst/lsstDM/Cori/temp/lsstsw> which cmake
/usr/common/software/cmake/3.3.2/bin/cmake

A docker image would likely be very helpful, now and in the future. Is there a recipe we could utilize to make one using this particular weekly w.2016.20?

Sorry, I should have attached the _build.log at the start, here it is:
_build.log (59.6 KB)

The errors I see are:

[2016-05-14T04:39:17.663300Z] CMake Error at cmake/os/Linux.cmake:29 (STRING):
[2016-05-14T04:39:17.663327Z]   string sub-command REPLACE requires at least four arguments.
[2016-05-14T04:39:17.663339Z] Call Stack (most recent call first):
[2016-05-14T04:39:17.663349Z]   CMakeLists.txt:120 (INCLUDE)
[2016-05-14T04:39:17.663351Z] 
[2016-05-14T04:39:17.663354Z] 
[2016-05-14T04:39:17.663383Z] CMake Error at cmake/os/Linux.cmake:29 (STRING):
[2016-05-14T04:39:17.663400Z]   string sub-command REPLACE requires at least four arguments.

Are you perhaps using an old or new version of CMake?

Do you think cmake/3.3.2 is too new? Is there a recommended version? I haven’t noticed any specific version mentioned here:
https://confluence.lsstcorp.org/display/LSWUG/OSes+and+Prerequisites

I’ve built the stack on Linux/CentOS7 with cmake=2.8.11 + gcc=4.8.5 , and on OSX/El Capitan with cmake=3.5.1 and clang.
From that, I’d have thought that your version was fine, but maybe someone else has more info.

If we update the lsst_sims and lsst_apps conda release to use w.2016.20 (our last release was w.2016.15), do you know if that would solve the problem or would you still have problems loading/running the binaries? We’re due to update the sims release shortly anyway, but wouldn’t have particular reason to update the afw release tag otherwise.

We’ve successfully used the conda binaries at NERSC, as recently as last month, so that would likely solve the issue. We’re hoping to do a test run of Twinkles at NERSC using w.2016.20 this month/early June.
At some point, we would like to demonstrate an ability to build from source, or use a docker image or some mechanism to introduce a patched version of the stack from source without having to burden the DM team with providing binaries for our use at NERSC. We can certainly build on SLAC’s public RHEL6 machines without a problem. Earlier this week we were hoping we could take those SLAC binaries over to NERSC, but that unfortunately did not work out… that code wasn’t precisely w.2016.20.

I’m using cmake 3.5.2 so it may be that your cmake is too old. I don’t think we’ve tried ancient versions.

@josh what versions of cmake are on the Jenkins machines?

The failure Paul identifies above is actually occurring in the following code:

# Fix CMake (< 2.8) flags. -rdynamic exports too many symbols.
FOREACH(LANG C CXX)
  STRING(REPLACE "-rdynamic" "" 
  CMAKE_SHARED_LIBRARY_LINK_${LANG}_FLAGS
  ${CMAKE_SHARED_LIBRARY_LINK_${LANG}_FLAGS}  
  )
ENDFOREACH()

I’d suggest that a special code path for CMake < 2.8 which requires versions > 3.3.2 would be of questionable usefulness.

The declared requirement in the MariaDB build scripts is:

CMAKE_MINIMUM_REQUIRED(VERSION 2.6)

Based on squinting at the code, I assume the problem is that ${CMAKE_SHARED_LIBRARY_LINK_C_FLAGS} and/or ${CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS} is empty. I’m not immediately clear why that should be, though.
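The suspected mechanism (an unquoted variable that expands to nothing simply drops out of the argument list) has a close analogue in shell, which may make it easier to see why the argument count comes up short. This is only an illustration in shell, not CMake itself; `count_args` and `flags` are hypothetical names standing in for the STRING(REPLACE ...) call and the empty CMAKE_SHARED_LIBRARY_LINK_C_FLAGS.

```shell
#!/usr/bin/env bash
# count_args: print how many arguments it received.
count_args() { echo "$#"; }

flags=""   # stand-in for an empty CMAKE_SHARED_LIBRARY_LINK_C_FLAGS

# Unquoted: the empty expansion disappears, so only three arguments arrive.
count_args "-rdynamic" "" OUT_VAR $flags      # prints 3

# Quoted: the empty string survives as a fourth argument.
count_args "-rdynamic" "" OUT_VAR "$flags"    # prints 4
```

If this is what is happening, quoting the last argument in Linux.cmake as "${CMAKE_SHARED_LIBRARY_LINK_${LANG}_FLAGS}" would presumably avoid the error, although that has not been tested here and would not explain why the variable is empty in the first place.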


OK, I am now seeing a successful build of mariadbclient 10.1.11.lsst2 on NERSC Cori, using a cmake 2.8.12.2 installed in user space, when the environment setting
export CC=gcc
is in place. Hopefully this is a reproducible and comprehensible result.
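For anyone wanting to reproduce this, the environment can be sketched roughly as follows. The install location in `USER_CMAKE_DIR` is a hypothetical path chosen for the example (the thread does not say where the user-space cmake lives), and `show_build_env` is just a helper to confirm what the build will pick up.

```shell
#!/usr/bin/env bash
# Sketch of the environment reported to give a working mariadbclient build.
# USER_CMAKE_DIR is a hypothetical install location, not taken from the thread.
USER_CMAKE_DIR="${USER_CMAKE_DIR:-$HOME/sw/cmake-2.8.12.2}"

export PATH="$USER_CMAKE_DIR/bin:$PATH"   # put the user-space cmake first
export CC=gcc                             # the setting reported to matter

# Report which compiler setting and cmake the build would pick up.
show_build_env() {
    echo "CC=$CC"
    echo "cmake=$(command -v cmake || echo 'not found')"
}
show_build_env

# Then, from the lsstsw checkout with its environment set up:
#   rebuild -r w.2016.20 lsst_apps
```

It is not established here which of the two changes (the cmake version or CC=gcc) is the essential one, so keeping both seems the safer choice until someone tests them separately.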

I’ll give that a try and report back. I was just noting that SLAC seems to have cmake 2.8.12.2 installed. Thank you!

Frossie has informed me that the SQRE docker images are updated to w_2016_20, and I have downloaded an image with lsst_apps and lsst_distrib to NERSC Cori:

module load shifter
shifterimg images | grep lsstsqre

cori docker READY 732106219a 2016-05-17T08:48:10 lsstsqre/centos:7-stack-lsst_distrib-w_2016_20

If Sims had an interest in trying to use Shifter to run off SQRE docker images on NERSC Cori, they could write a comment to the Jira issue DM-6119 (e.g., add lsst_sims to the docker image?).
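As a rough idea of what a first test with this image might look like, the sketch below prints a minimal Slurm batch script that runs one command inside the container. It assumes the site's Slurm accepts the Shifter `--image` option in `#SBATCH` lines; the node count and time limit are placeholders, and `make_batch_script` is a hypothetical helper. The command run inside the image only prints the OS release, so it says nothing yet about whether the stack itself works there.

```shell
#!/usr/bin/env bash
# Image tag as listed by `shifterimg images` above.
IMAGE="docker:lsstsqre/centos:7-stack-lsst_distrib-w_2016_20"

# Print a minimal Slurm batch script that runs one command inside the image.
# Node count and time limit are placeholders, not site recommendations.
make_batch_script() {
    cat <<EOF
#!/bin/bash
#SBATCH --image=$1
#SBATCH --nodes=1
#SBATCH --time=00:10:00
srun shifter cat /etc/os-release
EOF
}

make_batch_script "$IMAGE"
```

The output could be redirected to a file and submitted with sbatch; any partition or account options required on Cori would need to be added by hand.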

Moving to cmake 2.8.12.2 and setting “export CC=gcc” fixed the issue :slight_smile: I have a successful build on Cori and will test it out. Thank you very much @daues.

The docker image also sounds very interesting. For Twinkles we would need lsst_sims. @tony_johnson has been looking at using Docker at NERSC. He might want to pursue this further.