15.0 in a Jupyter Notebook Docker image, and a 'pycosat' install error

I’m building a Jupyter Notebook with the LSST software added, and my Dockerfile reads:

FROM jupyter/minimal-notebook:latest
USER root
RUN apt-get update && \
    apt-get upgrade -yq --no-install-recommends \
    wget \
    bzip2 \
    ca-certificates \
    sudo \
    locales \
    fonts-liberation \
    bison \
    cmake \
    curl \
    flex \
    g++ \
    gettext \
    git \
    less \
    libbz2-dev \
    libcurl4-openssl-dev \
    libfontconfig1 \
    libglib2.0-dev \
    libncurses5-dev \
    libreadline6-dev \
    libx11-dev \
    libxrender1 \
    libxt-dev \
    m4 \
    make \
    default-jre \
    perl-modules \
    rsync \
    zlib1g-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN mkdir /opt/lsst && chown $NB_USER /opt/lsst
USER $NB_USER
    
# python3 updates and additional files
RUN conda install --quiet --yes \
    'conda-build' \
    'pandas=0.21*' \
    'matplotlib=2.1.0' \
    'scipy=1.0.0' \
    'seaborn=0.8.1' \
    'scikit-learn=0.19.1' \
    'scikit-image=0.13.0' \
    'sympy=1.1.1' \
    'cython=0.27.3' \
    'cloudpickle=0.5.2' \
    'numba=0.35.0' \
    'numpy=1.13.3' \
    'ffmpeg=3.4' \
    'bokeh=0.12.10' \
    'sqlalchemy=1.1.13' \
    'hdf5=1.10.1' \
    'h5py=2.7.0' \
    'beautifulsoup4=4.6.0' \
    'quantecon=0.3.7' \
    'mpld3=0.3' \
    'numexpr=2.6.4' \
    'pycosat=0.6.3' \
    'statsmodels=0.8.0' \
    'nltk=3.2.5' \
    'spectral=0.18' \
    'folium=0.5.0' \
    'tensorflow=1.3.0' \
    'ipywidgets=7.1' \
    'xlrd=1.1.0' && \
    conda clean -tipsy
# For some reason, I need to install these AFTER installing the stuff above
RUN conda install --quiet --yes \
    'vega' \
    'gdal=2.1.4' \
    'geopandas=0.3.0' \
    'basemap=1.1.0' \
    'basemap-data-hires=1.1.0' \
    'netcdf4=1.3.1' \
    'windrose=1.6.3' \
    'rasterstats=0.12.0' \
    'jupyterhub=0.7.2' \
&& conda remove --quiet --yes --force qt pyqt && \
conda clean -tipsy

# Now install
WORKDIR /opt/lsst
# Configure environment
ENV LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    LANGUAGE=en_US.UTF-8

RUN curl -OL https://raw.githubusercontent.com/lsst/lsst/master/scripts/newinstall.sh && \
    chmod +x newinstall.sh && \
    bash ./newinstall.sh -b -P /opt/conda/bin/python && \
    bash -c "source /opt/lsst/loadLSST.bash ; eups distrib install lsst_distrib; eups distrib install sims_maf -t sims"
     
WORKDIR /home/$NB_USER

(The Jupyter notebook image is based on Ubuntu 16.04.)

The conda install definitely adds ‘pycosat’:

 pycosat-0.6.3              |           py36_0         201 KB  conda-forge

… which is an upgrade:

pycosat:                       0.6.3-py36h0a5515d_0          defaults    --> 0.6.3-py36_0     conda-forge

… and the old version is removed:

removing pycosat-0.6.3-py36h0a5515d_0

HOWEVER!

When it comes to the LSST install, it wants to reinstall pycosat-0.6.3-py36h0a5515d_0… and then barfs:

UnsatisfiableError: The following specifications were found to be in conflict:
  - conda -> pycosat[version='>=0.6.3']
  - pycosat==0.6.2=py36_0
Use "conda info <package>" to see the dependencies for each package.

If I drop back to the 14.0 version… I’m fine.

Any hints?

We don’t need pycosat as such. It’s pulled in by another package that we do need. We force explicit versions when building our system standalone in the test environment so that we know exactly what we are building against. The packages we actually require are: numpy scipy matplotlib requests cython sqlalchemy astropy pandas numexpr bottleneck h5py pyyaml (and nomkl on linux). This is what you get if you set up a conda LSST environment using the -b flag for bin/deploy. We do build docker images already if you want to base your system on our image. We also have JupyterLab deployments under test.
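As a rough sketch of what pre-installing that list might look like in a shell step, the snippet below builds the conda command from the package names given above. The variable `LSST_CONDA_PKGS`, the helper `install_lsst_python_deps` and the `DRY_RUN` switch are hypothetical names made up for illustration; the `--quiet --yes` flags are copied from the Dockerfile earlier in the thread, and no versions are pinned, which is an assumption rather than what newinstall.sh does.

```shell
#!/bin/bash
# Package names as listed in the reply above; no versions are pinned in this sketch.
LSST_CONDA_PKGS="numpy scipy matplotlib requests cython sqlalchemy astropy pandas numexpr bottleneck h5py pyyaml"

# nomkl is added on Linux only, as the reply notes.
if [ "$(uname -s)" = "Linux" ]; then
    LSST_CONDA_PKGS="$LSST_CONDA_PKGS nomkl"
fi

# Hypothetical helper: with DRY_RUN=1 (the default here) it only prints the
# command, so the package list can be checked before conda changes anything.
install_lsst_python_deps() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "conda install --quiet --yes $LSST_CONDA_PKGS"
    else
        # Unquoted on purpose: the list must split into separate arguments.
        conda install --quiet --yes $LSST_CONDA_PKGS
    fi
}

install_lsst_python_deps
```

Running it with `DRY_RUN=0` would perform the install for real; whether the resulting versions are compatible with a given LSST release is not something this sketch checks.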

Thanks Tim.

Our Notebooks service is not moving to JupyterLab for a while, and now probably not until the 2019/20 academic year (it is an academic teaching environment).

The challenge with jupyter-anything is that one needs to be able to launch a notebook-server (from JupyterHub) which starts a pre-configured web-GUI… and not start a terminal, do some config, and then start a notebook-server.

I’ve tried all sorts of things, and it’s just not playing ball… is there an assumption that the underlying OS will be Red Hat? The Jupyter notebook images are built on top of Ubuntu.

I’ve tried pre-loading the libraries Tim mentioned, I’ve tried downgrading the pycosat library, and I’ve tried forking the Jupyter notebook stack and building on this week’s new version of Ubuntu… none of them work.

My next attempt will be to build the whole Notebook stack on top of the LSST Docker image (but that seems messy to me).

Um…

Have you got an account on lsst-dev?

If so, maybe you can try https://lsst-lspdev.ncsa.illinois.edu/nb to get to our JupyterLab environment, see what’s missing in the provided LSST Stack Python environment, and then open a terminal and pip install --user the things you need?

Implementing Jupyter on top of the stack is nontrivial.

At the least, doing it this way would tell you what you actually need in addition to what’s in the stack.

We make images available with the stack already on - you might want to start with one of those? https://pipelines.lsst.io/install/docker.html

@frossie - you do, indeed, provide up-to-date Docker images with the full LSST stack on them… however, to start a notebook server you need to start the container, run some commands, and then start the server, which makes it hard to launch from JupyterHub.

You might want to look at https://github.com/lsst-sqre/jupyterlabdemo/tree/master/jupyterlab for inspiration. There’s no reason you couldn’t start with something like that but set it up to fire up a classic Notebook server rather than the Lab (admittedly, you may not need the user provisioning fanciness–I don’t know what your persistent storage model looks like).

Fundamentally it’s a matter of adding the necessary packages on top of the stack container and then arranging for your entrypoint to do whatever user setup you need before starting the server that the Hub connects to.
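A minimal sketch of such an entrypoint, under several assumptions, might look like the following. The function name `start_notebook_server` is hypothetical; the default path `/opt/lsst/loadLSST.bash` is taken from the Dockerfile at the top of this thread and will likely differ inside the published stack images, so it is left overridable; and it assumes `jupyterhub-singleuser` (the command JupyterHub normally spawns) has already been installed on top of the stack.

```shell
#!/bin/bash
# Location of the stack's environment script. ASSUMPTION: this default comes
# from the Dockerfile earlier in the thread; override it for other images.
LSST_LOAD_SCRIPT="${LSST_LOAD_SCRIPT:-/opt/lsst/loadLSST.bash}"

# Hypothetical helper: load the LSST environment, then hand over to the
# single-user notebook server that the Hub connects to.
start_notebook_server() {
    if [ ! -f "$LSST_LOAD_SCRIPT" ]; then
        echo "LSST environment script not found: $LSST_LOAD_SCRIPT" >&2
        return 1
    fi

    # Put conda and eups on the PATH and define the eups 'setup' command.
    source "$LSST_LOAD_SCRIPT" || return 1

    # Activate the installed stack (assumes lsst_distrib was installed).
    setup lsst_distrib || return 1

    # Any per-user provisioning would go here, before the server starts.

    # Replace this shell with the server, passing through the Hub's arguments.
    exec jupyterhub-singleuser "$@"
}
```

A real entrypoint would end with a line such as `start_notebook_server "$@"`, and the image would point its entrypoint (or the spawner's command) at the saved script, so no manual terminal step is needed between container start and server start.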

@adam - I’ve seen this before (plus https://lsst-uk.atlassian.net/wiki/spaces/HOME/pages/141328388/LSST+Simulation+Docker+Image and Installing lsst_sims failed inside docker)

I’ll have another look (and yes, I could add all the notebook server stuff on top of the LSST Docker image, rather than add the LSST stack on top of a notebook Docker image).

(edit… also found this: https://hub.docker.com/r/oboberg/maf/)

The problem is that newinstall.sh has no option to install cutting edge versions of packages. It’s been written solely to force specific versions. This is not true for the lsstsw build system we use to build from git which has an option to use whatever are the current versions. I think you may be able to put in a dot file in your system to con newinstall.sh into thinking that it’s done the conda installs already, although I can’t remember what the file is. @josh ?

@timj newinstall.sh does not use lockfiles.

Ok, so doing the newinstall and then forcing updates of all the packages should work fine.

Can someone explain the eups distrib install bit - is it installing python packages, or lsst libraries?

I’ve got past the pycosat problem from newinstall.sh and am now butting heads with doxygen:

  • if I try & install 15_0, it wants to install 200+ items, of which doxygen 1.8.5.lsst1 is one - which fails with an ICONV error
  • if I try & install 14_0, it wants to install 100+ items, of which doxygen 1.8.5.lsst1 is one - which installs just fine.

(I’m building on a Jupyter Notebook base, which is Ubuntu 16.04 [xenial]. I’ve also tried rebuilding their notebooks based on Ubuntu 18.04 [bionic]… and both fail.)