Using lsstsw to rebuild relocated DMstack

For Twinkles (a DESC project) we have an installation of DMstack at SLAC. Unfortunately this version does not correspond to any tag. The code installed was the master at that particular time (a couple of weeks back). We wish to rebuild this version of the source elsewhere and I was hoping to do that using lsstsw.
Here is what I have done so far:

  • Copied the installation which was originally installed using lsstsw. This area has what appears to be a typical directory structure for a lsstsw install.
  • Removed the miniconda subdirectory (since Anaconda is not relocatable)
  • Reinstalled miniconda and did: conda install --file etc/conda_packages-linux-64.txt
  • Removed the contents of share/ups_db and share/Linux64
  • sourced bin/

I was hoping to do: rebuild lsst_apps but it appears the system is trying to download from git, which is precisely what I do not want to do… and for packages like afw, it doesn’t appear anything was rebuilt… well… it happened so fast that I’m dubious. Here is the traceback:

python_psutil: Traceback (most recent call last):
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/bin/lsst-build", line 51, in <module>
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 758, in run
    manifest = p.construct(args.products)
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 721, in construct
    self._add_product_tree(products, name)
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 709, in _add_product_tree
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 709, in _add_product_tree
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 709, in _add_product_tree
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 709, in _add_product_tree
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 709, in _add_product_tree
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 692, in _add_product_tree
    ref, sha1 = self.product_fetcher.fetch(productName)
  File "/afs/slac/g/glast/users/heather/Twinkles/lsstsw/lsst_build/python/lsst/ci/", line 305, in fetch
    raise Exception("Failed to clone product '%s' from any of the offered repositories" % product)

I’ll freely admit I haven’t used lsstsw or lsst-build thus far - so I’m looking to be educated. I’ve poked around here: but I believe what I’m trying to do is non-typical :slight_smile:


Is there a reason you are not just creating a new lsstsw build system in the new location? Run bin/deploy to set things up first rather than trying to do manual fix ups. By definition lsstsw uses git clones so I’m not entirely sure why you don’t want to do that. Do you really want to use lsstsw or should you be using eups distrib install?

I may not be explaining the situation very well @timj - what was installed at SLAC was a snapshot of the master at that particular time with some added patches. I cannot now go back and do a git clone from the repository, because master has since moved. (and to be clear - I’m just walking into this situation and didn’t create it!)
The hope is to rebuild an exact copy of that installed build elsewhere on another machine (in this case at NERSC)… I believe my only option is to copy the existing lsstsw install, so I can get the source, and rebuild it. This would have been much easier if there was a tag or a weekly build to point to, but unfortunately we do not have that luxury in this case.

But I am completely open to the idea that I’m not approaching this the best way… so I would appreciate suggestions.

Now that I’ve reformatted your post I can see that the specific problem is that your local repos.yaml file is out of date because utils now has a new dependency (and lsstsw is trying to build current master not your old master).

thanks for cleaning up my post @timj :slight_smile:
as far as I can see, lsstsw seems more suited for updating and rebuilding… but that’s not what I want to do in this case. Should I be running a lsst-build command directly?
lsst-build build builddir
though I do that, and the system informs me that everything is already installed… perhaps I need to clear out some of the subdirectories in the pre-existing build area…

I don’t think you can relocate a stack, so you have to rebuild. However, not having a git tag to point to really does complicate things. You could make lsstsw build a particular git tag, but otherwise it defaults to building the current master.
Perhaps if you have the commit sha1 of each version of the packages you’re trying to recreate, there is a way to use that?

I think @KSK is helping you in another medium. lsstsw is designed for building the current system (it can build arbitrary git refs such as tags and branches but you only get consistency if that ref exists on all packages).

If you have a build/ directory that was built by an earlier rebuild command you can re-run exactly that tree by removing the installed packages from your stack directory and then running lsst-build manually to bypass the git updating:

./lsst_build/bin/lsst-build build "`pwd`/build"

(you may have to run git clean -fdx in each of the git repos in build/ first to ensure that everything is built from scratch).
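A minimal sketch of that clean-up step, assuming each product sits in its own subdirectory of build/ with a .git directory inside it. The function names list_build_repos and clean_build_repos are made up for illustration and are not part of lsstsw or lsst-build.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of lsstsw): find and clean the git
# checkouts under a build/ directory.

# Print one line per subdirectory of $1 that contains a .git directory.
list_build_repos() {
    local build_dir="$1" repo
    for repo in "$build_dir"/*/; do
        [ -d "${repo}.git" ] && printf '%s\n' "${repo%/}"
    done
    return 0
}

# Run "git clean -fdx" in every checkout found above.
# -f force, -d also remove untracked directories, -x also remove ignored files.
clean_build_repos() {
    local repo
    list_build_repos "$1" | while IFS= read -r repo; do
        echo "cleaning $repo"
        (cd "$repo" && git clean -fdx)
    done
}

# Usage, from the top of the lsstsw directory:
#   clean_build_repos "$(pwd)/build"
#   ./lsst_build/bin/lsst-build build "$(pwd)/build"
```

Note that git clean -fdx deletes every untracked and ignored file, so any uncommitted new files in those checkouts would be lost. Running list_build_repos first shows what would be touched.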

You should be able to tar up a eups repository. What went wrong?

I also think we may be able to relocate except for the miniconda. I’m working with @heather999 to try to relocate the stack we have. If that doesn’t work, we will throw in the towel and rebuild (or conda install, or whatever). At least I think that’s what will happen.

@ljones That might be possible, we do have the manifest.txt file that would contain all the SHAs.
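For what it's worth, here is a rough sketch of how those SHAs could be pulled out and checked out in the existing clones. It assumes that manifest.txt has comment lines starting with '#', a BUILD=... line, and then one line per product beginning with the product name followed by its SHA1. That layout is an assumption and should be checked against the actual file. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of lsstsw).

# Print "<product> <sha1>" for each product line of a manifest file.
# ASSUMPTION: '#' comment lines, a BUILD=... line, then lines whose first
# two whitespace-separated fields are the product name and its SHA1.
manifest_shas() {
    awk '!/^#/ && !/^BUILD=/ && NF >= 2 { print $1, $2 }' "$1"
}

# Check out the recorded commit in each matching clone under build/.
# Products without a clone in build/ are skipped.
checkout_manifest() {
    local manifest="$1" build_dir="$2" name sha
    manifest_shas "$manifest" | while read -r name sha; do
        [ -d "$build_dir/$name/.git" ] || continue
        echo "$name -> $sha"
        (cd "$build_dir/$name" && git checkout --quiet "$sha")
    done
}

# Usage (paths are placeholders):
#   checkout_manifest /path/to/manifest.txt "$(pwd)/build"
```

This only works if the recorded commits exist in the local clones. Commits that carry the local patches would not be available from a fresh clone of the upstream repositories.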

@timj I think doing the “git clean” was the missing piece. @KSK is making a tarball of the existing binaries for us - so we’ll give that a try at NERSC. I still think it’s useful to know how to get out of such a pickle in the future… and it remains to be seen if the binaries relocate nicely to NERSC. I have doubts but I’m happy to be surprised!
All these tools (lsstsw, lsst-build, etc) are very useful, but there are times I would almost be happier knowing what the basic underlying commands are to do a “clean, make, make install” or its effective equivalent.

@RHL I suspect in our haste to get going we weren’t as neat and tidy about things as we could have been. But I’ll also quickly admit I’m a little hazy on your comment - are you just suggesting we should have saved a copy of the source for posterity in this case since we didn’t have a tag?

@KSK as far as what happens if this doesn’t work - beats me - I’m waiting for the Twinkles-gods to provide inspiration! :slight_smile:

I’m suggesting that if you had a working stack you could have copied to the new machine via tar or rsync and it would have worked. Or “should have worked”.

Can we take a build from RH6-64 and use it on a NERSC machine @RHL? It’s not quite the same OS, which is why I was wondering. It’s certainly worth a shot, and that’s what @KSK was suggesting…
We have copied over the binaries to NERSC and run into some trouble with eups’ setup files, they have the original SLAC path hard-coded in them - but perhaps we can rebuild eups and fix it? We’re discussing this on GitHub here:

@heather999 I think there are two things here. @RHL is suggesting that we should be able to pick up and move the stack directory from the tarball I pointed you to. That may be, but we’d need to get miniconda and eups. Maybe the thing to do is:

$> git clone
$> cd lsstsw
$> ./bin/deploy

At this point you’ll have a working miniconda and a working eups for your system (you’ll have to do all the module stuff you normally do), then copy just the stack directory into the root level of the new lsstsw at NERSC. Does that sound doable?
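The copy step could look something like the sketch below. It assumes the old tree (or the unpacked tarball) has a stack/ subdirectory and that the new lsstsw directory already exists after bin/deploy. The function name copy_stack is hypothetical, and tar is used only because it is available almost everywhere; rsync would do the same job.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of lsstsw): copy the stack/ subdirectory of
# one lsstsw tree into another, preserving permissions and symlinks.
copy_stack() {
    local src="$1" dst="$2"
    if [ ! -d "$src/stack" ]; then
        echo "no stack/ directory under $src" >&2
        return 1
    fi
    mkdir -p "$dst"
    # Stream the directory through tar so that links and modes survive.
    tar -C "$src" -cf - stack | tar -C "$dst" -xpf -
}

# Usage (paths are placeholders):
#   copy_stack /path/to/old/lsstsw /path/to/new/lsstsw
```

Files already present under the destination stack/ with the same names are overwritten, so it may be worth keeping a copy of whatever bin/deploy created there.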

I’ll give it a try and report back @KSK - thanks!

Made some good progress… I did all of the above that you suggested @KSK. Just a couple of wrinkles:
in the lsstsw/bin/deploy script, I updated the curl command for eups to include the --cacert option and pointed it to a local copy of the ca-bundle.crt file, otherwise I received errors like:
curl: (77) error setting certificate verify locations:
CAfile: /etc/pki/tls/certs/ca-bundle.crt
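An alternative that avoids editing the deploy script might be the CURL_CA_BUNDLE environment variable, which the curl command-line tool reads as the default for --cacert. A small sketch, where use_ca_bundle is a made-up helper name and the bundle path is a placeholder:

```shell
#!/usr/bin/env bash
# Hypothetical helper: point curl at a local CA bundle for this shell session.
use_ca_bundle() {
    local bundle="$1"
    if [ ! -r "$bundle" ]; then
        echo "CA bundle not readable: $bundle" >&2
        return 1
    fi
    # curl uses this as the default certificate bundle, like --cacert.
    export CURL_CA_BUNDLE="$bundle"
}

# Usage (path is a placeholder):
#   use_ca_bundle "$HOME/certs/ca-bundle.crt"
#   ./bin/deploy
```

Whether this is enough depends on how the script invokes curl; if it passes its own certificate options, those would take precedence.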

Also cloned to run a test. That initially failed due to being unable to locate, I symlinked the existing and to and respectively. Now I hit upon another unknown library: I’m poking around to see if I can figure out if there’s another version available, but if you have any suggestions, I’d appreciate it! :slight_smile:

  File "/project/projectdirs/lsst/lsstDM/Cori/b2016/lsstsw/stack/Linux64/ip_isr/2016_01.0-7-g806b452+13/python/lsst/ip/isr/", line 34, in <module>
    _isrLib = swig_import_helper()
  File "/project/projectdirs/lsst/lsstDM/Cori/b2016/lsstsw/stack/Linux64/ip_isr/2016_01.0-7-g806b452+13/python/lsst/ip/isr/", line 30, in swig_import_helper
    _mod = imp.load_module('_isrLib', fp, pathname, description)
  File "/project/projectdirs/lsst/lsstDM/Cori/b2016/lsstsw/stack/Linux64/base/2.2016.10/python/", line 102, in imp_load_module
    module = orig_imp_load_module(name, *args)
ImportError: cannot open shared object file: No such file or directory

We’ve had a similar problem to this reported back in October which we fixed by disabling unicode in Boost (DM-4006). What version of boost do you have in your stack? You need a version greater than or equal to 1.59.lsst2.

Thanks, @timj. It appears to be boost 1.60

We updated that a month ago so it should have the unicode disabling. I wonder what library is pulling in unicode. Your problem is that you probably have slightly different versions of unicode libraries on each machine. I’d like to know what part of the system is using that library though.
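One way to find out which part of the system links against the missing library would be to run ldd over the shared objects in the stack and report the ones with unresolved dependencies. A sketch under that assumption; the function names are hypothetical and the stack path in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of lsstsw or eups).

# Read ldd output on stdin and print the names of libraries that the
# dynamic linker could not resolve.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# For every shared object under a directory, print the file name followed
# by the libraries it needs but cannot find on this machine.
scan_stack() {
    local root="$1" so missing
    find "$root" -type f -name '*.so*' | while IFS= read -r so; do
        missing=$(ldd "$so" 2>/dev/null | unresolved_libs | sort -u | tr '\n' ' ')
        [ -n "$missing" ] && echo "$so: $missing"
    done
    return 0
}

# Usage (path is a placeholder):
#   scan_stack /path/to/lsstsw/stack/Linux64
```

The output depends on the current LD_LIBRARY_PATH, so it is best run after setting up the stack with eups in the same way as for the failing import.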

This could be the point this all falls apart. @heather999 is trying to relocate a stack from one cluster to another, so we don’t have the luxury of rebuilding.

That library should come as part of libicu. I don’t know if there is a module that can be enabled to provide that.

It may well be there, but with a different version. That’s the problem that triggered our change to boost that came from doing the CernVMFS distro.