I have been trying to install the LSST Science Pipelines, but I cannot build the jointcal pipeline.
I have been running the command:
eups distrib install -t w_2022_44 lsst_distrib
I am trying to install the pipeline on an Ubuntu 22.04 machine, which supports astropy=5.1 and has 4.1.0 rubin-env in the newinstall.sh file.
I already tried the approach described in Failure to build daf_butler - #20 by plah, but I cannot get past the jointcal package. A copy of the latest error message is below; any help would be much appreciated.
***** error: from /home/valerio/SCRIPTS/LSST/lsst_track/stack/miniconda3-py38_4.9.2-4.1.0/EupsBuildDir/Linux64/jointcal-ge7491f621d+a8fda97056/build.log:
Coverage XML written to file tests/.tests/pytest-jointcal.xml-cov-jointcal.xml
============================= slowest 5 durations ==============================
12.52s call tests/test_jointcal_cfht_minimal.py::JointcalTestCFHTMinimal::test_jointcalTask_2_visits_photometry
12.05s call tests/test_jointcal_cfht_minimal.py::JointcalTestCFHTMinimal::test_jointcalTask_2_visits_photometry_magnitude
3.69s call tests/test_photometryModel.py::ConstrainedFluxModelTestCase::test_photoCalibMean
3.30s call tests/test_astrometryTransform.py::AstrometryTransformPolynomialTestCase::testToAstMapOrder9
2.56s call tests/test_photometryModel.py::ConstrainedFluxModelTestCase::test_validate
=========================== short test summary info ============================
FAILED tests/test_star.py::TestProperMotion::test_apply_many - AssertionError…
=========== 1 failed, 168 passed, 39 skipped, 84 warnings in 24.43s ============
Global pytest run: failed with 1
Failed test output:
Global pytest output is in /home/valerio/SCRIPTS/LSST/lsst_track/stack/miniconda3-py38_4.9.2-4.1.0/EupsBuildDir/Linux64/jointcal-ge7491f621d+a8fda97056/jointcal-ge7491f621d+a8fda97056/tests/.tests/pytest-jointcal.xml.failed
The following tests failed:
/home/valerio/SCRIPTS/LSST/lsst_track/stack/miniconda3-py38_4.9.2-4.1.0/EupsBuildDir/Linux64/jointcal-ge7491f621d+a8fda97056/jointcal-ge7491f621d+a8fda97056/tests/.tests/pytest-jointcal.xml.failed
1 tests failed
scons: *** [checkTestStatus] Error 1
scons: building terminated because of errors.
The failure is a test showing a small numeric difference. If you’re in the conda rubin-env version 4.1, there shouldn’t be numeric differences at that level (I hope!).
I could certainly extend the valid range of the test, but I want to know more about why it’s failing on your system instead of any of ours. What kind of hardware are you running on?
Thank you for your message, and sorry for the late reply. Yesterday was a local holiday here in Brazil. Below is information on the machine I have been trying to install the software on. Please let me know if additional data is needed.
Processor INTEL I7-11700F
Memory DDR4 16GB 3200MHZ NETCORE
Motherboard Z590M GAMING X
SSD M2 NVME 500GB CRUCIAL
The problem here is the astropy result being different (2.0561995639654462 versus a USDF rubin-devl result of 2.0561999913228215) while the Science Pipelines result is the same (2.0561999914861264). This looks like it might be another version of [DM-32487] fix compilation for osx-arm64 for jointcal - Jira, although I’m not sure why the results should differ so much given that both systems are Intel and running Astropy 5.1.
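To put the three numbers quoted above on a common scale, the short sketch below computes their relative differences. The values are copied from this thread; the variable names and the `rel_diff` helper are illustrative only and are not part of jointcal or astropy.

```python
# Values quoted in this thread for the same proper-motion calculation.
astropy_local = 2.0561995639654462  # astropy on the failing Ubuntu 22.04 machine
astropy_usdf = 2.0561999913228215   # astropy on USDF rubin-devl
pipelines = 2.0561999914861264      # Science Pipelines result (same on both systems)


def rel_diff(value, reference):
    """Relative difference of value with respect to reference (hypothetical helper)."""
    return abs(value - reference) / abs(reference)


local_vs_pipelines = rel_diff(astropy_local, pipelines)
usdf_vs_pipelines = rel_diff(astropy_usdf, pipelines)

print(f"local astropy vs pipelines: {local_vs_pipelines:.3e}")
print(f"USDF astropy vs pipelines:  {usdf_vs_pipelines:.3e}")
```

On these numbers the local astropy result differs from the Science Pipelines result at roughly the 2e-7 level, while the USDF astropy result agrees to better than 1e-9.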
Looking at the ticket, and booting this back up in my brain, the problem before was in erfa.pmsafe() which is called by astropy. But I have no explanation for the differences.
I think newinstall will not give binaries for non-CentOS/macOS unless forced (via manual setting of EUPS_PKGROOT). lsstinstall should automatically select binaries for anything claiming to be Linux. (It does look like lsstinstall over-aggressively selects macOS binaries for any other x86_64 machine that is not Linux.)
So that means newinstall.sh configured our environment not to use binary packages. Could you please elaborate on how to manually set the EUPS_PKGROOT parameter?
In a previous post, you mention the possibility of extending the valid range of the test for installing jointcal. Could you please elaborate on that? Whatever we tried so far did not work, perhaps it could be worth a shot.
The failed test is in tests/test_star.py. Perhaps increasing the tolerance parameter there could solve the problem?
Is there a way to do that also running the command:
You would have to do an install using lsstsw with a branch of jointcal modified to relax the tolerance on that test.
I’m still concerned about why you’re seeing a difference on similar hardware to what we’re all running; the conda environment should get you an identical compiler/astropy/ERFA version. The only numeric differences we’ve seen have been on the new macOS ARM processors.
Thank you for your latest post. I followed your advice: I modified the jointcal test file test_star.py by relaxing the tolerance in the test method test_apply_many, lines 134-135, from rtol=1e-7 to rtol=1e-6, and it worked.
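For anyone wondering why that one order of magnitude is enough, here is a minimal sketch. The exact assertion used in test_star.py is not shown in this thread, so the check below assumes the common relative-tolerance criterion |actual - desired| <= atol + rtol * |desired| (the one documented for numpy.testing.assert_allclose). The `within_rtol` helper is hypothetical, and the two values are the ones quoted earlier in the thread.

```python
def within_rtol(actual, desired, rtol, atol=0.0):
    """Hypothetical helper: True if |actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# Values quoted earlier in this thread for the failing machine.
pipelines = 2.0561999914861264      # Science Pipelines result
astropy_local = 2.0561995639654462  # astropy result on the failing machine

passes_original = within_rtol(pipelines, astropy_local, rtol=1e-7)
passes_relaxed = within_rtol(pipelines, astropy_local, rtol=1e-6)

print("rtol=1e-7:", "pass" if passes_original else "fail")
print("rtol=1e-6:", "pass" if passes_relaxed else "fail")
```

Under that assumption, the discrepancy of about 2e-7 sits just above the original tolerance and comfortably inside the relaxed one.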
I ran the demo according to the instructions on the lsstsw demo page, and the installation passed without errors.
I understand that this is not an ideal solution, and my colleagues and I are still looking into other possible installation methods, such as using Ubuntu Linux containers on other operating systems. But at least this gives us the opportunity to start familiarizing ourselves with the LSST Science Pipelines.
Also, we hope that this discussion could be useful for other users that may encounter the same problem.