mwv
(Michael Wood-Vasey)
September 14, 2017, 2:16pm
#1
When I use an updated lsstsw to rebuild -u verify, the build fails with flake8 errors.
When I explicitly build a locally checked-out copy, it’s fine.
I’ve read through DM-11822 and DM-11809 and tried to follow the discussion in #dm-square.
What is the proper behavior supposed to be, and what should the local configuration be to ensure this is correct?
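For reference, a minimal sketch of the two workflows being compared; the local-build commands are my assumption about the usual per-package workflow, not quoted from the post:

# Failing workflow: rebuild from an updated lsstsw environment
$ rebuild -u verify

# Working workflow: build a locally checked-out copy directly
$ cd /path/to/verify
$ setup -k -r .
$ scons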
mwv
(Michael Wood-Vasey)
September 14, 2017, 2:18pm
#2
@jsick How do I post an attachment to a community post? I’d like to post the build log and failed.xml.
timj
(Tim Jenness)
September 14, 2017, 2:27pm
#3
A snippet of the end of the .failed file will be fine. I was able to build verify using lsstsw yesterday, so this will be interesting.
mwv
(Michael Wood-Vasey)
September 14, 2017, 2:29pm
#4
============================= test session starts ==============================
platform darwin -- Python 2.7.12, pytest-3.2.0, py-1.4.34, pluggy-0.4.0
rootdir: /Volumes/PS1/lsstsw/build/verify, inifile: setup.cfg
plugins: session2file-0.1.9, xdist-1.19.2.dev0+g459d52e.d20170907, forked-0.3.dev0+g1dd93f6.d20170907, flake8-0.8.1
gw0 I / gw1 I / gw2 I / gw3 I / gw4 I / gw5 I / gw6 I / gw7 I / gw8 I / gw9 I / gw10 I / gw11 I / gw12 I / gw13 I / gw14 I / gw15 I
gw0 [533] / gw1 [533] / gw2 [533] / gw3 [533] / gw4 [533] / gw5 [533] / gw6 [533] / gw7 [533] / gw8 [533] / gw9 [533] / gw10 [533] / gw11 [533] / gw12 [533] / gw13 [533] / gw14 [533] / gw15 [533]
scheduling tests via LoadScheduling
........................................F..FF..FFFF....................................................................................................................F....F.......................F......F................................s.........s.......................F..F...F.F......ss.F.s...s...............F.........................................F.............................................................F....F......................................................................................s.........s....s..ss..s...
generated xml file: /Volumes/PS1/lsstsw/build/verify/tests/.tests/pytest-verify.xml
=================================== FAILURES ===================================
_____________ FLAKE8-check(ignoring E133 E226 E228 N802 N803 N806) _____________
[gw10] darwin -- Python 2.7.12 /Users/wmwv/lsstsw/miniconda/bin/python
build/bdist.macosx-10.6-x86_64/egg/pytest_flake8.py:115: in runtest
???
/Users/wmwv/lsstsw/miniconda/lib/python2.7/site-packages/py/_io/capture.py:150: in call
res = func(*args, **kwargs)
build/bdist.macosx-10.6-x86_64/egg/pytest_flake8.py:187: in check_file
???
build/bdist.macosx-10.6-x86_64/egg/flake8/main/application.py:229: in make_file_checker_manager
???
build/bdist.macosx-10.6-x86_64/egg/flake8/checker.py:89: in __init__
???
/Users/wmwv/lsstsw/miniconda/lib/python2.7/multiprocessing/__init__.py:232: in Pool
return Pool(processes, initializer, initargs, maxtasksperchild)
/Users/wmwv/lsstsw/miniconda/lib/python2.7/multiprocessing/pool.py:159: in __init__
self._repopulate_pool()
/Users/wmwv/lsstsw/miniconda/lib/python2.7/multiprocessing/pool.py:223: in _repopulate_pool
w.start()
/Users/wmwv/lsstsw/miniconda/lib/python2.7/multiprocessing/process.py:130: in start
self._popen = Popen(self)
/Users/wmwv/lsstsw/miniconda/lib/python2.7/multiprocessing/forking.py:121: in __init__
self.pid = os.fork()
E OSError: [Errno 35] Resource temporarily unavailable
mwv
(Michael Wood-Vasey)
September 14, 2017, 2:33pm
#5
E OSError: [Errno 35] Resource temporarily unavailable
seems to be the key line.
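For context (background, not something stated in the thread): errno 35 on macOS/darwin is EAGAIN, which os.fork() raises when a new process cannot be created, typically because a per-user process limit has been hit. You can check the errno name from the same Python that ran the tests:

$ python -c "import errno, os; print(errno.errorcode[35]); print(os.strerror(35))"
EAGAIN
Resource temporarily unavailable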
timj
(Tim Jenness)
September 14, 2017, 2:59pm
#6
Yes. Do you have 16 cores?
jsick
(Jonathan Sick)
September 14, 2017, 3:30pm
#8
You can drag the file into your editing window. If xml isn’t whitelisted (yet), you might want to add a .txt extension.
mwv
(Michael Wood-Vasey)
September 14, 2017, 3:35pm
#9
Thanks. “Upload” was what I was looking for. I understand now that attachments are associated with a comment, not the thread (the opposite of JIRA tickets).
timj
(Tim Jenness)
September 14, 2017, 3:36pm
#10
mwv:
8 real cores.
But I am guessing 16 cores if you query the operating system. That means that scons is doing the “right” thing. When you say your build succeeds with a local checkout, I am guessing that’s because you are doing the build with scons and not scons -j16…
You might have to set EUPSPKG_NJOBS=8 to get the build to work if your machine reports more cores than it can usefully support.
Does it look like it’s running out of processes or RAM?
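If it helps, a quick way to compare the physical core count with what the operating system reports on macOS (example output for an 8-core machine with hyperthreading; an illustration, not output from the thread):

$ sysctl -n hw.physicalcpu
8
$ sysctl -n hw.ncpu
16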
josh
(josh)
September 14, 2017, 3:50pm
#11
It could be a low ulimit for processes or open files. ulimit -a
will list all user limits.
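For a fork() failure like the one above, the limits most likely to matter are max user processes (-u) and open files (-n); they can also be queried individually:

$ ulimit -u      # soft limit on user processes
$ ulimit -Hu     # corresponding hard limit
$ ulimit -n      # soft limit on open files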
timj
(Tim Jenness)
September 14, 2017, 3:56pm
#12
@mwv does pytest -n 16 (if run in the verify dir) also fail for you?
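For anyone trying to reproduce this, a rough sketch of that check (the setup step is my assumption about the usual workflow, and the build path is just an example):

$ cd ~/lsstsw/build/verify     # or wherever your lsstsw checkout lives
$ setup -k -r .
$ pytest -n 16                 # pytest-xdist spawns 16 workers, like the rebuild run above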
mwv
(Michael Wood-Vasey)
September 28, 2017, 6:00pm
#13
Setting
export EUPSPKG_NJOBS=8
succeeds.
I just ran into this again with meas_base this morning.
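For completeness, a sketch of the workaround in context; putting it in a shell profile is a suggestion, not something prescribed above:

# Cap eupspkg parallelism at the physical core count before rebuilding
$ export EUPSPKG_NJOBS=8
$ rebuild -u verify

# or make it persistent, e.g. in ~/.bash_profile:
export EUPSPKG_NJOBS=8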
timj
(Tim Jenness)
September 29, 2017, 3:24pm
#14
I don’t really know what to say. It doesn’t seem like it’s because the tests themselves use lots of resources. I just ran a 16-worker test with verify and didn’t even see a blip in memory usage. What does ulimit -a say for you?
I get:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1418
virtual memory (kbytes, -v) unlimited
mwv
(Michael Wood-Vasey)
October 2, 2017, 12:47pm
#15
[serenity ~] ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 7168
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited
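The visible difference between the two outputs is max user processes (709 here versus 1418 above) and open files (7168 versus 256). If the process limit turns out to be the culprit, one possible mitigation (a sketch, not a conclusion from the thread) is to raise the soft limit in the current shell; it cannot exceed the hard limit, and going beyond that would require changing kern.maxprocperuid as root:

$ ulimit -Hu          # check the hard limit first
$ ulimit -u 1418      # raise the soft limit for this shell, up to the hard limit
$ ulimit -u           # confirm the new value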