Bug#911830: FTBFS on multiple architectures

Stuart Prescott stuart at debian.org
Mon Dec 3 09:01:01 GMT 2018


Hi Yaroslav,

Thanks for the update!

I see that, as well as possible race conditions in the build system, we've
definitely got a race condition in me sending emails -- I missed that you had
uploaded 0.20.1+dfsg-1 :) That probably makes my previous comments somewhat
cryptic.

(To explain: I had been looking at why 0.20.0+dfsg-2 wasn't building. Looking
through all the build-dependencies that had changed since the 0.20.0 upload
made me think that the FTBFS was a regression since your upload, particularly
since I couldn't build it on i386 or amd64 locally either. That's what led to
'everywhere'.)

So... Let's start again knowing that you (and the buildd) can indeed build the 
packages and that it is not a question of changes to build-dependencies.

As I poke the 0.20.1 package some more, it looks like a substantial portion of 
the problem is the parallelisation:

- if I build with -j1, it fails to even try to build any of the Python 3.x 
versions

- if I build with -j2, it fails to even try to build the Python 3.7 version

- if I build with -j4, it mostly succeeds; it also sometimes fails because the
config steps for each build run at the same time and try to touch the same
files, e.g.:

i686-linux-gnu-gcc -pthread _configtest.o -L/usr/lib/i386-linux-gnu -lf77blas 
-lcblas -latlas -o _configtest
/bin/sh: 1: ./_configtest: Text file busy

(the 2.7 build failed in that particular case). 
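
As a rough illustration of the collision (a Python sketch only, not what
d/rules or the numpy config machinery actually does; run_config_step, run_all
and the configtest-<version> directory names are made up for the example): if
each interpreter's config step gets a private scratch directory, the
concurrent steps never write and execute the same _configtest path.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_config_step(version: str, scratch_root: Path) -> str:
    """Hypothetical stand-in for one interpreter's config step.

    Writes a `_configtest` file into a directory private to `version`
    and reads it back, so concurrent steps never share a path.
    """
    workdir = scratch_root / f"configtest-{version}"
    workdir.mkdir()
    probe = workdir / "_configtest"
    probe.write_text(f"built for python{version}\n")
    return probe.read_text().strip()


def run_all(versions):
    """Run the stand-in config step for every version in parallel."""
    with tempfile.TemporaryDirectory() as root:
        with ThreadPoolExecutor(max_workers=len(versions)) as pool:
            results = pool.map(lambda v: run_config_step(v, Path(root)), versions)
            # Consume the results before the pool and directory are torn down.
            return dict(zip(versions, results))


if __name__ == "__main__":
    for version, content in run_all(["2.7", "3.6", "3.7"]).items():
        print(version, "->", content)
```

With a single shared directory instead of one per version, the three steps
would be overwriting each other's _configtest, which is the situation the
"Text file busy" message points at.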

It looks like setting up the parallel builds is pretty racy: under some
circumstances it manages to succeed, but it fails if you happen not to have
enough jobs available or if the jobs happen to finish too quickly.
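
One generic way to take the race out of a step like this, while leaving the
rest of the build parallel, is to serialise only the config step behind a
lock. The following is a minimal Python sketch under that assumption, untested
against the actual packaging; serialized_config_step, peak_concurrency and the
configtest.lock file name are hypothetical, and fcntl is Unix-only.

```python
import fcntl
import tempfile
import threading
import time
from pathlib import Path


def serialized_config_step(version, lock_path, active, peak, guard):
    """Hypothetical config step that holds an exclusive flock while it runs."""
    # Each caller opens the lock file itself, so each has its own open file
    # description and the flock calls really do contend with one another.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no other holder
        try:
            with guard:
                active[0] += 1
                peak[0] = max(peak[0], active[0])
            time.sleep(0.05)  # stand-in for compiling and running _configtest
            with guard:
                active[0] -= 1
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def peak_concurrency(versions):
    """Start one thread per version; return how many steps ever overlapped."""
    active, peak = [0], [0]
    guard = threading.Lock()
    with tempfile.TemporaryDirectory() as root:
        lock_path = Path(root) / "configtest.lock"
        threads = [
            threading.Thread(
                target=serialized_config_step,
                args=(version, lock_path, active, peak, guard),
            )
            for version in versions
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return peak[0]


if __name__ == "__main__":
    print("steps running at once:", peak_concurrency(["2.7", "3.6", "3.7"]))
```

The trade-off is that the locked step runs one interpreter at a time, so this
only helps if the config step is short compared with the rest of the build.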

I'm not really sure where to take this next. Perhaps d/rules is fixable 
easily, or perhaps ripping out a pile of stuff is necessary. Hopefully those 
who understand how it is supposed to work can help here.


> > There's no visible progress on this problem in git -- is there progress
> > elsewhere?
> 
> you could find some traces of the progress which lead to i386 fixes on
> https://github.com/scikit-learn/scikit-learn/issues?utf8=%E2%9C%93&q=is%3Aissue+author%3Ayarikoptic+
> 
> I welcome you to review/analyze failures on other platforms and/or just
> report them upstream.  or I would do it whenever I get a chance

Thanks! I'll see what I can learn, although I think at least some of the 
current problems are actually with the Debian packaging and not upstream.

> s390x (and also arm64) issue is here upstream
> https://github.com/scikit-learn/scikit-learn/issues/10561
> 
> if you care to help, please try to figure out WTF with ppc64el:
> https://buildd.debian.org/status/fetch.php?pkg=scikit-learn&arch=ppc64el&ver=0.20.1%2Bdfsg-1&stamp=1543512601&raw=0
> where it even doesn't build... might be a cython issue

That failure looks the same as what I see on amd64 and i386.

It's excessively difficult to extract the right lines from the parallel build,
but I think the final failure is the same with both -j1 and -j2:

////////////
PYBUILD_INTERPRETERS=python{version} PYBUILD_VERSIONS=3.6 dh build-arch --with python3 --buildsystem pybuild
touch debian/build-stamp-python3.6
PYBUILD_INTERPRETERS=python{version} PYBUILD_VERSIONS=3.7 dh build-arch --with python3 --buildsystem pybuild
touch debian/build-stamp-python3.7
:
# urllib.error.URLError -- have to ignore
# hotfix SPHINXOPTS -- remove in next release
\
        PYTHONPATH=`/bin/ls -d /<<BUILDDIR>>/scikit-learn-0.20.1+dfsg/.pybuild/*python*_3.7/build`:$(python3 -c 'import sys;print(":".join(sys.path))') \
        SPHINXBUILD="python3.7 -m sphinx -j 1 -D mathjax_path=MathJax.js" \
        SPHINXOPTS="-j 1 -D mathjax_path=MathJax.js" \
        /usr/bin/make -C doc html
/bin/ls: cannot access '/<<BUILDDIR>>/scikit-learn-0.20.1+dfsg/.pybuild/*python*_3.7/build': No such file or directory
make[2]: Entering directory '/<<BUILDDIR>>/scikit-learn-0.20.1+dfsg/doc'
# These two lines make the build a bit more lengthy, and the
# the embedding of images more robust
rm -rf _build/html/_images
#rm -rf _build/doctrees/
python3.7 -m sphinx -j 1 -D mathjax_path=MathJax.js -b html -d _build/doctrees    . _build/html/stable
Running Sphinx v1.7.9

Configuration error:
There is a programable error in your configuration file:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/sphinx/config.py", line 161, in __init__
    execfile_(filename, config)
  File "/usr/lib/python3/dist-packages/sphinx/util/pycompat.py", line 150, in execfile_
    exec_(code, _globals)
  File "conf.py", line 18, in <module>
    from sklearn.externals.six import u
ModuleNotFoundError: No module named 'sklearn'
////////////


(i.e. the Python 3.7 build is never attempted; that in itself doesn't fail the
package build, which instead fails later when the documentation build can't
import sklearn)
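
A check along these lines before the doc step would at least make the failure
point at the missing build rather than at Sphinx. This is a sketch in Python,
assuming only the .pybuild/*python*_<version>/build pattern visible in the log
above; find_pybuild_dir and the directory name used in the demo are made up
for illustration.

```python
import glob
import os
import tempfile


def find_pybuild_dir(source_root: str, version: str) -> str:
    """Return the single pybuild build directory for `version`, or raise.

    Uses the same `.pybuild/*python*_<version>/build` glob as the log above.
    """
    pattern = os.path.join(source_root, ".pybuild", f"*python*_{version}", "build")
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(
            f"no pybuild output for Python {version}: {pattern} matched nothing"
        )
    if len(matches) > 1:
        raise RuntimeError(f"ambiguous pybuild output for Python {version}: {matches}")
    return matches[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Made-up directory name that merely satisfies the glob.
        os.makedirs(os.path.join(root, ".pybuild", "pythonX.Y_3.6", "build"))
        print("3.6:", find_pybuild_dir(root, "3.6"))
        try:
            find_pybuild_dir(root, "3.7")
        except FileNotFoundError as exc:
            print("3.7:", exc)
```

In the failing log the equivalent /bin/ls call only prints a warning, so the
empty PYTHONPATH entry goes unnoticed until conf.py tries to import sklearn.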

Cheers
Stuart




-- 
Stuart Prescott    http://www.nanonanonano.net/   stuart at nanonanonano.net
Debian Developer   http://www.debian.org/         stuart at debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7


