[Python-modules-team] Bug#620551: python-pp: binary modules incl. numpy cannot be imported in pp scripts

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Sat Apr 2 16:54:30 UTC 2011


Subject: python-pp:  binary modules incl. numpy cannot be imported in pp scripts
Package: python-pp
Version: 1.6.0-1
Severity: important

*** Please type your report below this line ***
Importing numpy or scipy in a pp worker process fails, because sys.path
contains an invalid directory (/usr/share/pyshared). This directory is
added to sys.path by python itself, because it is the parent directory
of the worker script, which is executed with
'python -u /usr/share/pyshared/ppworker.py'.

A script simply cannot be run from /usr/share/pyshared, because the parent
directory of the script is added to sys.path. The problem shows up when this
directory contains
- subdirectories which look like python modules, i.e. contain __init__.py
- files which look like python modules, i.e. have filenames which end in .py

In the case of numpy, the installation is split between /usr/share/pyshared
and /usr/lib/pyshared/python2.* and symlinked into
/usr/lib/pymodules/python2.6/numpy. With the first of those directories in
sys.path, the pure-python part under /usr/share/pyshared/numpy is found
first, and the binary parts (e.g. multiarray in the traceback below) cannot
be imported.
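
The shadowing effect can be reproduced outside of Debian packaging. What
follows is only a minimal sketch for a current Python 3 interpreter, not the
pp code: the names script_dir, lib_dir, demo_pkg and worker.py are made up
for the illustration, and PYTHONPATH merely stands in for the regular module
directory. It places one copy of a module next to the script and another on
the normal search path, then reports which copy the script imports.

```python
import os
import subprocess
import sys
import tempfile


def which_copy_wins():
    """Return the origin tag of the copy of demo_pkg that a script imports."""
    with tempfile.TemporaryDirectory() as tmp:
        script_dir = os.path.join(tmp, "script_dir")  # plays /usr/share/pyshared
        lib_dir = os.path.join(tmp, "lib_dir")        # plays the real module dir
        os.mkdir(script_dir)
        os.mkdir(lib_dir)

        # Two copies of the same hypothetical module, tagged by location.
        for directory, tag in ((script_dir, "script directory"),
                               (lib_dir, "library directory")):
            with open(os.path.join(directory, "demo_pkg.py"), "w") as fh:
                fh.write("ORIGIN = %r\n" % tag)

        script = os.path.join(script_dir, "worker.py")
        with open(script, "w") as fh:
            fh.write("import demo_pkg\nprint(demo_pkg.ORIGIN)\n")

        # lib_dir is on the search path via PYTHONPATH, but the interpreter
        # puts the directory of the script in front of it.
        env = dict(os.environ, PYTHONPATH=lib_dir)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path behaviour
        out = subprocess.check_output([sys.executable, script], env=env)
        return out.decode().strip()


if __name__ == "__main__":
    print(which_copy_wins())  # prints: script directory
```

The copy next to the script wins, which is the same mechanism that makes
/usr/share/pyshared/numpy hide the complete numpy installation.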

Packaging changed between python-pp versions in Debian. Previously
a separate directory was used (/usr/share/python-support/python-pp),
but now everything goes into /usr/share/pyshared
(/usr/share/pyshared/pp.py, /usr/share/pyshared/ppworker.py,
...). This packaging change seems to me to be a step backwards, since
conflicts are now much more likely. In any case it breaks parallel python,
which runs os.path.abspath on /usr/lib/pymodules/python2.6/ppworker.py
and gets /usr/share/pyshared/ppworker.py, which is then executed, and
python pollutes sys.path.

This also means that one should not run a script from any publicly writable
directory, e.g. /tmp.

One simple solution would be to run the script under the symlink path
/usr/lib/pymodules/python2.6/ppworker.py.  Then
/usr/lib/pymodules/python2.6/ would be added (again) to sys.path,
which is harmless.
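
Before relying on this workaround it is probably worth checking what the
interpreter actually does with a symlinked script: the Python tutorial notes
that, on file systems which support symlinks, the directory containing the
input script is calculated after the symlink is followed. The sketch below
is a rough check under stated assumptions (a POSIX system, a current
Python 3 interpreter, and made-up directory and file names standing in for
pyshared and pymodules). It runs a script through a symlink and returns
sys.path[0] together with both candidate directories.

```python
import os
import subprocess
import sys
import tempfile


def path0_via_symlink():
    """Run a script through a symlink; return (sys.path[0], real_dir, link_dir)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)                  # avoid symlinked temp dirs
        real_dir = os.path.join(tmp, "pyshared")     # holds the real file
        link_dir = os.path.join(tmp, "pymodules")    # holds only a symlink
        os.mkdir(real_dir)
        os.mkdir(link_dir)

        script = os.path.join(real_dir, "ppworker_demo.py")  # hypothetical name
        with open(script, "w") as fh:
            fh.write("import sys\nprint(sys.path[0])\n")

        link = os.path.join(link_dir, "ppworker_demo.py")
        os.symlink(script, link)

        # Invoke the interpreter with the symlink path, as the workaround would.
        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path behaviour
        out = subprocess.check_output([sys.executable, link], env=env)
        return out.decode().strip(), real_dir, link_dir


if __name__ == "__main__":
    path0, real_dir, link_dir = path0_via_symlink()
    print("sys.path[0]:", path0)
    print("real dir   :", real_dir)
    print("link dir   :", link_dir)
```

If sys.path[0] turns out to be the directory of the real file rather than of
the symlink, invoking the worker under the symlink path alone would not keep
/usr/share/pyshared out of sys.path.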

A better solution would be to patch python so that the parent
directory is not added to sys.path when python is not running in
interactive mode. This would also fix the problem with scripts in
/tmp. But maybe it is a step too big :)
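
For comparison, later Python 3 releases offer something close to this
without patching: in isolated mode (the -I option, available since
Python 3.4) sys.path contains neither the script's directory nor the user's
site-packages directory. This does not apply to the python 2.6 this report
is about. The sketch below assumes a Python 3 interpreter, and the helper
name script_dir_on_path is made up for the illustration. It runs the same
script with and without -I and reports whether the script's directory ended
up in sys.path.

```python
import os
import subprocess
import sys
import tempfile

# Script body: report whether the script's own directory is in sys.path.
CHECK = (
    "import os, sys\n"
    "here = os.path.dirname(os.path.realpath(__file__))\n"
    "print(here in [os.path.realpath(p) for p in sys.path if p])\n"
)


def script_dir_on_path(*flags):
    """Run a temporary script with the given interpreter flags."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(os.path.realpath(tmp), "print_path.py")
        with open(script, "w") as fh:
            fh.write(CHECK)
        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path behaviour
        cmd = [sys.executable] + list(flags) + [script]
        out = subprocess.check_output(cmd, env=env)
        return out.decode().strip() == "True"


if __name__ == "__main__":
    print("default :", script_dir_on_path())
    print("with -I :", script_dir_on_path("-I"))
```

Note that -I also ignores the PYTHON* environment variables, so it is a
broader switch than just dropping the script directory.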

------------------------------------------------------------------
Verify that parent directory is added to sys.path:

$ cat /tmp/print_path.py 
import sys, os

print 'sys.path', sys.path
print 'PYTHONPATH', os.environ.get("PYTHONPATH", 'empty')

$ python /tmp/print_path.py 
sys.path ['/tmp', '/usr/local/lib/python2.6/dist-packages/TracGit-0.12.0.2dev_r7757-py2.6.egg', '/usr/local/lib/python2.6/dist-packages/Trac-0.12-py2.6.egg', '/usr/local/lib/python2.6/dist-packages/TracAccountManager-0.2.1dev_r4679-py2.6.egg', '/usr/local/lib/python2.6/dist-packages/pyopencl-0.92-py2.6-linux-x86_64.egg', '/usr/local/lib/python2.6/dist-packages/pytools-10-py2.6.egg', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/local/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode']
PYTHONPATH empty

------------------------------------------------------------------
How pp fails to import numpy:
$ cat numpy_pp_example.py
import pp, numpy, sys
print 'numpy', numpy

def job(x): return x

job_server = pp.Server(1, ())
job1 = job_server.submit(job, (1,), (), ('numpy',))
result = job1()
print result

$ python numpy_pp_example.py 
numpy <module 'numpy' from '/usr/lib/pymodules/python2.6/numpy/__init__.pyc'>
An error has occured during the module import
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/ppworker.py", line 55, in preprocess
    globals()[module.split('.')[0]] = import_module(module)
  File "/usr/lib/pymodules/python2.6/ppworker.py", line 43, in import_module
    mod = __import__(name)
  File "/usr/share/pyshared/numpy/__init__.py", line 132, in <module>
    import add_newdocs
  File "/usr/share/pyshared/numpy/add_newdocs.py", line 9, in <module>
    from lib import add_newdoc
  File "/usr/share/pyshared/numpy/lib/__init__.py", line 4, in <module>
    from type_check import *
  File "/usr/share/pyshared/numpy/lib/type_check.py", line 8, in <module>
    import numpy.core.numeric as _nx
  File "/usr/share/pyshared/numpy/core/__init__.py", line 5, in <module>
    import multiarray
ImportError: No module named multiarray
1

-- System Information:
Debian Release: squeeze/sid
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'unstable'), (101, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.37-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-pp depends on:
ii  python                  2.6.6-3+squeeze5 interactive high-level object-orie
ii  python-support          1.0.10           automated rebuilding support for P

python-pp recommends no packages.

python-pp suggests no packages.

-- no debconf information




