Bug#925294: Does not work without extra downloads

Enrico Zini enrico at debian.org
Fri Mar 22 15:18:53 GMT 2019


Package: python3-nltk
Version: 3.4-1
Severity: normal

Hello,

I tried to use nltk for simple word tokenization, but it fails:

>>> import nltk
>>> nltk.word_tokenize("foo")
…
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/home/enrico/nltk_data'
    - '/usr/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************
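
A minimal sketch of how one could check, without downloading anything,
whether the pickle is already present in one of the directories listed
above. It uses only the standard library. The name find_resource is
hypothetical and not part of nltk. The sketch assumes the resource is an
unpacked file on disk, replaces /home/enrico with the current user's home
directory, and drops the duplicate and empty entries from the list.

```python
import os

# Directories copied from the "Searched in" list above, duplicates and the
# empty entry dropped. The home directory is expanded for the current user
# (an assumption; the original list shows /home/enrico/nltk_data).
SEARCH_DIRS = [
    os.path.expanduser("~/nltk_data"),
    "/usr/nltk_data",
    "/usr/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/local/lib/nltk_data",
]

# Relative path taken from the "Attempted to load" line above.
RESOURCE = "tokenizers/punkt/PY3/english.pickle"


def find_resource(resource=RESOURCE, search_dirs=SEARCH_DIRS):
    """Return the first existing path for resource, or None if absent.

    Hypothetical helper: it only looks for a plain file under each
    directory and does not import or call nltk.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, resource)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_resource() or "punkt not found locally")
```

On a system where none of the directories contain the file, this prints
"punkt not found locally", which matches the LookupError above.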

I am extremely reluctant to run unreviewed code that downloads random
data from the internet in some unspecified way, and does unspecified
things with it, to the point that I decided to give up using the library
altogether.

It would have been an entirely different story if the datasets that nltk
needs were also packaged in Debian, so that it could have worked out of
the box.


Enrico


-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-2-amd64 (SMP w/4 CPU cores)
Kernel taint flags: TAINT_WARN, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_IE.UTF-8, LC_CTYPE=en_IE.UTF-8 (charmap=UTF-8), LANGUAGE=en_IE:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-nltk depends on:
ii  python3      3.7.2-1
ii  python3-six  1.12.0-1

Versions of packages python3-nltk recommends:
ii  prover9        0.0.200911a-2.1+b2
ii  python3-numpy  1:1.16.1-1
ii  python3-tk     3.7.2-3

python3-nltk suggests no packages.

-- no debconf information
