Bug#604093: python-debian: iter_paragraphs: be more robust against RFC822 comments in unicoded files -- swallows the trailer
Yaroslav Halchenko
debian at onerussian.com
Sat Nov 20 02:52:03 UTC 2010
Package: python-debian
Version: 0.1.18
Severity: normal
If a file contains a comment (legitimate under the RFC 822 format, though possibly not under deb822), iter_paragraphs swallows everything after it when the file is opened as unicode:
$> cat confuse.txt
Goodone: value0

 ; xxx might be unicode: Ярик

Entry: value
$> python -c "import codecs; from debian import deb822; print list(codecs.open('confuse.txt', encoding='utf-8'))"
[u'Goodone: value0\n', u'\n', u' ; xxx might be unicode: \u042f\u0440\u0438\u043a\n', u'\n', u'Entry: value\n']
$> python -c "import codecs; from debian import deb822; print list(deb822.Deb822.iter_paragraphs(codecs.open('confuse.txt', encoding='utf-8')))"
[{'Goodone': u'value0'}]
$> python -c "import codecs; from debian import deb822; print list(deb822.Deb822.iter_paragraphs(codecs.open('confuse.txt')))"
[{'Goodone': u'value0'}, {}, {'Entry': u'value'}]
Interestingly enough, if I remove the space before Entry:, it parses fine:
$> python -c "import codecs; from debian import deb822; print list(deb822.Deb822.iter_paragraphs(codecs.open('confuse.txt', encoding='utf-8')))"
[{'Goodone': u'value0'}, {'Entry': u'value'}]
Or maybe comments shouldn't be detached from paragraphs according to RFC 822?
(In any case, the divergence between unicode and plain handling is
sub-optimal.)
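
To illustrate the behaviour asked for, here is a minimal standalone sketch of a paragraph iterator that gives the same result for byte and unicode input and does not let a comment line end the iteration. It is not python-debian's implementation: the name iter_paragraphs_sketch is hypothetical, treating any line whose first non-blank character is ";" or "#" as a comment is an assumption made for this example, and continuation lines are deliberately not handled.

```python
import io


def iter_paragraphs_sketch(lines, comment_prefixes=("#", ";")):
    """Yield one dict per blank-line-separated paragraph, skipping comments.

    Hypothetical helper for illustration only; not part of python-debian.
    """
    paragraph = {}
    for raw in lines:
        # Decode bytes up front so both input types share one code path.
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        stripped = line.strip()
        if stripped.startswith(comment_prefixes):
            # Assumption: a comment is neither data nor a paragraph separator.
            continue
        if not stripped:
            # A blank line closes the current paragraph, if there is one.
            if paragraph:
                yield paragraph
                paragraph = {}
            continue
        key, sep, value = stripped.partition(":")
        if sep:
            paragraph[key.strip()] = value.strip()
    if paragraph:
        yield paragraph


# Same content as confuse.txt, per the list() output shown above.
SAMPLE = "Goodone: value0\n\n ; xxx might be unicode: Ярик\n\nEntry: value\n"

from_text = list(iter_paragraphs_sketch(io.StringIO(SAMPLE)))
from_bytes = list(iter_paragraphs_sketch(io.BytesIO(SAMPLE.encode("utf-8"))))
print(from_text)
print(from_text == from_bytes)
```

With this approach both paragraphs come back and no empty paragraph is emitted for the comment, regardless of whether the input is bytes or text.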
-- System Information:
Debian Release: squeeze/sid
APT prefers testing
APT policy: (900, 'testing'), (800, 'unstable'), (300, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages python-debian depends on:
ii python 2.6.6-3+squeeze1 interactive high-level object-orie
ii python-chardet 2.0.1-1 universal character encoding detec
ii python-support 1.0.10 automated rebuilding support for P
Versions of packages python-debian recommends:
ii python-apt 0.7.98.1 Python interface to libapt-pkg
Versions of packages python-debian suggests:
ii gpgv 1.4.10-4 GNU privacy guard - signature veri
-- no debconf information