Bug#913274: Incorrectly parsing whitespace in Sources.iter_paragraphs

Stuart Prescott stuart at debian.org
Wed Nov 14 05:17:04 GMT 2018


Hi Marcus,

> I've narrowed down where the issue occurs. It happens when passing the
> contents rather than the file handle to iter_paragraphs:
> 
> ~# ipython3
> Python 3.5.3 (default, Jan 19 2017, 14:11:04)
> Type "copyright", "credits" or "license" for more information.
> 
> IPython 5.1.0 -- An enhanced Interactive Python.
> ?         -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help      -> Python's own help system.
> object?   -> Details about 'object', use 'object??' for extra details.
> 
> In [1]: from debian.deb822 import Packages
> 
> In [2]: with open('Packages') as fh:
>   ...:    for p in Packages.iter_paragraphs(fh.read()):
>   ...:        if 'version' not in p:
>   ...:            print(p)
>   ...:
> Homepage: https://code.visualstudio.com/
[...]
> Passing the contents does the correct thing in all other cases, so not
> sure why it would be having an issue with this?

Ahah!

TagFile only accepts filehandles, not static data:

https://salsa.debian.org/apt-team/python-apt/blob/master/python/tag.cc#L750

In deb822.py there is a function _is_real_file() and that is used so that 
python-apt's TagFile is only invoked on filehandles and not on text data, 
diverting to the in-built parser when TagFile cannot be used.
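
As a rough illustration of that dispatch (a minimal sketch, not the actual
deb822.py code: looks_like_real_file and choose_parser are hypothetical names,
and the check assumes that "real file" means "has a working fileno()"):

```python
import io
import tempfile


def looks_like_real_file(obj):
    """Return True if obj exposes a usable OS-level file descriptor."""
    if not hasattr(obj, "fileno"):
        return False  # e.g. the str returned by fh.read()
    try:
        obj.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False  # e.g. io.StringIO has fileno(), but calling it raises
    return True


def choose_parser(source):
    """Name the parser that would be picked for this input (illustrative only)."""
    if looks_like_real_file(source):
        return "apt_pkg.TagFile"
    return "built-in parser"


with tempfile.TemporaryFile(mode="w+") as fh:
    fh.write("Package: demo\nVersion: 1.0\n")
    fh.seek(0)
    on_handle = choose_parser(fh)        # a real file handle
    on_text = choose_parser(fh.read())   # the contents as a plain str
in_memory = choose_parser(io.StringIO("Package: demo\n"))
```

Under these assumptions only the open file handle would reach TagFile; the
string from fh.read() and the in-memory StringIO would both be diverted to
the built-in parser.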

BTW if you are read()ing so that you can deal with the compressed Packages.gz, 
TagFile can handle on-the-fly decompression.

In [1]: from debian.deb822 import Packages

In [2]: with open('Packages.gz') as fh:
   ...:     for p in Packages.iter_paragraphs(fh):
   ...:         if 'version' not in p:
   ...:             print(p)

(wild guess as to why you might be doing this!)

I've been thinking that when iter_paragraphs is called with use_apt_pkg=True 
but apt_pkg ends up not being used, contrary to what was requested, 
iter_paragraphs should generate a warning. That risks becoming undesirably 
noisy, but it would also get us away from the current ambiguous behaviour 
where the use_apt_pkg setting is silently ignored.

I wonder what the likelihood is that introducing a warning would break someone 
else's code? (It would break an autopkgtest, for instance, by writing to 
stderr)
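
To make the idea concrete, here is a minimal sketch of what such a warning
could look like. It is not a patch against deb822.py: check_apt_pkg_request
and has_file_descriptor are hypothetical helpers, and the warning category
and message text are assumptions.

```python
import io
import warnings


def has_file_descriptor(obj):
    """Return True if obj has a working fileno()."""
    try:
        obj.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True


def check_apt_pkg_request(source, use_apt_pkg=True):
    """Return True if apt_pkg could be used; warn if it was requested but cannot be."""
    usable = use_apt_pkg and has_file_descriptor(source)
    if use_apt_pkg and not usable:
        warnings.warn(
            "use_apt_pkg=True was requested but the input is not a real file; "
            "falling back to the built-in parser",
            UserWarning,
            stacklevel=2,
        )
    return usable


# Record warnings instead of letting them be printed, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    used_apt_pkg = check_apt_pkg_request("Package: demo\nVersion: 1.0\n")
    quiet = check_apt_pkg_request("Package: demo\n", use_apt_pkg=False)
messages = [str(w.message) for w in caught]
```

With Python's default handling the warning text is written to stderr, which
is where the autopkgtest concern comes from. Callers who pass text on purpose
could avoid it by passing use_apt_pkg=False or by installing a filter with
warnings.filterwarnings.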

Cheers
Stuart

-- 
Stuart Prescott    http://www.nanonanonano.net/   stuart at nanonanonano.net
Debian Developer   http://www.debian.org/         stuart at debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7


