Bug#913274: Incorrectly parsing whitespace in Sources.iter_paragraphs

Marcus Furlong furlongm at gmail.com
Wed Nov 14 04:47:28 GMT 2018


Control: retitle -1 Incorrectly parsing whitespace in Deb822.iter_paragraphs
On Tue, 13 Nov 2018 at 23:42, Marcus Furlong <furlongm at gmail.com> wrote:
>
> > > I have come across a case where whitespace is added in
> > > Packages{.gz,.bz2} and I am not sure how it should be parsed.
> > [...]
> > > Should this whitespace be parsed as a paragraph delimiter?
> >
> > For a Packages file, each paragraph is defined as a set of DEBIAN/control
> > paragraphs; the Description field is not allowed to contain lines that are
> > whitespace-only.
> >
> > https://wiki.debian.org/DebianRepository/Format#A.22Packages.22_Indices
> >
> > https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-description
> >
> > So the strict answer is yes, it should be a paragraph delimiter but most
> > implementations seem to be more forgiving in what they accept.
> >
> > Note that for debian/control files in source packages, whitespace-only lines
> > are treated as paragraph separators so that whitespace errors in an editor
> > don't accidentally make packages disappear from the archive.
> >
> >
> > > Currently, the whitespace is being treated as a paragraph delimiter
> > > by python-debian, but not by apt-get, etc.
> >
> > Could you expand on this with an example, perhaps?
> >
> > python-debian actually uses python-apt for dealing with Sources and Packages
>
> I was incorrect. As you have shown, python-apt works correctly.
>
> > files (i.e. the exact same code as apt) and already does treat whitespace-only
> > lines as being part of a paragraph rather than breaking them:
> >
> >
> > $ ipython3
> > Python 3.6.7 (default, Oct 21 2018, 08:08:16)
> > Type "copyright", "credits" or "license" for more information.
> >
> > In [1]: from debian.deb822 import Packages
> >
> > In [2]: with open('Packages') as fh:
> >   ...:     for p in Packages.iter_paragraphs(fh):
> >   ...:         if p['Version'] == '1.25.0-1529904044':
> >   ...:             print(p)
>
> I've narrowed down where the issue occurs. It happens when passing the
> contents rather than the file handle to iter_paragraphs:
>
> ~# ipython3
> Python 3.5.3 (default, Jan 19 2017, 14:11:04)
> Type "copyright", "credits" or "license" for more information.
>
> IPython 5.1.0 -- An enhanced Interactive Python.
> ?         -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help      -> Python's own help system.
> object?   -> Details about 'object', use 'object??' for extra details.
>
> In [1]: from debian.deb822 import Packages
>
> In [2]: with open('Packages') as fh:
>   ...:    for p in Packages.iter_paragraphs(fh.read()):
>   ...:        if 'version' not in p:
>   ...:            print(p)
>   ...:
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
>
> In [3]:
>
> Passing the contents does the correct thing in all other cases, so I am
> not sure why it has an issue with this one.
>
> --
> Marcus Furlong
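
To make the symptom above concrete, here is a minimal, self-contained
sketch of the two ways a parser can treat a whitespace-only line. It does
not use python-debian or python-apt and is not their implementation; the
function names and the sample data are made up for illustration (only the
Version and Homepage values are taken from this thread). My understanding
is that iter_paragraphs only hands real file objects to python-apt and
otherwise falls back to a pure-Python parser, which would explain why the
two calls differ, but treat that as an assumption.

```python
# Illustration only: a toy paragraph splitter, NOT python-debian's parser.
# SAMPLE is invented; the line " " mimics the stray whitespace-only line.
SAMPLE = (
    "Package: example\n"
    "Version: 1.25.0-1529904044\n"
    "Description: hypothetical package\n"
    " \n"
    "Homepage: https://code.visualstudio.com/\n"
    "\n"
    "Package: other\n"
    "Version: 1.0\n"
)


def split_paragraphs(text, whitespace_separates):
    """Split control-file text into paragraphs (lists of lines).

    If whitespace_separates is True, any line containing only whitespace
    ends a paragraph; otherwise only a completely empty line does.
    """
    paragraphs, current = [], []
    for line in text.splitlines():
        if whitespace_separates:
            is_separator = line.strip() == ""
        else:
            is_separator = line == ""
        if is_separator:
            if current:
                paragraphs.append(current)
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs


def has_field(paragraph, name):
    """Case-insensitive check for a 'Name:' field in a paragraph."""
    prefix = name.lower() + ":"
    return any(line.lower().startswith(prefix) for line in paragraph)


if __name__ == "__main__":
    for mode in (True, False):
        paragraphs = split_paragraphs(SAMPLE, whitespace_separates=mode)
        orphans = [p for p in paragraphs if not has_field(p, "Version")]
        print("whitespace_separates=%s: %d paragraphs, %d without Version"
              % (mode, len(paragraphs), len(orphans)))
```

With whitespace_separates=True the Homepage line is cut off into a
paragraph of its own with no Version field, which matches the output in
the session above. With whitespace_separates=False it stays attached to
the paragraph it belongs to.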



--
Marcus Furlong


