Bug#462859: debfile fails for packages using data.tar.bz2

Stefano Zacchiroli zack at debian.org
Wed Feb 20 16:24:34 UTC 2008


On Wed, Feb 06, 2008 at 08:44:05PM +0100, Filippo Giunchedi wrote:
> I came up with the attached patch, it works but I'll give Zack a chance to
> comment and/or adapt before commit.

Nope, that patch will only avoid failures during the initialization of a
DebFile object, but it won't work for actually accessing the content of
data.tar.bz2.

Anyhow, first things first: attached is a modified version of
Filippo's patch which is a bit more generic about which parts can be
compressed and how they can be compressed. It also updates the
documentation of the affected methods. Such a patch can be applied to
fix the b0rken behaviour of scripts like dpkg-info, but that's it:
accessing data.tar.bz2 would still not be possible (it will fail with an
exception). You can verify it with the attached proof-of-concept version
of dpkg-info, which tries to extract a file from the vim-runtime .deb
that has been pointed to in this bug report before (see the last line of
the attached dpkg-info).

The problem in fixing the access to data.tar.bz2 is that the Python
class BZ2File does not work like GzipFile: in particular it does not
accept a file object as input, but rather requires a file*name*. So I
see 2 solutions:

1) save the .tar.bz2 to a temporary file and read it using BZ2File, of
   course remembering to delete the temporary file when done. This will
   introduce the risk of leaving garbage around, but the code will be
   (as much as possible) uniform with what we have now

2) use some other facilities of the bz2 module to decompress in memory
   on the fly. One is BZ2Decompressor (which is an incremental
   decompressor), the other is bz2.decompress (which is one shot). The
   problem with both of them is that neither of the two implements a
   file-like interface, so we need either to do that ourselves, or to
   change the code, rather heavily I presume, to be able to cope
   with that

(If my life was threatened, I would probably choose (1) among the two.)
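
To make (2) a bit more concrete, here is a minimal sketch of the
one-shot variant. Assumptions: it is written for Python 3, the helper
name open_bz2_tar_in_memory is made up for illustration, and it is not
taken from the attached patch. It decompresses the whole member with
bz2.decompress and wraps the result in io.BytesIO to obtain the
file-like interface that tarfile expects.

```python
import bz2
import io
import tarfile


def open_bz2_tar_in_memory(compressed_bytes):
    """Return a TarFile for a bz2-compressed tarball held in memory.

    Hypothetical helper, for illustration only.
    """
    # One-shot decompression: the whole uncompressed tarball ends up in RAM.
    raw_tar = bz2.decompress(compressed_bytes)
    # BytesIO supplies the file-like interface that bz2.decompress lacks.
    # Mode "r:" tells tarfile the stream is already uncompressed.
    return tarfile.open(fileobj=io.BytesIO(raw_tar), mode="r:")
```

The obvious cost is memory: the entire uncompressed data part is kept
in RAM, which may matter for large packages. An incremental wrapper
around BZ2Decompressor would avoid that at the price of more code.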

3) A third (non-)solution would be to commit the patch right away, and
   return an error when trying to access the content of data.tar.bz2. At
   least all usages of .deb which only deal with .deb metadata and not
   with the actual content would work again ...
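
A rough sketch of the kind of check (3) implies, assuming Python 3.
Both the exception class and the function name are hypothetical; they
are not part of the debfile module or of the attached patch. The idea
is only that metadata access keeps working while content access to a
bzip2 data part fails with a clear message.

```python
class UnsupportedCompressionError(Exception):
    """Hypothetical exception; not part of the real debfile module."""


def check_data_member(member_name):
    """Refuse data members whose compression cannot be read yet.

    Returns the member name unchanged when it is readable.
    """
    if member_name.endswith(".bz2"):
        raise UnsupportedCompressionError(
            "cannot access the contents of %s: bzip2-compressed data "
            "parts are not supported yet" % member_name)
    return member_name
```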

Go ahead, starting from my attached patch, with a proper error message
and a commit if you want (3).
Sooner or later I'll probably implement (1).
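
For reference, (1) could look roughly like the sketch below. It assumes
Python 3, the function name read_bz2_via_tempfile is made up, and it is
not the eventual debfile implementation. The compressed member is
copied to a named temporary file, BZ2File is handed the file name, and
the file is removed in a finally block to limit the risk of leaving
garbage around.

```python
import bz2
import os
import shutil
import tempfile


def read_bz2_via_tempfile(fileobj):
    """Decompress a bz2 stream by spooling it to a temporary file first.

    Hypothetical helper, for illustration only.
    """
    # delete=False so the file can be reopened by name; we remove it ourselves.
    tmp = tempfile.NamedTemporaryFile(suffix=".tar.bz2", delete=False)
    try:
        with tmp:
            # Copy the compressed bytes to disk, then close the file.
            shutil.copyfileobj(fileobj, tmp)
        # BZ2File is given a file *name*, matching the constraint above.
        with bz2.BZ2File(tmp.name) as bzfile:
            return bzfile.read()
    finally:
        # Clean up even when copying or decompression fails.
        os.unlink(tmp.name)
```

A hard kill of the process between creation and cleanup would still
leave the temporary file behind, which is the residual risk mentioned
in (1).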

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{upsilon.cc,cs.unibo.it,debian.org}  -<%>-  http://upsilon.cc/zack/
(15:56:48)  Zack: e la demo dema ?    /\    All one has to do is hit the
(15:57:15)  Bac: no, la demo scema    \/    right keys at the right time
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debfile-bzip2.patch
Type: text/x-diff
Size: 5094 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-python-debian-maint/attachments/20080220/39a6bb9d/attachment.patch 
-------------- next part --------------
#!/usr/bin/python

# dpkg-info - DebFile's implementation of "dpkg --info"
# Copyright (C) 2007 Stefano Zacchiroli <zack at debian.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.


""" (An approximation of) a 'dpkg --info' implementation relying on DebFile
class. """

import os
import stat
import string
import sys

from debian_bundle import debfile

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print "Usage: dpkg-info DEB"
        sys.exit(1)
    fname = sys.argv[1]

    deb = debfile.DebFile(fname)
    if deb.version == '2.0':
        print ' new debian package, version %s.' % deb.version
    print ' size %d bytes: control archive= %d bytes.' % (
            os.stat(fname)[stat.ST_SIZE], deb['control.tar.gz'].size)
    for fname in deb.control:   # print info about control part contents
        content = deb.control[fname]
        if not content:
            continue
        lines = content.split('\n')
        ftype = ''
        try:
            if lines[0].startswith('#!'):
                ftype = lines[0].split()[0]
        except IndexError:
            pass
        print '  %d bytes, %d lines, %s, %s' % (len(content), len(lines),
                fname, ftype)
    for n, v in deb.debcontrol().iteritems(): # print DEBIAN/control fields
        if n.lower() == 'description':  # increase indentation of long dsc
            lines = v.split('\n')
            shortDsc = lines[0]
            longDsc = string.join(map(lambda l: ' ' + l, lines[1:]), '\n')
            print ' %s: %s\n%s' % (n, shortDsc, longDsc)
        else:
            print ' %s: %s' % (n, v)
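    # Proof of concept: expected to fail with an exception for as long as
    # the contents of data.tar.bz2 cannot be accessed.
    print deb.data.get_content('/usr/share/vim/vim71/doc/diff.txt')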
