Bug#438909: [debfile] DebControl.md5sums(): unsafe parsing of md5sums file

Romain Francoise rfrancoise at debian.org
Mon Aug 20 17:15:40 UTC 2007


Package: python-debian
Version: 0.1.5
Tags: patch

The 'md5sums' method of the DebControl class assumes that the
package's md5sums file has exactly two whitespace-separated columns:

|        md5_file = self.get_file(MD5_FILE)
|        sums = {}
|        for line in md5_file.readlines():
|            md5, fname = line.split()       <---
|            sums[fname] = md5

This obviously cannot work if the filenames have embedded spaces in
them, and about 103 packages in the archive currently ship such
files.  The exception thrown is:

>>> from debian_bundle.debfile import DebFile
>>> DebFile('/tmp/1.deb').md5sums()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/var/lib/python-support/python2.4/debian_bundle/debfile.py", line 226, in md5sums
    return self.control.md5sums()
  File "/var/lib/python-support/python2.4/debian_bundle/debfile.py", line 170, in md5sums
    md5, fname = line.split()
ValueError: too many values to unpack
>>>
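
The failure can also be reproduced without a .deb at hand.  A minimal
sketch, assuming a made-up md5sums line (the checksum and path are
illustrative, not taken from a real package) and Python 3 syntax rather
than the Python 2.4 shown in the traceback above:

```python
# Hypothetical md5sums line; the path contains an embedded space.
line = "d41d8cd98f00b204e9800998ecf8427e  usr/share/doc/foo/My File.txt\n"

# split() with no arguments splits on every run of whitespace, so the
# filename is broken in two and we get three fields instead of two.
fields = line.split()
print(len(fields))

try:
    md5, fname = fields
except ValueError as exc:
    # Unpacking three fields into two names fails.
    print("ValueError:", exc)
```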

A safer approach is to split each line only once, which always yields
two values no matter what the filename contains.  In that case the
trailing newline must first be stripped explicitly, because a split
limited to one separator leaves the remainder of the line untouched
and the newline would otherwise end up in 'fname'.  This is what the
following patch does:

=== modified file 'debian_bundle/debfile.py'
--- debian_bundle/debfile.py	2007-08-20 10:48:08 +0000
+++ debian_bundle/debfile.py	2007-08-20 17:07:24 +0000
@@ -167,7 +167,7 @@
         md5_file = self.get_file(MD5_FILE)
         sums = {}
         for line in md5_file.readlines():
-            md5, fname = line.split()
+            md5, fname = line.rstrip("\n").split(None, 1)
             sums[fname] = md5
         md5_file.close()
         return sums
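
For illustration, the patched loop can be exercised on its own.  The
sketch below is a stand-in, not the python-debian code: the helper name
parse_md5sums is hypothetical, the input is an in-memory file with
made-up entries, and it assumes Python 3:

```python
import io


def parse_md5sums(md5_file):
    """Map each filename to its checksum (hypothetical helper)."""
    sums = {}
    for line in md5_file:
        # Strip the newline first, then split once on the first run of
        # whitespace so that spaces inside the filename are preserved.
        md5, fname = line.rstrip("\n").split(None, 1)
        sums[fname] = md5
    return sums


# Made-up md5sums content; the first path has an embedded space.
sample = io.StringIO(
    "d41d8cd98f00b204e9800998ecf8427e  usr/share/doc/foo/My File.txt\n"
    "d41d8cd98f00b204e9800998ecf8427e  usr/bin/foo\n"
)
sums = parse_md5sums(sample)
for fname in sorted(sums):
    print(repr(fname), sums[fname])
```

Note that an empty line would still raise ValueError with this
approach; the sketch does not try to handle that case.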

Thanks,

-- 
  ,''`.
 : :' :        Romain Francoise <rfrancoise at debian.org>
 `. `'         http://people.debian.org/~rfrancoise/
   `-



