Bug#556555: python-xml: Round-tripping an XML document wantonly messes up blanks

Edward Welbourne eddy at opera.com
Mon Nov 16 18:30:14 UTC 2009


Package: python-xml
Version: 0.8.4-10.1
Severity: minor

Save the following SVG to a file <file>

<?xml version="1.0" encoding="utf-8"?><!--*- Mode: sgml; coding: utf-8; tab-width: 5; -*-->
<!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.1 Tiny//EN'
  'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11-tiny.dtd'>
<svg baseProfile="tiny" version="1.1" xml:lang="en-GB"
     viewBox="-90 0 3920 4100"
     xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <title>Test-case to illustrate expat issue</title>
  <desc>
    Parse this with an xml parser then ask the parsed object to
    serialise itself as text.  The result is not what you started with.
    It would be nice if it were.
  </desc>
     <path d="M0,1450
              L30,1560
              L3920,2370"
           id="high" stroke="blue" />
</svg>

</file> then run

>>> from xml.dom.expatbuilder import parse
>>> dom = parse('path/to/the/file.svg')
>>> print dom.toxml('utf-8')

The result has a line-break before the Emacs mode-line comment.  This
stops the comment from configuring the mode Emacs uses for the
resulting file: Emacs only honours a mode-line that appears on the
first line of the file.
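
A minimal sketch of how one might test for this symptom in serialised
output.  The helper name modeline_on_first_line is hypothetical, the
two sample strings are shortened stand-ins for the file above, and the
code is written for Python 3 rather than the Python 2.5 used in this
report:

```python
def modeline_on_first_line(text):
    """Return True if a '-*- ... -*-' marker pair sits on the first line."""
    first_line = text.split('\n', 1)[0]
    # A mode-line needs both an opening and a closing '-*-'.
    return first_line.count('-*-') >= 2


# Shortened stand-in for the input: comment shares a line with the declaration.
original = ('<?xml version="1.0" encoding="utf-8"?>'
            '<!--*- Mode: sgml; coding: utf-8; -*-->\n<svg/>\n')

# Stand-in for the reported output: a line-break now precedes the comment.
moved = ('<?xml version="1.0" encoding="utf-8"?>\n'
         '<!--*- Mode: sgml; coding: utf-8; -*-->\n<svg/>\n')

original_ok = modeline_on_first_line(original)
moved_ok = modeline_on_first_line(moved)
print(original_ok, moved_ok)
```

Running the same check on the text returned by dom.toxml() would show
whether a given serialiser keeps the comment on the first line.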

The result also joins the lines of the path element above, notably
those making up its d="..." attribute.  In the real SVG on which this
test-case is based, the path records several hundred data points, so
it is desirable to break the attribute's value into manageable
lines.  Not only does this make the source file
easier to read: it also localizes the changes, when I add new
data-points to the graph; this, in turn, ensures that version-control
software correctly reports sensible diffs, that make it easy to see
the additions, instead of merely recording that the entire attribute
has changed.  (Actual additions to the d attribute are made using a
script which uses the above code but manipulates the dom object
between parsing and output, to find the d attribute and add the new
data to it.  Having the text changed as described here, rather than
only having the data added, forces me to clean up the result later.)
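
The attribute half of this can be reproduced without the file.  The
following is a sketch only, assuming Python 3 and the standard
library's xml.dom.minidom rather than python-xml itself; the SOURCE
string is a shortened stand-in for the SVG above.  Note that XML 1.0
attribute-value normalization requires a conforming parser to replace
each literal line-break inside an attribute value with a space, so the
line structure is already absent from the DOM before any serialising
happens:

```python
from xml.dom.minidom import parseString

# Shortened stand-in for the test-case: the d attribute spans three lines.
SOURCE = (
    '<svg xmlns="http://www.w3.org/2000/svg">\n'
    '  <path d="M0,1450\n'
    '           L30,1560\n'
    '           L3920,2370" id="high"/>\n'
    '</svg>\n'
)

dom = parseString(SOURCE)
path = dom.getElementsByTagName('path')[0]
d_value = path.getAttribute('d')

# The parser has already turned each line-break in the value into a space.
has_newline = '\n' in d_value
points = d_value.split()
print(repr(d_value))
```

The data points themselves survive intact; only the line-breaks
between them are lost, which is why the serialised attribute comes out
on one line.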

I have verified that these issues are present in python-xml-0.8.4-10.1.

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.30-2-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.ISO-8859-15, LC_CTYPE=en_GB.ISO-8859-15 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/bash

Versions of packages python-xml depends on:
ii  libc6                        2.10.1-5    GNU C Library: Shared libraries
ii  python                       2.5.4-2     An interactive high-level object-o
ii  python-central               0.6.12+nmu1 register and build utility for Pyt

python-xml recommends no packages.

Versions of packages python-xml suggests:
pn  python-xml-dbg                <none>     (no description available)
ii  python-xml-doc                0.8.4-10.1 XML tools for Python (documentatio

-- no debconf information
