Bug#556555: python-xml: Round-tripping an XML document wantonly messes up blanks
Edward Welbourne
eddy at opera.com
Mon Nov 16 18:30:14 UTC 2009
Package: python-xml
Version: 0.8.4-10.1
Severity: minor
Save the following SVG to a file <file>
<?xml version="1.0" encoding="utf-8"?><!--*- Mode: sgml; coding: utf-8; tab-width: 5; -*-->
<!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.1 Tiny//EN'
'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11-tiny.dtd'>
<svg baseProfile="tiny" version="1.1" xml:lang="en-GB"
viewBox="-90 0 3920 4100"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>Test-case to illustrate expat issue</title>
<desc>
Parse this with an xml parser then ask the parsed object to
serialise itself as text. The result is not what you started with.
It would be nice if it were.
</desc>
<path d="M0,1450
L30,1560
L3920,2370"
id="high" stroke="blue" />
</svg>
</file> then run
>>> from xml.dom.expatbuilder import parse
>>> dom = parse('path/to/the/file.svg')
>>> print dom.toxml('utf-8')
The result has a line-break before the Emacs mode-line comment; this
prevents the comment from configuring which mode Emacs uses to handle
the resulting file, since Emacs only honours the comment when it
appears on the first line of the file.
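The first symptom can be checked mechanically. The following is only a
sketch: the helper name modeline_on_first_line is hypothetical, and it
assumes the only thing that matters is whether the '-*-' marker sits on
the first line of the serialised text.

```python
def modeline_on_first_line(serialised):
    """Return True if an Emacs '-*-' marker appears on the first line.

    Hypothetical helper, not part of python-xml or the standard library.
    Accepts either a byte string (as toxml('utf-8') returns) or text.
    """
    if isinstance(serialised, bytes):
        serialised = serialised.decode('utf-8')
    # Only the text up to the first line break is inspected.
    first_line = serialised.split('\n', 1)[0]
    return '-*-' in first_line
```

With the dom object from the session above, the check would be
modeline_on_first_line(dom.toxml('utf-8')).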
The result also joins the lines of the path element, notably
those making up its d="..." attribute. In the real SVG on which this
test-case is based, the path records several hundred data points, so
it is desirable to break the attribute's value into lines of
manageable length. Not only does this make the source file
easier to read: it also localizes the changes, when I add new
data-points to the graph; this, in turn, ensures that version-control
software correctly reports sensible diffs, that make it easy to see
the additions, instead of merely recording that the entire attribute
has changed. (Actual additions to the d attribute are made using a
script which uses the above code but manipulates the dom object
between parsing and output, to find the d attribute and add the new
data to it. Having the text changed as described here, rather than
only having the data added, forces me to clean up the result later.)
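The joined d attribute can be reproduced on a cut-down document. This
is a sketch under assumptions: it uses the standard library's
xml.dom.minidom on an in-memory string rather than python-xml's
xml.dom.expatbuilder, and rewrap_path_data is a hypothetical clean-up
helper, not something the package provides. Attribute-value
normalization in the XML 1.0 specification has the parser replace each
literal line break inside an attribute value with a space, so the line
breaks are probably already gone by the time the DOM is built, before
serialisation starts.

```python
import re
from xml.dom.minidom import parseString

# Cut-down version of the test-case: the d attribute spans three lines.
SOURCE = (
    '<svg xmlns="http://www.w3.org/2000/svg">\n'
    '<path d="M0,1450\n'
    'L30,1560\n'
    'L3920,2370"\n'
    'id="high" stroke="blue" />\n'
    '</svg>\n'
)

dom = parseString(SOURCE)
d = dom.getElementsByTagName('path')[0].getAttribute('d')
# The parsed value has spaces where the source had line breaks.
print(repr(d))


def rewrap_path_data(serialised):
    """Hypothetical clean-up: start each 'L' command of d="..." on a new line.

    Works on the serialised text, not on the DOM, and assumes the
    attribute value is delimited by double quotes.
    """
    def rewrap(match):
        opening, value, closing = match.groups()
        return opening + value.replace(' L', '\nL') + closing
    # Require whitespace before d= so that id="..." is left alone.
    return re.sub(r'(\sd=")([^"]*)(")', rewrap, serialised)


print(rewrap_path_data(dom.documentElement.toxml()))
```

Re-parsing the rewrapped text yields the same d value as before, so the
clean-up only changes the layout of the file, not the path data.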
I have verified that these issues are present in python-xml-0.8.4-10.1.
-- System Information:
Debian Release: squeeze/sid
APT prefers testing
APT policy: (500, 'testing'), (500, 'stable')
Architecture: i386 (i686)
Kernel: Linux 2.6.30-2-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.ISO-8859-15, LC_CTYPE=en_GB.ISO-8859-15 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/bash
Versions of packages python-xml depends on:
ii libc6 2.10.1-5 GNU C Library: Shared libraries
ii python 2.5.4-2 An interactive high-level object-o
ii python-central 0.6.12+nmu1 register and build utility for Pyt
python-xml recommends no packages.
Versions of packages python-xml suggests:
pn python-xml-dbg <none> (no description available)
ii python-xml-doc 0.8.4-10.1 XML tools for Python (documentatio
-- no debconf information