[xml/sgml-pkgs] Bug#509967: libxml2 should accept any character in anyURI data (with Relax NG)
Vincent Lefevre
vincent at vinc17.org
Sun Dec 28 02:15:27 UTC 2008
Package: libxml2
Version: 2.6.32.dfsg-5
Severity: normal
When validating a file against a Relax NG schema, libxml2 rejects some
characters (e.g. the space and non-ASCII characters) in anyURI data, although
the specs allow them. Indeed, http://www.w3.org/TR/xmlschema-2/#anyURI says:
--------
3.2.17.1 Lexical representation
The ·lexical space· of anyURI is finite-length character sequences which,
when the algorithm defined in Section 5.4 of [XML Linking Language] is
applied to them, result in strings which are legal URIs according to
[RFC 2396], as amended by [RFC 2732].
Note: Spaces are, in principle, allowed in the ·lexical space· of
anyURI, however, their use is highly discouraged (unless they are
encoded by %20).
--------
The goal of the algorithm defined in Section 5.4 of [XML Linking Language]
is precisely to encode/escape these disallowed characters in order to form
a valid URI.
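For illustration, here is a minimal Python sketch of that escaping step as
I understand it from Section 5.4 of XLink and Section 2.4.3 of RFC 2396. It
is not libxml2's code. The function name xlink_escape and the exact set of
escaped ASCII characters are my own reading of those sections (the excluded
characters, minus "#", "%" and the square brackets re-allowed by RFC 2732),
so treat both as assumptions.

```python
# ASCII characters excluded by RFC 2396 section 2.4.3, except '#', '%',
# '[' and ']', which XLink 5.4 leaves alone (assumption: my reading of the spec).
_EXCLUDED_ASCII = set(' <>"{}|\\^`')


def xlink_escape(value: str) -> str:
    """Percent-escape disallowed characters, byte by byte, via UTF-8."""
    out = []
    for ch in value:
        code = ord(ch)
        # Control characters, DEL, non-ASCII and the excluded set are escaped.
        if code < 0x20 or code > 0x7E or ch in _EXCLUDED_ASCII:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(xlink_escape("http://example.org/my file.xml"))
    print(xlink_escape("http://example.org/caf\u00e9"))
```

With this reading, a value containing a space or "é" becomes a string using
only characters that RFC 2396 permits, which is what a validator would then
have to check.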
There are still some restrictions, since the escaped result must be a legal
URI (RFC 2396 / RFC 2732). For instance, a space must not appear in the
scheme part.
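As a rough sketch of that restriction, the following Python check looks only
at the scheme part, using the RFC 2396 grammar
(scheme = alpha *( alpha | digit | "+" | "-" | "." )). The function name
scheme_prefix_ok is hypothetical, and treating everything before the first
":" (when no "/", "?" or "#" comes earlier) as the scheme is a simplifying
assumption, not a full URI parser.

```python
import re

# RFC 2396: scheme = alpha *( alpha | digit | "+" | "-" | "." )
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*\Z")
# Candidate scheme: text before the first ':' with no '/', '?' or '#' in it.
_PREFIX = re.compile(r"([^:/?#]*):")


def scheme_prefix_ok(uri: str) -> bool:
    """Return False if the URI has a scheme-like prefix that is not a legal scheme."""
    match = _PREFIX.match(uri)
    if match is None:
        return True  # no scheme-like prefix, so nothing to check here
    return _SCHEME.match(match.group(1)) is not None


if __name__ == "__main__":
    print(scheme_prefix_ok("http://example.org/a%20b"))
    print(scheme_prefix_ok("my%20scheme:foo"))
```

A space in the scheme fails this check whether or not it has been escaped,
because neither " " nor "%" is allowed there.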
Note that nXML mode for Emacs seems to handle this correctly.
-- System Information:
Debian Release: 5.0
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.26.5-20080922 (SMP w/2 CPU cores; PREEMPT)
Locale: LANG=POSIX, LC_CTYPE=en_US.ISO8859-1 (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/bash
Versions of packages libxml2 depends on:
ii libc6 2.7-16 GNU C Library: Shared libraries
ii zlib1g 1:1.2.3.3.dfsg-12 compression library - runtime
Versions of packages libxml2 recommends:
ii xml-core 0.12 XML infrastructure and XML catalog
libxml2 suggests no packages.
-- no debconf information