[xml/sgml-pkgs] Bug#763598: docbook-xml: xmllint fails to identify local copy of docbook entities file
Raphael Hertzog
hertzog at debian.org
Wed Oct 1 07:58:01 UTC 2014
Package: docbook-xml
Version: 4.5-7.2
Severity: important
Consider the attached test document, which starts like this:
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section [
<!ENTITY % BOOK_ENTITIES SYSTEM "Users_Guide.ent">
%BOOK_ENTITIES;
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">
<!ENTITY % DOCBOOK_ENTS PUBLIC "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod">
%DOCBOOK_ENTS;
]>
Now I want to parse it (with publican, which uses libxml internally), but I always end
up loading http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod from the network instead
of finding the local copy. I can reproduce the problem with xmllint:
$ XML_DEBUG_CATALOG=1 xmllint --debugent --nonet --noent --noout test.xml
[...]
Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
0 Parsing catalog file:///etc/xml/catalog
file:///etc/xml/catalog added to file hash
file:///etc/xml/docbook-xml.xml not found in file hash
0 Parsing catalog file:///etc/xml/docbook-xml.xml
file:///etc/xml/docbook-xml.xml added to file hash
Trying system delegate file:///etc/xml/docbook-xml.xml
Resolve URI http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
[...]
This is not normal. It looks like only the system identifier (the URL) is used,
while the public identifier (for which there's a match in /etc/xml/docbook-xml.xml)
is ignored:
$ grep -- "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" /etc/xml/docbook-xml.xml
<delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
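The same check can be done for both identifiers at once. This is a minimal Python sketch, not part of any tool mentioned here: the function name `matching_entries` is hypothetical, the `<catalog>` wrapper around the quoted line is assumed, and identifiers are compared as plain strings without the normalization a real resolver applies.

```python
import xml.etree.ElementTree as ET

NS = "{urn:oasis:names:tc:entity:xmlns:xml:catalog}"

PUB_ID = "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
SYS_ID = "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"

# Assumed excerpt: only the delegatePublic line is taken from the report.
CATALOG = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
</catalog>"""


def matching_entries(catalog_xml, pub_id, sys_id):
    """Return (entry kind, target) pairs for entries matching either identifier."""
    hits = []
    for el in ET.fromstring(catalog_xml).iter():
        kind = el.tag.replace(NS, "")
        if kind == "system" and el.get("systemId") == sys_id:
            hits.append((kind, el.get("uri")))
        elif kind == "public" and el.get("publicId") == pub_id:
            hits.append((kind, el.get("uri")))
        elif kind == "delegateSystem":
            prefix = el.get("systemIdStartString")
            if prefix is not None and sys_id.startswith(prefix):
                hits.append((kind, el.get("catalog")))
        elif kind == "delegatePublic":
            prefix = el.get("publicIdStartString")
            if prefix is not None and pub_id.startswith(prefix):
                hits.append((kind, el.get("catalog")))
    return hits


for kind, target in matching_entries(CATALOG, PUB_ID, SYS_ID):
    print(kind, "->", target)
```

On this excerpt only the public identifier produces a hit; nothing matches the system identifier.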
To confirm this impression I modified /etc/xml/docbook-xml.xml to replace this line:
<delegateSystem systemIdStartString="http://docbook.org/xml/4.5/docbookx.dtd" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
With this one:
<delegateSystem systemIdStartString="http://docbook.org/xml/4.5/" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
This allowed the catalog lookup to go one step further:
Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
0 Parsing catalog file:///etc/xml/catalog
file:///etc/xml/catalog added to file hash
file:///etc/xml/docbook-xml.xml not found in file hash
0 Parsing catalog file:///etc/xml/docbook-xml.xml
file:///etc/xml/docbook-xml.xml added to file hash
Trying system delegate file:///etc/xml/docbook-xml.xml
file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml not found in file hash
0 Parsing catalog file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml added to file hash
Trying system delegate file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
Resolve URI http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
And to finally get it to work, I had to add this line in
/usr/share/xml/docbook/schema/dtd/4.5/catalog.xml:
<system systemId="http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"
uri="dbcentx.mod"/>
Now I have this:
Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
0 Parsing catalog file:///etc/xml/catalog
file:///etc/xml/catalog added to file hash
file:///etc/xml/docbook-xml.xml not found in file hash
0 Parsing catalog file:///etc/xml/docbook-xml.xml
file:///etc/xml/docbook-xml.xml added to file hash
Trying system delegate file:///etc/xml/docbook-xml.xml
file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml not found in file hash
0 Parsing catalog file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml added to file hash
Trying system delegate file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
Found system match http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod, using file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod
new input from file: file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod
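The last two log lines show the relative uri="dbcentx.mod" being resolved against the location of the catalog file that contains it. A minimal Python sketch of that step, assuming ordinary URI reference resolution and ignoring any xml:base attribute (the helper name `resolve_catalog_uri` is hypothetical):

```python
from urllib.parse import urljoin


def resolve_catalog_uri(catalog_url, uri_attr):
    # Resolve a catalog entry's uri attribute relative to the catalog file's own URL.
    return urljoin(catalog_url, uri_attr)


print(resolve_catalog_uri(
    "file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml",
    "dbcentx.mod",
))
```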
There's something fishy either in the catalog files or in the logic of libxml2; I'm not
sure which. Looking at
https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html#s.ext.resx
it looks like the catalog file is at fault, since libxml2 does the
right thing by trying the system identifier first.
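To make that reading concrete, here is a simplified Python model of the resolution order as I understand it from the specification: system entries are tried first, and once a delegateSystem entry matches, the delegated catalogs are searched with the system identifier only, with no fallback to the public identifier. This is a sketch, not libxml2's code. The `Catalog` class, the catalog contents, and the root catalog's delegate prefixes are assumptions made for illustration; `prefer` handling, `nextCatalog`, rewrite entries, and longest-prefix ordering are left out.

```python
from dataclasses import dataclass, field

ROOT = "/etc/xml/catalog"
DOCBOOK = "/etc/xml/docbook-xml.xml"
DTD45 = "/usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"

PUB_ID = "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
SYS_ID = "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"
LOCAL = "file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod"


@dataclass
class Catalog:
    system: dict = field(default_factory=dict)           # systemId -> uri
    public: dict = field(default_factory=dict)           # publicId -> uri
    delegate_system: list = field(default_factory=list)  # (prefix, catalog name)
    delegate_public: list = field(default_factory=list)  # (prefix, catalog name)


def resolve(catalogs, name, pub_id, sys_id):
    """Resolve an external identifier starting at catalog `name`; None means no match."""
    cat = catalogs[name]
    if sys_id is not None:
        if sys_id in cat.system:
            return cat.system[sys_id]
        targets = [t for prefix, t in cat.delegate_system if sys_id.startswith(prefix)]
        if targets:
            # Delegated lookups carry the system identifier only; the public
            # identifier is dropped and this catalog is not consulted again.
            for target in targets:
                uri = resolve(catalogs, target, None, sys_id)
                if uri is not None:
                    return uri
            return None
    if pub_id is not None:
        if pub_id in cat.public:
            return cat.public[pub_id]
        targets = [t for prefix, t in cat.delegate_public if pub_id.startswith(prefix)]
        if targets:
            for target in targets:
                uri = resolve(catalogs, target, pub_id, None)
                if uri is not None:
                    return uri
            return None
    return None


# Hypothetical contents modeled on the unmodified setup described in the report.
CATALOGS = {
    ROOT: Catalog(
        delegate_system=[("http://www.oasis-open.org/docbook/", DOCBOOK)],  # assumed prefix
        delegate_public=[("-//OASIS//ENTITIES DocBook", DOCBOOK)],          # assumed prefix
    ),
    DOCBOOK: Catalog(delegate_public=[(PUB_ID, DTD45)]),
    DTD45: Catalog(public={PUB_ID: LOCAL}),  # assumed public entry, no system entry
}

print(resolve(CATALOGS, ROOT, PUB_ID, SYS_ID))  # both identifiers, as in the test document
print(resolve(CATALOGS, ROOT, PUB_ID, None))    # public identifier alone
```

In this model the public entry is reachable when the public identifier is used alone, but the matching system delegation in the root catalog shadows it as soon as a system identifier is present, which is consistent with the debug output above.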
FWIW, I investigated this with the upstream author of Publican in this bugzilla ticket:
https://bugzilla.redhat.com/show_bug.cgi?id=1143060#c18
This is really blocking me from releasing the new version of Publican in Debian, so it
would be nice to find a fix quickly if possible, because the freeze is approaching.
-- System Information:
Debian Release: jessie/sid
APT prefers squeeze-lts
APT policy: (500, 'squeeze-lts'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (500, 'oldstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.16-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=fr_FR.utf8, LC_CTYPE=fr_FR.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages docbook-xml depends on:
ii sgml-base 1.26+nmu4
ii sgml-data 2.0.9-1
ii xml-core 0.13+nmu2
docbook-xml recommends no packages.
Versions of packages docbook-xml suggests:
ii docbook 4.5-5.1
pn docbook-defguide <none>
ii docbook-dsssl 1.79-7
ii docbook-xsl 1.78.1+dfsg-1
-- no debconf information
-- debsums errors found:
debsums: changed file /usr/share/xml/docbook/schema/dtd/4.5/catalog.xml (from docbook-xml package)
--
Raphaël Hertzog ◈ Debian Developer
Discover the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.xml
Type: application/xml
Size: 11096 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/debian-xml-sgml-pkgs/attachments/20141001/139105e2/attachment.xml>