[xml/sgml-pkgs] Bug#993638: Bug#993638: libxml2: XHTML 1.0 validation is broken
Vincent Lefevre
vincent at vinc17.net
Mon Sep 20 16:32:04 BST 2021
On 2021-09-20 17:08:26 +0200, Thorsten Glaser wrote:
> On Mon, 20 Sep 2021, Vincent Lefevre wrote:
>
> > Then libxml2 can find the right file on the local file system via
> > catalogs. In my case (which is the *default* setup with Debian
>
> I never understood this catalogue thing. When I tried it, it didn’t
> work for me (that may admittedly have been multiple releases ago),
> the documentation was as good as Chinese to me, and… meh.
The catalog system was very buggy in the past. I reported many
bugs in 2004. Things have improved a lot since then; the latest
bugs I found were in 2012.
> > Hmm... there seems to be a subtle difference in xhtml-special.ent:
>
> Interesting.
>
> I’m working with an XHTML 1.1 DTD, which has the entities inline
> (not sure if that was my doing or if I got it like this) and it
> too has:
>
> <!-- C0 Controls and Basic Latin -->
> <!ENTITY quot """> <!-- quotation mark, U+0022 ISOnum -->
> <!ENTITY amp "&"> <!-- ampersand, U+0026 ISOnum -->
> <!ENTITY lt "<"> <!-- less-than sign, U+003C ISOnum -->
> <!ENTITY gt ">"> <!-- greater-than sign, U+003E ISOnum -->
> <!-- note: not specified in HTML 4 -->
> <!ENTITY apos "'"> <!-- apostrophe = APL quote, U+0027 ISOnum -->
For the 1.1 DTD, w3c-dtd-xhtml 1.1-5 had the *upstream* file
xhtml-1.1/basic/xhtml-special.ent with the buggy entity definitions
<!ENTITY quot "&#34;" ><!-- quotation mark = APL quote, U+0022 ISOnum -->
<!ENTITY amp "&#38;" ><!-- ampersand, U+0026 ISOnum -->
<!ENTITY lt "&#60;" ><!-- less-than sign, U+003C ISOnum -->
<!ENTITY gt "&#62;" ><!-- greater-than sign, U+003E ISOnum -->
In w3c-sgml-lib, the xhtml-special.ent file no longer depends on
the XHTML version, and it has correct definitions.
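To illustrate why an entity whose replacement text is a bare "<" is a problem, here is a minimal sketch. It is only an illustration under assumptions: it uses Python's xml.etree.ElementTree (expat) rather than libxml2, and a hypothetical entity name "less" instead of "lt", since parsers may treat the predefined name specially. It compares a single-escaped and a double-escaped definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity name "less": parsers may special-case the predefined
# name "lt", so a fresh name is used to expose the expansion rules.
TEMPLATE = '<!DOCTYPE p [ <!ENTITY less "{value}"> ]><p>a &less; b</p>'


def expand(value):
    """Return the text of <p> after entity expansion, or None on a parse error."""
    try:
        return ET.fromstring(TEMPLATE.format(value=value)).text
    except ET.ParseError:
        return None


# Double-escaped form: the replacement text is "&#60;", which is re-parsed
# as a character reference when &less; is used.
print(repr(expand("&#38;#60;")))

# Single-escaped form: the replacement text is a bare "<", which is re-parsed
# as the start of markup when &less; is used.
print(repr(expand("&#60;")))
```

With the double-escaped value the reference should yield a literal less-than character in the text, while the single-escaped value should make the document fail to parse. This is the same distinction the XML specification makes for the predefined "lt" and "amp" entities.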
> But if this upstream change affects DTDs that were once released, maybe
> it should accept, but ignore, this specific wrong redeclaration.
Perhaps. This should probably be discussed with upstream first.
> Though you said the bug was introduced in a Debian package only…
> where did the package get the wrong .ent files from?
See my other message: I suppose that Debian took the XHTML 1.1
version (which was buggy) and used it with both the XHTML 1.0 and
XHTML 1.1 DTDs. This is the only plausible explanation I can see.
> If this is truly Debian-local, I agree nothing than the conflict is
> probably needed.
The XHTML 1.0 DTD issue seems Debian-local. But the XHTML 1.1 DTD
issue (which I have not tried) is an upstream one, according to the
w3c-dtd-xhtml_1.1.orig.tar.gz file, which is the upstream part I
got from https://snapshot.debian.org/package/w3c-dtd-xhtml/1.1-5/ .
--
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)