[xml/sgml-pkgs] Bug#993638: Bug#993638: libxml2: XHTML 1.0 validation is broken

Vincent Lefevre vincent at vinc17.net
Mon Sep 20 16:32:04 BST 2021


On 2021-09-20 17:08:26 +0200, Thorsten Glaser wrote:
> On Mon, 20 Sep 2021, Vincent Lefevre wrote:
> 
> > Then libxml2 can find the right file on the local file system via
> > catalogs. In my case (which is the *default* setup with Debian
> 
> I never understood this catalogue thing. When I tried it, it didn’t
> work for me (that may admittedly have been multiple releases ago),
> the documentation was as good as Chinese to me, and… meh.

The catalog system was very buggy in the past: I reported many
bugs in 2004. Things have improved a lot since then. The last
bugs I found were in 2012.
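
To make the catalog idea more concrete, here is a minimal sketch of
what a lookup does: it maps the public or system identifier from the
DOCTYPE to a local URI. This is not libxml2's code. The catalog
content and the local path in SAMPLE_CATALOG are made-up assumptions,
resolve() is a hypothetical helper, and real catalogs have more
features (prefer, delegation, rewriting, nextCatalog) that are
ignored here.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalogs.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog; the file:// path is an illustrative assumption,
# not the path used by any particular Debian package.
SAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="file:///usr/share/xml/example/xhtml1-strict.dtd"/>
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="file:///usr/share/xml/example/xhtml1-strict.dtd"/>
</catalog>
"""


def resolve(catalog_xml, public_id=None, system_id=None):
    """Return the local URI mapped to the identifiers, or None if unmapped.

    Simplified: exact-match <system> entries are tried first, then
    exact-match <public> entries.
    """
    root = ET.fromstring(catalog_xml)
    if system_id is not None:
        for entry in root.iter(f"{{{CATALOG_NS}}}system"):
            if entry.get("systemId") == system_id:
                return entry.get("uri")
    if public_id is not None:
        for entry in root.iter(f"{{{CATALOG_NS}}}public"):
            if entry.get("publicId") == public_id:
                return entry.get("uri")
    return None


if __name__ == "__main__":
    print(resolve(SAMPLE_CATALOG,
                  public_id="-//W3C//DTD XHTML 1.0 Strict//EN"))
```

When no entry matches, the function returns None, which corresponds
to the parser falling back to fetching the system identifier itself.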

> > Hmm... there seems to be a subtle difference in xhtml-special.ent:
> 
> Interesting.
> 
> I’m working with an XHTML 1.1 DTD, which has the entities inline
> (not sure if that was my doing or if I got it like this) and it
> too has:
> 
> <!-- C0 Controls and Basic Latin -->
> <!ENTITY quot    "&#34;"> <!--  quotation mark, U+0022 ISOnum -->
> <!ENTITY amp     "&#38;#38;"> <!--  ampersand, U+0026 ISOnum -->
> <!ENTITY lt      "&#38;#60;"> <!--  less-than sign, U+003C ISOnum -->
> <!ENTITY gt      "&#62;"> <!--  greater-than sign, U+003E ISOnum -->
> <!-- note: not specified in HTML 4 -->
> <!ENTITY apos    "&#39;"> <!--  apostrophe = APL quote, U+0027 ISOnum -->

For the 1.1 DTD, w3c-dtd-xhtml 1.1-5 had the *upstream* file
xhtml-1.1/basic/xhtml-special.ent with the buggy entity definitions

<!ENTITY quot    "&#34;" ><!-- quotation mark = APL quote, U+0022 ISOnum -->
<!ENTITY amp     "&#38;" ><!-- ampersand, U+0026 ISOnum -->
<!ENTITY lt      "&#60;" ><!-- less-than sign, U+003C ISOnum -->
<!ENTITY gt      "&#62;" ><!-- greater-than sign, U+003E ISOnum -->

In w3c-sgml-lib, the xhtml-special.ent file no longer depends on
the XHTML version, and it has correct definitions.
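
To show why the escaping matters for amp and lt, here is a rough
model of the two expansion steps, assuming the correct declarations
use the double-escaped form (&#38;#38; and &#38;#60;) that the XML
specification requires for these two entities. Character references
in the declared value are expanded once when the declaration is read,
and the resulting replacement text is parsed again when the entity is
referenced. The function names are hypothetical, the model only
covers plain-text entity values, and it is not how libxml2 implements
this.

```python
import re

# Numeric character reference, decimal (&#38;) or hexadecimal (&#x26;).
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(text):
    """Expand each character reference once; the output is not rescanned."""
    def to_char(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        return chr(int(hex_digits, 16) if hex_digits else int(dec_digits))
    return CHAR_REF.sub(to_char, text)


def replacement_text(literal_value):
    """Text stored for the entity when its declaration is read."""
    return expand_char_refs(literal_value)


def text_seen_by_document(literal_value):
    """Text produced by a reference to the entity in document content.

    The stored replacement text is parsed again, so a bare '&' or '<'
    left in it is an error (simplified model, text-only values).
    """
    stored = replacement_text(literal_value)
    leftover = CHAR_REF.sub("", stored)
    if "&" in leftover or "<" in leftover:
        raise ValueError(f"replacement text {stored!r} is not well-formed")
    return expand_char_refs(stored)


if __name__ == "__main__":
    for value in ("&#38;#38;", "&#38;", "&#38;#60;", "&#60;", "&#62;"):
        try:
            print(value, "->", repr(text_seen_by_document(value)))
        except ValueError as err:
            print(value, "->", "error:", err)
```

In this model the single-escaped values for amp and lt fail, while
the same single escaping is harmless for gt, since a bare '>' is
allowed in content.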

> But if this upstream change affects DTDs that were once released, maybe
> it should accept, but ignore, this specific wrong redeclaration.

Perhaps. This should probably be discussed with upstream first.

> Though you said the bug was introduced in a Debian package only…
> where did the package get the wrong .ent files from?

See my other message: I suppose that Debian took the XHTML 1.1
version (which was buggy) and used it for both the XHTML 1.0 and
XHTML 1.1 DTDs. This is the only plausible explanation I can see.

> If this is truly Debian-local, I agree nothing other than the
> conflict is probably needed.

The XHTML 1.0 DTD issue seems Debian-local. But the XHTML 1.1 DTD
issue (which I have not tried) is an upstream one, according to the
w3c-dtd-xhtml_1.1.orig.tar.gz file, which is the upstream part I
got from https://snapshot.debian.org/package/w3c-dtd-xhtml/1.1-5/ .

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


