[xml/sgml-pkgs] Bug#993638: Bug#993638: libxml2: XHTML 1.0 validation is broken

Vincent Lefevre vincent at vinc17.net
Mon Sep 20 02:55:39 BST 2021


Control: retitle -1 libxml2: XHTML 1.0 validation is broken with w3c-dtd-xhtml's xhtml-special.ent file
Control: tags -1 - unreproducible

This should be reproducible with w3c-dtd-xhtml's xhtml-special.ent file.
The summary of the actual issue is below.

On 2021-09-20 03:18:46 +0200, Vincent Lefevre wrote:
[...]
> So the issue seems to occur when reading xhtml-special.ent.
> 
> Hmm... there seems to be a subtle difference in xhtml-special.ent:
> 
> With the file from w3c-dtd-xhtml:
> 
> <!ENTITY quot    "&#34;" ><!-- quotation mark = APL quote, U+0022 ISOnum -->
> <!ENTITY amp     "&#38;" ><!-- ampersand, U+0026 ISOnum -->
> <!ENTITY lt      "&#60;" ><!-- less-than sign, U+003C ISOnum -->
> <!ENTITY gt      "&#62;" ><!-- greater-than sign, U+003E ISOnum -->
> 
> But with the file from w3c-sgml-lib:
> 
> <!ENTITY lt      "&#38;#60;" ><!-- less-than sign, U+003C ISOnum -->
> <!ENTITY gt      "&#62;" ><!-- greater-than sign, U+003E ISOnum -->
> <!ENTITY amp     "&#38;#38;" ><!-- ampersand, U+0026 ISOnum -->
> <!ENTITY apos    "&#39;" ><!-- The Apostrophe (Apostrophe Quote, APL Quote), U+0027 ISOnum -->
> <!ENTITY quot    "&#34;" ><!-- quotation mark (Quote Double), U+0022 ISOnum -->
> 
> The errors correspond to amp and lt.
> 
> Now, I don't know whether the new libxml2 version is too picky,
> or there was a real issue with the old entity files (ignored
> by all parsers until now?). In the latter case, I think that
> there should be a Breaks against w3c-dtd-xhtml.
> 
> One more thing: I've just checked on my Debian/stable machine,
> which just has w3c-sgml-lib installed:
> "xmllint --loaddtd --nonet --noout" works without any error.
> Thus switching from w3c-dtd-xhtml to w3c-sgml-lib should not
> cause any issue.

FYI, the change of xhtml-special.ent upstream seems to be in

  https://github.com/w3c/markup-validator/commit/fa78ea2526fe20a89c90c4734f704fb0126186fd

(the diff output by git seems incorrect: one needs to browse the
files from the parent d1431fc to see the old version).
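
For reference, section 4.6 of the XML 1.0 specification requires the
declarations of lt and amp to be doubly escaped: a character reference
in an entity value is expanded when the declaration is read, and the
resulting replacement text is parsed again when the entity is
referenced. A minimal sketch of that rule follows. It uses Python's
expat-based ElementTree rather than libxml2, so it only illustrates the
XML rule and not the exact libxml2 diagnostics; the entity name "e" and
the helper expand() are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def expand(entity_value: str) -> str:
    """Declare a hypothetical entity 'e' with the given literal value in an
    internal DTD subset, reference it in element content, and return the
    resulting text. Raises ET.ParseError if the document is not well-formed."""
    doc = f'<!DOCTYPE r [<!ENTITY e "{entity_value}">]><r>&e;</r>'
    return ET.fromstring(doc).text


if __name__ == "__main__":
    # The first three values are doubly escaped (or need no double escaping);
    # the last two leave a bare '&' or '<' as replacement text.
    for value in ("&#38;#38;", "&#38;#60;", "&#62;", "&#38;", "&#60;"):
        try:
            print(f"{value:10} -> {expand(value)!r}")
        except ET.ParseError as exc:
            print(f"{value:10} -> not well-formed ({exc})")
```

With expat, the singly escaped forms of amp and lt are rejected as soon
as the entity is referenced, while the doubly escaped forms expand to
"&" and "<".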

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


