[xml/sgml-pkgs] Bug#993638: Bug#993638: libxml2: XHTML 1.0 validation is broken
Mattia Rizzolo
mattia at debian.org
Sun Sep 19 21:59:31 BST 2021
On Sun, Sep 19, 2021 at 09:45:19PM +0200, Vincent Lefevre wrote:
> On 2021-09-19 19:15:54 +0200, Mattia Rizzolo wrote:
> > I can never manage to download DTDs from w3.org (how could you?!), so,
> > taking your testcase and a copy of the same DTD:
>
> The DTD is provided by Debian, no need to download it.
But you need to instruct xmllint to use said DTD; it won't decide on its
own to pick a random DTD from the filesystem. I also know how to
use apt-file myself:
| % apt-file search xhtml1-strict.dtd
| dita-ot: /usr/share/dita-ot/demo/h2d/dtd/xhtml1-strict.dtd
| erlang-erl-docgen: /usr/lib/erlang/lib/erl_docgen-1.1.1/priv/dtd/xhtml1-strict.dtd
| kate5-data: /usr/share/katexmltools/xhtml1-strict.dtd.xml
| libpxp-ocaml-dev: /usr/share/doc/libpxp-ocaml-dev/examples/namespaces/xhtml1-strict.dtd.gz
| librdf-rdfa-parser-perl: /usr/share/perl5/auto/share/dist/RDF-RDFa-Parser/catalogue/www.w3.org/MarkUp/DTD/xhtml1-strict.dtd
| w3-recs: /usr/share/doc/w3-recs/html/www.w3.org/TR/2002/REC-xhtml1-20020801/DTD/xhtml1-strict.dtd.gz
| w3c-sgml-lib: /usr/share/xml/w3c-sgml-lib/schema/dtd/REC-xhtml1-20020801/xhtml1-strict.dtd
| xemacs21-basesupport: /usr/share/xemacs21/xemacs-packages/etc/psgml-dtds/xhtml1-strict.dtd
| xmlcopyeditor: /usr/share/xmlcopyeditor/dtd/xhtml1-strict.dtd
| %
Indeed, the one I used is the one from xmlcopyeditor (I picked a random
package, trusting that its .dtd is actually the same as all of the
above).
> > mattia at warren /tmp/tmp/xml % xmllint --dtdvalid xhtml1-strict.dtd --nonet --noout test.html
> > I/O error : Attempt to load network entity http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
> > test.html:2: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
> > C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
> > ^
> > mattia at warren /tmp/tmp/xml %
> >
> > which looks good to me.
>
> An I/O error is not good. Your system appears to be broken.
My system is fine. That error message is only a red herring: --nonet
blocks fetching the DTD named in the DOCTYPE, which is all the "I/O
error" reports, and indeed the return code of xmllint is 0.
If you prefer, I can modify the DOCTYPE and do this instead, so there
won't be "I/O error"s and the return code is clear:
mattia at warren /tmp/tmp/xml % cat test.html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "file:///tmp/tmp/xml/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>title</title></head>
<body><p>text</p></body>
</html>
mattia at warren /tmp/tmp/xml % xmllint --noout --nonet test.html ; echo $?
0
mattia at warren /tmp/tmp/xml % dpkg -l libxml2|tail -n1
ii libxml2:amd64 2.9.12+dfsg-4 amd64 GNOME XML library
mattia at warren /tmp/tmp/xml %
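
To script the same check so that only the exit status decides the
outcome, something along these lines should do. It is only a sketch: the
function name validate_xhtml and the XMLLINT override variable are
made up for illustration and are not part of libxml2; the only xmllint
options it relies on are --nonet, --noout and --dtdvalid, as used
earlier in this thread.

```shell
#!/bin/sh
# Sketch: validate a document against a local DTD and judge the result by
# xmllint's exit status rather than by whatever it prints on stderr.
# validate_xhtml and XMLLINT are illustrative names (assumptions).
validate_xhtml() {
    file=$1
    dtd=$2
    # --nonet forbids network fetches; --noout suppresses the parsed output.
    if "${XMLLINT:-xmllint}" --nonet --noout --dtdvalid "$dtd" "$file"; then
        echo "valid: $file"
    else
        status=$?
        echo "invalid: $file (xmllint exit status $status)"
        return "$status"
    fi
}
```

For example, with the copy shipped by w3c-sgml-lib:
validate_xhtml test.html /usr/share/xml/w3c-sgml-lib/schema/dtd/REC-xhtml1-20020801/xhtml1-strict.dtd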
--
regards,
Mattia Rizzolo
GPG Key: 66AE 2B4A FCCF 3F52 DA18 4D18 4B04 3FCD B944 4540 .''`.
More about me: https://mapreri.org : :' :
Launchpad user: https://launchpad.net/~mapreri `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia `-