Bug#1003810: libxml-libxml-perl: getElementById is broken (regression)

Vincent Lefevre vincent at vinc17.net
Sun Feb 6 21:17:32 GMT 2022


Control: retitle -1 libxml-libxml-perl: when XML::LibXML::Parser validation is set to 1, the DTD is not read

Hi,

On 2022-02-06 21:15:29 +0100, gregor herrmann wrote:
> What I see in the upstream Changes file are some behaviour changes,
> eg.
>     - Disable loading external DTDs or external entities by default
> maybe they are related to the problems you're experiencing?

Indeed, this is caused by this change: I've looked at the strace
output, and the DTD is read only with the old libxml-libxml-perl
version (2.0134+dfsg-2). But in my script, I do:

  my $parser = XML::LibXML->new();
  $parser->validation(1);
  my $movies = $parser->parse_file($mfile);

and since I request validation, the DTD should be read.

FYI, the XML::LibXML::Parser(3pm) man page says:

    validation
        /parser, reader/

        validate with the DTD; possible values are 0 and 1
        ^^^^^^^^^^^^^^^^^^^^^

So this is a major breakage for users who want validation.

This issue can be solved by adding

  $parser->load_ext_dtd(1);
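
For reference, here is a minimal self-contained sketch of the workaround.
The file names, the tiny DTD, and the write_file helper are made up for
illustration (they are not from my actual script). It assumes that the
DTD declares an ID attribute, which is what getElementById relies on:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use XML::LibXML;

# Hypothetical helper: write a string to a file.
sub write_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
}

# Hypothetical sample data in a temporary directory.
my $dir   = tempdir(CLEANUP => 1);
my $mfile = File::Spec->catfile($dir, 'movies.xml');

# External DTD subset: declares "id" as an attribute of type ID.
write_file(File::Spec->catfile($dir, 'movies.dtd'), <<'DTD');
<!ELEMENT movies (movie*)>
<!ELEMENT movie (#PCDATA)>
<!ATTLIST movie id ID #REQUIRED>
DTD

# Document that refers to the DTD via a relative system identifier.
write_file($mfile, <<'XML');
<?xml version="1.0"?>
<!DOCTYPE movies SYSTEM "movies.dtd">
<movies>
  <movie id="m1">Example</movie>
</movies>
XML

my $parser = XML::LibXML->new();
$parser->validation(1);
$parser->load_ext_dtd(1);   # set explicitly instead of relying on the default
my $movies = $parser->parse_file($mfile);

# IDs are only known once the DTD has been read.
my $node = $movies->getElementById('m1');
print defined $node ? $node->textContent : 'not found', "\n";
```

The same options can also be passed to the constructor, e.g.
XML::LibXML->new(validation => 1, load_ext_dtd => 1).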

But the XML::LibXML::Parser(3pm) man page still says:

  Creating a Parser Instance
[...]
    new
[...]
    [...] Unless specified otherwise, the options "load_ext_dtd", and
    "expand_entities" are set to 1. [...]
[...]
    load_ext_dtd
        /parser, reader/

        load the external DTD subset while parsing; possible values are 0
        and 1. Unless specified, XML::LibXML sets this option to 1.

So the documentation is inconsistent with the actual behavior.

I've updated the bug title concerning my issue with validation set to 1,
but there's also this inconsistency with the man page.

BTW, I wonder whether this also breaks XML::LibXSLT, which needs the
DTD to be read.

Note also that if upstream really wants to change the default behavior
(e.g. for scripts that do not set validation to 1 -- I've checked that
the DTD was loaded in this case with the old libxml-libxml-perl version),
this is likely to break even more existing scripts, possibly in obscure
ways, as users were relying on documented behavior.

I suggest setting the severity back to grave, at least until this
mess is clarified.

If the final choice is likely to break existing scripts, this must
absolutely be announced in the NEWS.Debian file.

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


