[xml/sgml-pkgs] Bug#574104: Bug#574104: libxml2: considers null bytes as EOF markers

Jakub Wilk jwilk at debian.org
Tue Mar 16 11:37:05 UTC 2010

* Mike Hommey <mh at glandium.org>, 2010-03-16, 12:23:
>> libxml2 ignores null bytes (and following bytes) in an XML file:
>> $ printf '<test/>\0junk' | xmlwf
>> STDIN:1:7: not well-formed (invalid token)
>> $ printf '<test/>\0junk' | xmllint -
>> <?xml version="1.0"?>
>> <test/>
>For starters, libxml2 treats your data as UTF-8, and as such uses
>null-terminated strings, so this is not unexpected behaviour.

Huh? Why should I care about such implementation details? I care about 
behaviour, which is broken. (Anyway, UTF-8 and null-terminated string 
are *unrelated* concepts.)
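
To illustrate that last point, here is a minimal Python sketch. It is not
libxml2 code and says nothing about libxml2's internals; the helper name
c_string_view is made up for this example. UTF-8 encodes U+0000 as a single
0x00 byte like any other character, and it is only the choice of a
NUL-terminated representation that drops the rest of the data:

```python
import ctypes

# UTF-8 encodes U+0000 as one ordinary 0x00 byte; nothing is lost here.
data = "<test/>\u0000junk".encode("utf-8")


def c_string_view(buf: bytes) -> bytes:
    """Return what a reader of NUL-terminated C strings would see in buf.

    Hypothetical helper for illustration only. ctypes copies buf into a
    char array; .value stops at the first NUL byte, whereas .raw would
    return the whole array.
    """
    return ctypes.create_string_buffer(buf).value


print(len(data))            # length-aware view: all 12 bytes are present
print(c_string_view(data))  # NUL-terminated view: b'<test/>'
```

The encoding keeps all twelve bytes. Only the NUL-terminated view loses
"junk", so the truncation comes from the string representation, not from
UTF-8.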

>Secondly, the null character is not allowed in an XML file.

That's my point. It is not allowed, yet xmllint happily accepts files 
containing it as well-formed.
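
For reference, the behaviour I would expect can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not a proposed
patch: it assumes the input is UTF-8, the function name first_forbidden_char
is invented here, and it checks nothing except the Char production from the
XML 1.0 specification (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]), which excludes U+0000:

```python
import re

# Anything outside the XML 1.0 "Char" production is forbidden in a document.
_FORBIDDEN = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_forbidden_char(data: bytes):
    """Return (offset, char) of the first forbidden character, or None.

    Hypothetical helper; assumes UTF-8 input (invalid UTF-8 raises
    UnicodeDecodeError). The offset counts characters, not bytes.
    """
    text = data.decode("utf-8")
    match = _FORBIDDEN.search(text)
    if match is None:
        return None
    return match.start(), match.group()


print(first_forbidden_char(b"<test/>\0junk"))  # (7, '\x00')
print(first_forbidden_char(b"<test/>"))        # None
```

The offset 7 agrees with the 1:7 position that xmlwf reports for the same
input.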

Jakub Wilk