Bug#711448: libhtml-copy-perl: FTBFS with perl 5.18: test failure
Dominic Hargreaves
dom at earth.li
Tue Jun 18 20:08:43 UTC 2013
On Mon, Jun 17, 2013 at 08:35:28PM +0200, gregor herrmann wrote:
> Control: tag -1 + patch
>
> On Thu, 06 Jun 2013 22:45:23 +0100, Dominic Hargreaves wrote:
>
> > Strings with code points over 0xFF may not be mapped into in-memory file handles
> > readline() on closed filehandle $in at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 255.
> > Use of uninitialized value in subroutine entry at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 258.
> > Use of uninitialized value in concatenation (.) or string at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 276.
> > Can't guess encoding of at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 276.
> > # Looks like you planned 16 tests but ran 6.
> > # Looks like your test exited with 255 just after 6.
> > t/parse.t ....
> > Dubious, test returned 255 (wstat 65280, 0xff00)
> > Failed 1/2 test programs. 0/7 subtests failed.
> > Failed 10/16 subtests
>
> "Strings with code points over 0xFF may not be mapped into in-memory file handles"
> happens at t/parse.t, line 181:
> open my $in, "<", \$src_html_utf8;
> (where $src_html_utf8 contains HTML with some nice characters (ああ) in
> it).
>
> perldiag says:
>
> Strings with code points over 0xFF may not be mapped into in-memory file handles
>
> (W utf8) You tried to open a reference to a scalar for read or
> append where the scalar contained code points over 0xFF.
> In-memory files model on-disk files and can only contain bytes.
>
>
> Some searching indicates that strategically adding some
> encode_utf8() calls to the test might help ... Let's try ... Ok, here we are:
>
> #v+
> diff --git a/t/parse.t b/t/parse.t
> index 1550268..15eb8c6 100644
> --- a/t/parse.t
> +++ b/t/parse.t
> @@ -6,6 +6,7 @@ use HTML::Copy;
> use utf8;
> use File::Spec::Functions;
> #use Data::Dumper;
> +use Encode qw(encode_utf8 decode_utf8);
>
> use Test::More tests => 16;
>
> @@ -109,7 +110,7 @@ $copy_html = do {
> ok($copy_html eq $result_html_nocharset, "copy_to no charset shift_jis");
>
> ##== HTML with charset uft-8
> -my $src_html_utf8 = <<EOT;
> +my $src_html_utf8 = encode_utf8(<<EOT);
> <!DOCTYPE html>
> <html>
> <head>
> @@ -126,7 +127,7 @@ my $src_html_utf8 = <<EOT;
> </html>
> EOT
>
> -my $result_html_utf8 = <<EOT;
> +my $result_html_utf8 = encode_utf8(<<EOT);
> <!DOCTYPE html>
> <html>
> <head>
> @@ -174,7 +175,7 @@ $copy_html = do {
> read_and_unlink($destination, $p);
> };
>
> -ok($copy_html eq $result_html_utf8, "copy_to giviing a file handle");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing a file handle");
>
> ##=== copy_to gving file handles for input and output
> $copy_html = do {
> @@ -187,7 +188,7 @@ $copy_html = do {
> Encode::decode($p->encoding, $outdata);
> };
>
> -ok($copy_html eq $result_html_utf8, "copy_to giviing file handles for input and output");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing file handles for input and output");
>
> ##=== parse_to giving a file handle
> $copy_html = do {
> @@ -196,7 +197,7 @@ $copy_html = do {
> $p->parse_to($destination);
> };
>
> -ok($copy_html eq $result_html_utf8, "copy_to giviing file handles for input and output");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing file handles for input and output");
>
> ##=== copy_to with directory destination
> $copy_html = do {
> #v-
>
>
> I'm committing this now but some sanity check would be appreciated.

At a glance, this seems sane, but I guess upstream should be given a
chance to comment too (whether before or after you upload the fix
to Debian).
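
For anyone following along, the failure and the fix can be reproduced
outside the test suite. Below is a minimal sketch, assuming perl 5.18 or
later and the core Encode module; the variable names ($chars, $bytes,
$roundtrip) are made up for illustration and do not come from t/parse.t:

```perl
use strict;
use warnings;
use utf8;                                 # string literals below are character strings
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical stand-in for the HTML heredoc in t/parse.t.
my $chars = "<p>ああ</p>\n";              # contains code points over 0xFF

# In-memory file handles can only hold bytes, so encode before opening.
my $bytes = encode_utf8($chars);
open(my $in, '<', \$bytes) or die "open failed: $!";
my $line = <$in>;
close($in);

# What is read back is still bytes; decode before comparing with characters.
my $roundtrip = decode_utf8($line);

# The encoded bytes themselves do not compare equal to the character string.
my $bytes_differ = ($bytes ne $chars) ? 1 : 0;

print "round trip ok\n"                    if $roundtrip eq $chars;
print "raw bytes differ from characters\n" if $bytes_differ;
```

This mirrors the shape of the patch: the heredocs are encoded before being
handed to open, and the expected result is decoded again wherever it is
compared against output that has already been decoded.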
Cheers,
Dominic.