Bug#787821: libhtml-parser-perl: encode_entities() convert chars to à instead of their proper entity
Mathieu ROY
yeupou at gnu.org
Sat Jun 6 12:55:13 UTC 2015
Hi Gregor,
On Friday 5 June 2015, 17:21:18, gregor herrmann wrote:
> In this case I'd probably try with "use utf8::all;" or tell open()
> about the encoding:
> > $ cat test.pl
> >
> > #!/usr/bin/perl
> > use utf8;
> > use HTML::Entities;
> >
> > open(INPUT, "< testdata");
>
> open(my $fh,'<:encoding(utf8)', 'testdata');
>
> (Untested.)
Tested, it works.
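
For the record, here is a minimal sketch of what was tested, assuming the input file really is UTF-8. The helper name encode_utf8_file is made up for illustration; only open() with an encoding layer and encode_entities() come from the discussion above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

# Hypothetical helper (not part of HTML::Entities): read a file that is
# assumed to be UTF-8 and return its content with entities encoded.
sub encode_utf8_file {
    my ($path) = @_;

    # The :encoding(UTF-8) layer turns the bytes on disk into Perl
    # characters, so encode_entities() sees one character per letter.
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    local $/;                 # slurp the whole file at once
    my $text = <$fh>;
    close($fh);

    return encode_entities($text);
}

# Usage: perl test.pl testdata
print encode_utf8_file($ARGV[0]) if @ARGV;
```

Without the encoding layer, the two bytes of a UTF-8 "à" are handled as two separate Latin-1 characters, which is where the unexpected &Atilde; came from.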
But then again, this approach only works if we are 100% sure that the input is always UTF-8, which is
not the case for my script. So I'm back to testing the input, and decoding it myself is still the easier option.
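
One way to test the input and decode it could look like the sketch below. It is only an illustration: the helper names decode_guess and encode_entities_guess are hypothetical, and the fallback to ISO-8859-1 is an assumption about what non-UTF-8 input would be, not something the script is known to rely on.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities;

# Hypothetical helper: decode raw bytes as UTF-8 when they are valid
# UTF-8, otherwise fall back to ISO-8859-1 (assumed fallback encoding).
sub decode_guess {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8 and LEAVE_SRC keeps
    # $bytes untouched, so the fallback below still sees the full input.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return defined $text ? $text : decode('ISO-8859-1', $bytes);
}

# Hypothetical wrapper: entity-encode bytes of unknown encoding.
sub encode_entities_guess {
    my ($bytes) = @_;
    return encode_entities(decode_guess($bytes));
}
```

With this, the filehandle is read as raw bytes (no encoding layer) and each chunk goes through encode_entities_guess(). Input that happens to be valid UTF-8 by accident would still be decoded as UTF-8, so this is a heuristic rather than real detection.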
So I guess that, apart from the --utf8 option missing from the pod2man call, there is no bug here and this
report can be closed.
Still, even though, as pointed out, I could have found the answer in the general Perl documentation on
encoding, a single line about it in the HTML::Entities man page could be useful.
Nowadays, input is very often UTF-8.
--
http://yeupou.wordpress.com/