Bug#787821: libhtml-parser-perl: encode_entities() convert chars to à instead of their proper entity

Mathieu ROY yeupou at gnu.org
Sat Jun 6 12:55:13 UTC 2015


Hi Gregor,

On Friday 5 June 2015, 17:21:18, gregor herrmann wrote:
 
> In this case I'd probably try with "use utf8::all;" or tell open()
> about the encoding:
> >   $ cat test.pl
> > 
> > #!/usr/bin/perl
> > use utf8;
> > use HTML::Entities;
> > 
> > open(INPUT, "< testdata");
> 
> open(my $fh,'<:encoding(utf8)', 'testdata');
> 
> (Untested.)

Tested, it works.

But then again, this can only be done this way if we are 100% sure that the input is always UTF-8, which is
not the case for my script. So I'm back to testing the input, and it is still easier to simply decode it.
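
A minimal sketch of that "test the input, then decode" approach, for reference. The helper name decode_input is hypothetical, and falling back to ISO-8859-1 when the bytes are not valid UTF-8 is an assumption; a real script would use whatever legacy encoding it actually expects.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical helper: turn raw input bytes into a Perl character string.
# Strict UTF-8 is tried first; the ISO-8859-1 fallback is an assumption.
sub decode_input {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return $chars if defined $chars;
    return decode('ISO-8859-1', $bytes);
}

my $utf8_bytes   = "\xC3\xA0";    # "à" encoded as UTF-8
my $latin1_bytes = "\xE0";        # "à" encoded as ISO-8859-1

# Both inputs decode to the same character, so both yield the same entity.
print encode_entities(decode_input($utf8_bytes)),   "\n";
print encode_entities(decode_input($latin1_bytes)), "\n";
```

Without the decode step, encode_entities() sees the two UTF-8 bytes as two separate Latin-1 characters, which is the wrong output this report started from.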

I guess, then, that apart from the missing --utf8 for pod2man there is no bug here and this report can be
closed.

Still, even though, as pointed out, I could have found the answer by checking the general Perl documentation on
encoding, a single line about it in the HTML::Entities man page could be useful.
Nowadays, input can very often be expected to be UTF-8.


-- 
http://yeupou.wordpress.com/