Bug#586045: File::Find no_chdir misbehaves when $dir is_utf8

Joey Hess joeyh at debian.org
Tue Jun 15 22:30:15 UTC 2010


Package: perl-modules
Version: 5.10.1-13
Severity: normal

File::Find has a problem in no_chdir mode when the directory name
it is given has the utf8 flag set, and the directory contains a file
with a non-ASCII (utf-8 encoded) name.

Test program:

joey at gnu:~>cat test
use utf8;
use Encode;
use File::Find;
my $dir=shift;
if (shift) {
	$dir=decode_utf8($dir);
}
print Encode::is_utf8($dir)."\n";
find({ 
	wanted => sub {
		my $f=decode_utf8($_);
		$f=~s/ü/mango/g;
		print "$f\n";
	},
	no_chdir => 1}, 
$dir)

Here it works as expected; the wanted function is able to decode_utf8($_)
and operate on individual unicode characters:

joey at gnu:~>find fooü      
fooü
fooü/ü
joey at gnu:~>perl test fooü  

foomango
foomango/mango

But what if the directory passed to File::Find has the utf8 flag set?

joey at gnu:~>perl test fooü 1
1
foomango
foomango/ü

What's going on is that File::Find concatenates the $dir, which has
the utf8 flag set, with a filename, which has not been decoded from utf8.
The resulting string has the utf8 flag set, so when the wanted function
runs decode_utf8 on it, nothing is done, and the string remains partially
decoded (the directory part) and partially still utf8 encoded (the
filename part).
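
The concatenation described above can be reproduced without File::Find.
This is only a minimal sketch, not File::Find's actual code: the variable
names and the hard-coded byte strings are hypothetical stand-ins for the
directory argument and for an entry name as it comes back from the
filesystem. The last part shows one possible workaround, assuming the
caller is free to pass bytes rather than characters to find().

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8 is_utf8);

# Hypothetical stand-ins: "\xc3\xbc" is the UTF-8 byte sequence for U+00FC.
my $dir_bytes = "foo\xc3\xbc";            # directory name as raw bytes
my $dir_chars = decode_utf8($dir_bytes);  # same name, utf8 flag set
my $name      = "\xc3\xbc";               # entry name, still raw bytes

# Joining a flagged string with a byte string upgrades the bytes as if
# they were Latin-1, so the one-character entry name becomes two characters.
my $mixed = "$dir_chars/$name";
printf "flagged: %d, length: %d\n", (is_utf8($mixed) ? 1 : 0), length $mixed;

# Possible workaround (an assumption, not a fix in File::Find itself):
# keep the whole path as bytes, and decode only once, in the wanted function.
my $path = encode_utf8($dir_chars) . "/$name";
my $f    = decode_utf8($path);
printf "length after decode: %d\n", length $f;
```

This prints "flagged: 1, length: 7" for the mixed string and
"length after decode: 6" for the all-bytes path. In the test program
above, the equivalent change would be to pass encode_utf8($dir) to find()
when $dir has the flag set.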

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.32-5-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages perl-modules depends on:
ii  perl                          5.10.1-13  Larry Wall's Practical Extraction 

perl-modules recommends no packages.

perl-modules suggests no packages.

-- no debconf information

-- 
see shy jo loves mangos