Bug#529305: \w doesn't match c-cedilla, o-diaeresis and u-diaeresis under tr_TR.utf8 and de_DE.utf8 locales

Damyan Ivanov dmn at debian.org
Tue May 19 07:19:31 UTC 2009


retitle 529305 "use locale;" breaks \w on matching c-cedilla, o-diaeresis and u-diaeresis under tr_TR.utf8 and de_DE.utf8 locales
found 529305 5.10.0-19
found 529305 5.8.8-7etch6
thanks

-=| Damyan Ivanov, Mon, May 18, 2009 at 05:52:09PM +0300 |=-
> Showcase:
> (requires the tr_TR.utf8 and de_DE.utf8 locales, installed either via
> 'dpkg-reconfigure locales' or via the locales-all package)
> 
>  #!/usr/bin/perl
>  use strict;
>  use warnings;
>  use POSIX qw(setlocale LC_ALL);
>  setlocale(LC_ALL, "tr_TR.utf8");
>  print "Locale is ", setlocale(LC_ALL), "\n";
> 
>  use locale;
>  use utf8;
>  binmode STDOUT, ":utf8";
> 
>  print "$_ is " . ( /\w/ ? "" : "not " ) . "a word character\n"
>     for qw( ç ö ş ü ğ ı İ );
> 
> The output is
> 
>  Locale is tr_TR.utf8
>  ç is not a word character
>  ö is not a word character
>  ş is a word character
>  ü is not a word character
>  ğ is a word character
>  ı is a word character
>  İ is a word character

The trigger seems to be "use locale;". Without that line, \w matches all 
seven characters.
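
For comparison, here is a minimal sketch of the same check without "use 
locale;", which also prints each character's code point. The helper name 
describe() and the "below 256" / "above 255" labels are mine and purely 
illustrative. The code-point split is only an observation that lines up 
with the output above (the three failing characters are the ones below 
256); I have not confirmed it as the cause.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                      # source literals are decoded to characters
binmode STDOUT, ":utf8";

# Same characters as in the showcase; note that there is no "use locale;".
my @chars = qw( ç ö ş ü ğ ı İ );

# Hypothetical helper: build one report line for a single character,
# giving its code point, which side of 256 it falls on, and the \w result.
sub describe {
    my ($ch) = @_;
    my $cp = ord $ch;
    return sprintf "%s U+%04X %s %s", $ch, $cp,
        ( $cp < 256 ? "below 256" : "above 255" ),
        ( $ch =~ /\w/ ? "matches \\w" : "does not match \\w" );
}

print describe($_), "\n" for @chars;
```

Here ç, ö and ü are U+00E7, U+00F6 and U+00FC, while ş, ğ, ı and İ are 
U+015F, U+011F, U+0131 and U+0130.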

-- 
dam





