Bug#318058: works fine under libdbd-mysql-perl 4.008-1 in unstable

Alex Muntada alexm at alexm.org
Fri Apr 10 13:02:15 UTC 2009


* WK <wk at hot.ee>:

> I think the problem is here. I use a full utf8 environment,
> so all those variables are UTF8 in my case. As far as I
> understand, they should be.

I installed mysql-server on sid without changing a single
file, so the latin1 values seem to be the defaults to me.

> So, i think you can now reproduce my test, if you change
> those to UTF8. Unless you say, mysql should not run fully
> UTF8.

Not for me to say if mysql should run at full UTF8 power ;-)
Anyway, I made the following changes in order to get it on
UTF8 steroids:

[client]
default-character-set=utf8

[mysqld]
character-set-server=utf8
collation-server=utf8_estonian_ci

After verifying that both the server and the client now report
utf8 in \s, I dropped the utf8 database, created it again and
ran the test again, which now shows this:

1 ��

> BTW, what does this command line output for you (while
> you use Latin1 for mysql):
>
> echo "select * from utf8.utf8" | mysql

id	string
1	šš ðð

> I got correct answer with UTF8 in power.

After rebuilding the database from scratch, me too:

id	string
1	šš ðð

> I tried some other UTF8 locales too, en_GB.UTF-8 for
> example did not help. I think my locale is properly set up.

I finally found the problem: DBD::mysql does not treat data as
UTF8 by default; please search for «mysql_enable_utf8» in
perldoc DBD::mysql. It doesn't matter that the input strings and
the database are both in UTF8. You have to enable UTF8
explicitly like this:

my $dbh = DBI->connect(
    $data_source,
    $user,
    $password,
    { mysql_enable_utf8 => 1 },
) or die "no connection: $DBI::errstr\n";

Alternatively, you can also use this:

$dbh->do("SET NAMES utf8");

To check whether UTF8 is enabled or not, you can use
the data_string_desc($string) function from DBI. I added
a few lines to your test.pl at the end:

use Data::Dump 'pp';
pp @data;
print data_string_desc($data[1]), "\n";

And these are the results (it's weird that setting the
utf8 attribute and using SET NAMES give different results):

== no mysql_enable_utf8 and no SET NAMES utf8:

1 ��
(1, "\x9A\x9A \xF0\xF0")
UTF8 off, non-ASCII, 5 characters 5 bytes

== no mysql_enable_utf8 but SET NAMES utf8:

1 šš ðð
(1, "\xC5\xA1\xC5\xA1 \xC3\xB0\xC3\xB0")
UTF8 off, non-ASCII, 9 characters 9 bytes

== mysql_enable_utf8 => 1:

Wide character in print at test.pl line 24.
1 šš ðð
(1, "\x{161}\x{161} \xF0\xF0")
UTF8 on, non-ASCII, 5 characters 9 bytes
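
One possible reading of the last two results: with SET NAMES utf8
alone the server sends UTF-8 encoded bytes, but Perl still sees them
as a plain byte string, while mysql_enable_utf8 also decodes them
into characters. The following is only a minimal sketch of that
difference using core modules, with no database involved. The byte
values are copied from the dumps above, and describe() is a
hypothetical helper that roughly imitates data_string_desc (it
leaves out the "non-ASCII" part).

```perl
use strict;
use warnings;
use bytes ();              # only for bytes::length, no pragma effect
use Encode qw(decode);

# Hypothetical helper, not part of DBI: reports the UTF8 flag plus
# the character count and the internal byte count of a string.
sub describe {
    my ($s) = @_;
    my $flag  = utf8::is_utf8($s) ? 'on' : 'off';
    my $chars = length $s;
    my $bytes = bytes::length($s);
    return "UTF8 $flag, $chars characters $bytes bytes";
}

# Bytes as dumped in the "SET NAMES utf8" case.
my $raw = "\xC5\xA1\xC5\xA1 \xC3\xB0\xC3\xB0";

# Decoding turns the UTF-8 bytes into Perl characters.
my $decoded = decode('UTF-8', $raw);

print describe($raw), "\n";      # UTF8 off, 9 characters 9 bytes
print describe($decoded), "\n";  # UTF8 on, 5 characters 9 bytes
```

The decoded string compares equal to "\x{161}\x{161} \xF0\xF0", the
value dumped in the mysql_enable_utf8 case.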

To get rid of that warning you should uncomment
the line with binmode STDOUT, ":utf8";
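
To illustrate what the output layer does, here is a small sketch
that writes the character string through an in-memory filehandle
instead of STDOUT, so the resulting bytes can be inspected. It
assumes the stricter ":encoding(UTF-8)" layer rather than the
":utf8" one in test.pl; that choice is mine, not something the
test script uses.

```perl
use strict;
use warnings;

# Character string as dumped in the mysql_enable_utf8 case.
my $string = "\x{161}\x{161} \xF0\xF0";

# In-memory filehandle with a UTF-8 encoding layer: characters go
# in, encoded bytes end up in $buffer, and no warning is raised.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer or die "cannot open: $!";
print {$out} $string;
close $out;

# Show the encoded bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $buffer), "\n";
# C5 A1 C5 A1 20 C3 B0 C3 B0
```

In test.pl itself the equivalent would be to call
binmode STDOUT, ':encoding(UTF-8)'; before the first print.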

Can you please confirm that this works the same
for you? If it does, I think it will indicate that this
bug could be resolved.

Hope that helps!

-- 
Alex Muntada <alexm at alexm.org>
http://alexm.org/




