[Python-modules-team] Bug#541198: python-mysqldb: utf8_bin collation will not convert to Unicode strings

Brian May bam at debian.org
Wed Nov 4 09:41:29 UTC 2015


On Wed, Aug 12, 2009 at 02:01:35PM +0200, Christoph Burgmer wrote:
> A string type column with a utf8_bin collation will not be converted to a
> Python Unicode string, but instead will be returned as a utf8 (byte) string.
> 
> The MySQL documentation though clearly states: "A nonbinary string has a
> character set and is converted to another character set in many cases, even
> when the string has a _bin collation"[1].
> 
> I understand that a string with utf8_bin collation is still a string and
> thus should not be dealt with differently. The utf8_bin collation is
> essential when working with Unicode without wanting the Unicode collation
> algorithm to kick in.
> 
> How to reproduce:
> 
> CREATE TABLE t1 (
>     a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
> );
> 
> INSERT INTO t1 VALUES ('ü');
> 
> In Python:
> >>> import MySQLdb
> >>> db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
> >>> cur = db.cursor()
> >>> cur.execute("SELECT a FROM t1;")
> 1L
> >>> cur.fetchall()
> (('\xc3\xbc',),)
> 
> Choosing utf8_general_ci instead of utf8_bin will properly yield Unicode
> objects:
> 
> >>> cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
> 1L
> >>> cur.fetchall()
> ((u'\xfc',),)
> 
> [1] http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

On Sat, Dec 10, 2011 at 02:50:27PM +0100, Philipp Spitzer wrote:
> It is still present in upstream python-mysqldb 1.2.3.

Is this bug still present in the python-mysqldb package in unstable?
That is version 1.3.6-1, which is based on the mysqlclient fork.
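
For anyone who re-tests and still gets byte strings back, a client-side
workaround is to decode the values after fetching. The following is only a
minimal sketch, not part of python-mysqldb: decode_rows is a hypothetical
helper name, it assumes Python 3 (where byte strings are the bytes type), and
it assumes the connection charset is utf8 as in the report quoted above.

```python
def decode_rows(rows, encoding="utf-8"):
    """Return rows with every bytes value decoded to str.

    Hypothetical helper, not part of MySQLdb. Values that are not
    bytes (numbers, None, already-decoded strings) pass through unchanged.
    """
    return tuple(
        tuple(v.decode(encoding) if isinstance(v, bytes) else v for v in row)
        for row in rows
    )


# Same row shape as reported for the utf8_bin column, written as Python 3 bytes.
rows = ((b"\xc3\xbc",),)
print(decode_rows(rows))
```

This hides the symptom only. It would also decode values from genuinely
binary columns (BLOB, VARBINARY), so it should be applied only to result sets
known to hold text.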
-- 
Brian May <bam at debian.org>


