[debian-mysql] Bug#1095286: mariadb: FTBFS with pcre2 10.45 due to incorrect case-insensitive regex test

Matthew Vernon matthew at debian.org
Thu Feb 6 11:30:35 GMT 2025


Source: mariadb
Version: 1:11.4.4-3
Severity: serious
Justification: FTBFS, blocks pcre2 migration

Hi,

The MariaDB REGEXP function (of which RLIKE is an alias) uses
case-insensitive matching (except for binary strings)[0].

In Perl, the Unicode character properties Uppercase_Letter,
Lowercase_Letter, and Titlecase_Letter all match Cased_Letter under /i
(case-insensitive matching)[1].

PCRE2 version 10.45 fixes its behaviour in this area to match Perl[2].
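
As a rough illustration of that rule, here is a small Python model of it. This is only a sketch under my own assumptions: it is not how Perl or PCRE2 implement matching, the function name property_matches is made up for this example, and it only handles two-letter general category names such as Lu, Ll and Lt.

```python
import unicodedata

# General categories that together make up Cased_Letter (\p{L&}).
CASED_LETTER = {"Lu", "Ll", "Lt"}


def property_matches(prop: str, ch: str, ignore_case: bool) -> bool:
    """Model whether \\p{prop} matches the single character ch.

    Hypothetical helper for illustration only; prop must be a
    two-letter general category name such as "Lu" or "Ll".
    """
    category = unicodedata.category(ch)
    if ignore_case and prop in CASED_LETTER:
        # Under /i, Lu, Ll and Lt all behave like Cased_Letter.
        return category in CASED_LETTER
    # Case-sensitive matching: the category must be exactly prop.
    return category == prop


# The two cases from the failing test, without and with /i.
for prop, ch in [("Ll", "A"), ("Lu", "я")]:
    print(prop, ch,
          property_matches(prop, ch, ignore_case=False),
          property_matches(prop, ch, ignore_case=True))
```

In this model both cases are rejected when matching is case-sensitive and accepted when it is case-insensitive, which is the difference between the old and new expected results.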

The MariaDB func_regexp_pcre.test tests currently expect the old,
incorrect behaviour in the Unicode character class tests, so the CI
tests fail[3].

For example, you can see that the test is expecting (line 3294 in CI 
output):
-\p{Ll}	A	0	

i.e. that \p{Ll} will not match 'A'. But you can see that Perl does
match it under /i:
matthew@aragorn:~$ perl -E "say 'matches' if 'A' =~ /\p{Ll}/i;"
matches

Likewise, you can see that the test is expecting (line 3321 in CI output):
-\p{Lu}	я	0

i.e. that \p{Lu} will not match 'я'. Again, Perl does match it:
matthew@aragorn:~$ perl -Mutf8 -E "say 'matches' if 'я' =~ /\p{Lu}/i;"
matches

The correct fix is either to drop the case-specific property tests here
entirely (i.e. replace line 44 of func_regexp_pcre.test with just
INSERT INTO t2 VALUES ('\\p{L}'),('\\p{L&}');) or to force the test data
to be treated as binary (which I think is less useful), and then adjust
the expected results to match.

Thanks,

Matthew

[0] https://mariadb.com/kb/en/regexp/
[1] https://perldoc.perl.org/perlunicode#Unicode-Character-Properties
[2] https://github.com/PCRE2Project/pcre2/blob/master/ChangeLog#L34C1-L37C30
[3] https://ci.debian.net/packages/m/mariadb/testing/amd64/57496769/#L3287


