Bug#864782: perl: Regexp matching crashes claiming string is malformed UTF-8, although it is valid.

Benjamin Bayart bayartb at edgard.fdn.fr
Wed Jun 14 17:16:35 UTC 2017


Package: perl
Version: 5.24.1-3
Severity: normal
Tags: upstream

Dear Maintainer,


In some cases, valid UTF-8 Chinese (or Japanese kanji) characters
in a Perl string make perl die with "Malformed UTF-8" while matching
a regexp.

Here is the smallest program (all in ASCII, for safety) that
reproduces the problem.


#!/usr/bin/perl

use strict;
use warnings;

my $text = "[quant,_1,\x{55b6}\x{696d}\x{65e5},\x{55b6}\x{696d}\x{65e5}]\x{6bce}";

eval {$text =~ s{((?<!~)(?:~~)*)\[([A-Za-z#*]\w*)(?:,([^\]]+))?\]}{"$1%$2($3)"}eg;   };

if ( $@ ) {
	die "Failed $@";
} else {
	print "Works, for now\n";
}
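
To cross-check that the string itself is well-formed, independently of the
regexp engine, it can be inspected with the core utf8:: functions. This is
only a minimal sketch: the variable names other than $text are chosen for
illustration, and it does not exercise the failing substitution.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Same string as in the test program above.
my $text = "[quant,_1,\x{55b6}\x{696d}\x{65e5},\x{55b6}\x{696d}\x{65e5}]\x{6bce}";

# utf8::is_utf8 tells whether perl stores the string with the UTF8 flag on;
# utf8::valid tells whether the internal representation is consistent.
my $flagged = utf8::is_utf8($text) ? 1 : 0;
my $valid   = utf8::valid($text)   ? 1 : 0;
my $chars   = length $text;

# Encode a copy to count the octets of the UTF-8 form (3 per CJK character).
my $bytes = $text;
utf8::encode($bytes);
my $octets = length $bytes;

# On a healthy perl: flagged=1 valid=1 chars=19 octets=33
print "flagged=$flagged valid=$valid chars=$chars octets=$octets\n";
```

If this reports valid=1 while the substitution still dies, the problem lies
in the matching code rather than in the data.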


The very same text, with the very same regexp, caused no problems on
previous versions of perl (5.20.*, 5.22.*). We use that text and that
regexp in a production environment running Debian stable, and everything
runs fine there.

Beware: it *often* crashes, but not always. When you add entropy (use-ing
some more modules, printing, etc.), the rate can drop to about one crash
every two runs of the program. This precise test script seems to crash
every time I run it in my development environment.
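
Since the failure is intermittent, a small harness that runs the test script
several times in fresh processes gives a rough failure rate. This is a sketch
under assumptions: "repro.pl" is a hypothetical file name for the test script
above, count_failures is a helper made up for this example, and 20 runs is an
arbitrary default.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Run a script $runs times, each in a fresh perl process, and count the
# runs that end with a non-zero status. $^X is the perl binary currently
# running, so the same interpreter version is tested.
sub count_failures {
    my ($script, $runs) = @_;
    my $failures = 0;
    for (1 .. $runs) {
        # List form of system() bypasses the shell; the child's output
        # goes straight to our stdout/stderr.
        my $status = system($^X, $script);
        $failures++ if $status != 0;
    }
    return $failures;
}

# "repro.pl" is a hypothetical name; pass the real path as first argument.
my $script = shift(@ARGV) // 'repro.pl';
my $runs   = shift(@ARGV) // 20;

if (-e $script) {
    my $failures = count_failures($script, $runs);
    print "$failures failure(s) out of $runs runs\n";
} else {
    print "no such script: $script\n";
}
```

Usage would be something like "perl harness.pl repro.pl 50", where harness.pl
is again just an illustrative name.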

Regards,

	Benjamin.




-- System Information:
Debian Release: 9.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.12-1-amd64 (SMP w/3 CPU cores)
Locale: LANG=en_US.ISO-8859-15, LC_CTYPE=en_US.ISO-8859-15 (charmap=ISO-8859-15), LANGUAGE=en_US.ISO-8859-15 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages perl depends on:
ii  dpkg               1.18.24
ii  libperl5.24        5.24.1-3
ii  perl-base          5.24.1-3
ii  perl-modules-5.24  5.24.1-3

Versions of packages perl recommends:
ii  netbase  5.4
ii  rename   0.20-4

Versions of packages perl suggests:
ii  libterm-readline-gnu-perl   1.35-1
ii  libterm-readline-perl-perl  1.0303-1
ii  make                        4.1-9.1
ii  perl-doc                    5.24.1-3

-- no debconf information



