Bug#839600: perl: concatenating string instead of sprintf takes all computation power of server

Sun Oct 2 16:40:32 UTC 2016

Package: perl
Version: 5.20.2-3+deb8u6
Severity: normal

Dear Maintainer,


I read a large string (about 2 megabytes) from a file, add parentheses
to the beginning and end of that string, and then parse it with regular
expressions that use \G to continue from the previous match.

From time to time my server locks up because that Perl script consumes
all of its computation power.

I asked about this on PerlMonks:
http://www.perlmonks.org/?node_id=1172994

People there tested the program and did not see the same effect, so it
appeared that the problem occurred only on my server. I moved the script
to another computer with the same version of Debian (jessie, stable),
and it still ran very slowly. The problem disappeared on the newer
version (stretch). I also tested on my Ubuntu machine (it ran fast) and
on the same hardware under VirtualBox with a fresh installation of
Debian jessie, where the script again ran very slowly.

So the problem exists on the current Debian stable version (jessie) and
does not exist on newer versions.

Here is a test suite to reproduce the error:

https://drive.google.com/file/d/0B9GHvKh0yZKYY0pOUFc5V2hadTg/view?usp=sharing

If the error is present on your computer, the script runs for about 30
seconds. If it is not, the script runs for only about half a second.

The program is below. Some lines are commented out; if you uncomment
them, the program runs very fast. I don't understand why.

###################################

use strict; 
use warnings; 

sub my_parse { 
	my ($a) = @_; 
	$$a =~ /\G\s*+\(/gc or die "paren expected"; 
	while ($$a =~ /\G\s*+([[:alnum:]_]+)\s*+/gc) { 
		if ($$a =~ /\G([-+._[:alnum:]]+|"(?:[^\\"]++|\\[\\"])*+")/gc) { 
			1; 
		} elsif ($$a =~ /\G(?=\()/gc) { 
			my_parse($a); 
		} else { 
			die "wrong value"; 
		}
	}
	$$a =~ /\G\s*+\)\s*+/gc or die "name expected"; 
	return; 
}

my $inp = `cat input.txt`; 

# my $inp = `echo; cat input.txt`; # is fast
# my $inp = `cat input.txt; echo`; # is fast
# my $inp = `sed -n '1,9000p' input.txt`; # is slow
# my $inp = `sed -n '1,8000p' input.txt`; # is fast

my $par = "(" . $inp . ")"; 
# my $par = sprintf "(%s)", $inp; # is fast

my_parse(\$par); 

###################################
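To compare the two ways of building the string in a single run, a timing
harness along the following lines may help. It is only a sketch: the
names parse_list, time_parse and the synthetic fallback input are
assumptions made for illustration, and the synthetic data is not known
to trigger the slowdown, so input.txt from the test suite is still
needed to see the real difference.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Same grammar as the reproducer: a parenthesised list of "name value"
# pairs, where a value is a bare word, a quoted string, or a nested list.
sub parse_list {
    my ($ref) = @_;
    $$ref =~ /\G\s*+\(/gc or die "paren expected";
    while ($$ref =~ /\G\s*+([[:alnum:]_]+)\s*+/gc) {
        if ($$ref =~ /\G([-+._[:alnum:]]+|"(?:[^\\"]++|\\[\\"])*+")/gc) {
            next;                      # simple value consumed
        } elsif ($$ref =~ /\G(?=\()/gc) {
            parse_list($ref);          # nested list
        } else {
            die "wrong value";
        }
    }
    $$ref =~ /\G\s*+\)\s*+/gc or die "closing paren expected";
    return;
}

# Parse the string behind $ref from the start; return elapsed seconds.
# A reference is passed so that the scalar under test is not copied.
sub time_parse {
    my ($ref) = @_;
    pos($$ref) = 0;
    my $start = time;
    parse_list($ref);
    return time - $start;
}

# Use input.txt when present (read with backticks, as in the reproducer,
# because how the string was produced may matter); otherwise fall back
# to small synthetic data that only exercises the parser.
my $inp = -e 'input.txt'
    ? `cat input.txt`
    : qq{a 1\nb "x y"\nc (d 2 e (f 3))\n} x 1000;

my $concat  = "(" . $inp . ")";
my $sprintf = sprintf "(%s)", $inp;

printf "concatenation: %.3f s\n", time_parse(\$concat);
printf "sprintf:       %.3f s\n", time_parse(\$sprintf);
```

On an affected system I would expect the first number to be much larger
than the second when run next to input.txt; on an unaffected system both
should be similar.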

The problem seems very strange to me. I tried to make the input data
smaller, but if I cut some lines from the input, whether from the
beginning, the end, or the middle, the script runs very fast. Adding an
empty line with "echo" makes it run fast. Using sprintf instead of
concatenation also makes it fast.
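Since both variants produce the same text, any difference would have to
be in how perl stores the scalar internally. The sketch below prints a
few internal properties of the two strings using the core B module. The
helper name describe_scalar and the synthetic input are assumptions for
illustration, and I do not know whether these fields are related to the
slowdown; the output is meant only as extra data for comparing systems.

```perl
use strict;
use warnings;
use B ();

# Return a few internal properties of the string scalar behind $ref.
sub describe_scalar {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return {
        cur  => $sv->CUR,                       # bytes used by the string
        len  => $sv->LEN,                       # bytes allocated for the buffer
        utf8 => utf8::is_utf8($$ref) ? 1 : 0,   # internal UTF-8 flag
    };
}

# Synthetic stand-in for input.txt (assumption); replace it with the
# real file contents to inspect the strings from the reproducer.
my $inp = qq{a 1\nb "x y"\nc (d 2 e (f 3))\n} x 1000;

my $concat  = "(" . $inp . ")";
my $sprintf = sprintf "(%s)", $inp;

for my $case ([concat => \$concat], [sprintf => \$sprintf]) {
    my ($name, $ref) = @$case;
    my $info = describe_scalar($ref);
    printf "%-8s cur=%d len=%d utf8=%d\n",
        $name, $info->{cur}, $info->{len}, $info->{utf8};
}
```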


-- System Information:
Debian Release: 8.6
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 3.16.0-4-686-pae (SMP w/4 CPU cores)
Locale: LANG=pl_PL.UTF-8, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages perl depends on:
ii  dpkg          1.17.27
ii  libbz2-1.0    1.0.6-7+b3
ii  libc6         2.19-18+deb8u6
ii  libdb5.3      5.3.28-9
ii  libgdbm3      1.8.3-13.1
ii  perl-base     5.20.2-3+deb8u6
ii  perl-modules  5.20.2-3+deb8u6
ii  zlib1g        1:1.2.8.dfsg-2+b1

Versions of packages perl recommends:
ii  netbase  5.3
pn  rename   <none>

Versions of packages perl suggests:
pn  libterm-readline-gnu-perl | libterm-readline-perl-perl  <none>
pn  make                                                    <none>
pn  perl-doc                                                <none>

-- no debconf information