Bug#1056917: bookworm-pu: package perl/5.36.0-7+deb12u1
Niko Tyni
ntyni at debian.org
Sun Nov 26 17:16:20 GMT 2023
Package: release.debian.org
Severity: normal
Tags: bookworm
User: release.debian.org at packages.debian.org
Usertags: pu
X-Debbugs-Cc: perl at packages.debian.org, Salvatore Bonaccorso <carnil at debian.org>
Control: affects -1 + src:perl
[ Reason ]
I'd like to fix #1056746 / CVE-2023-47038 in perl for bookworm. It's a
non-DSA security issue that was made public yesterday and fixed upstream
in 5.36.2.
[ Impact ]
CVE-2023-47038 has security impact for applications that use untrusted
regular expressions to match input.
[ Tests ]
The fix augments the test suite to check for this issue. I have also
checked manually that the crash is gone with the patch. I reviewed amd64
binary debdiffs too and did some installation tests.
[ Risks ]
The fix is minimal and identical to the one in sid / 5.36.0-10. I don't
expect any fallout, but obviously I'll report here if any problems are
found in the testing migration checks.
[ Checklist ]
[X] *all* changes are documented in the d/changelog
[X] I reviewed all changes and I approve them
[X] attach debdiff against the package in (old)stable
[X] the issue is verified as fixed in unstable
[ Changes ]
The only change is a patch to the regexp engine in regcomp.c
and the associated new tests. The patch description has
a long explanation of the issue.
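
For readers who want the gist without reading the whole patch description, the following is a minimal sketch of the arithmetic behind the overflow. It is not code from regcomp.c. The helper name bytes_past_end() and the PKG/PKG_LEN macros are hypothetical, and the sketch assumes the worst case of one output byte per input byte, whereas the real parser also drops characters such as white space. It only computes sizes and never writes out of bounds itself.

```c
#include <stddef.h>
#include <string.h>

#define PKG     "utf8::"
#define PKG_LEN (sizeof(PKG) - 1)

/* Hypothetical helper for illustration only (not part of perl).
 *
 * Models a buffer of strlen(name) bytes whose write pointer has been
 * advanced past a leading "utf8::", as the parser does with "lookup_name".
 * 'start' is the input index the reparse restarts from: 0 models the
 * unpatched code, PKG_LEN models the patched code (i_zero).
 *
 * Returns how many bytes a worst-case reparse (one byte written per byte
 * read, output index restarting at 0) would write past the allocation. */
size_t bytes_past_end(const char *name, size_t start)
{
    size_t name_len = strlen(name);
    size_t skipped  = 0;

    if (start > name_len)
        return 0;                       /* nothing left to reparse */

    if (name_len >= PKG_LEN && memcmp(name, PKG, PKG_LEN) == 0)
        skipped = PKG_LEN;              /* lookup_name += STRLENs("utf8::") */

    size_t capacity = name_len - skipped;  /* room behind the moved pointer */
    size_t written  = name_len - start;    /* worst-case bytes copied */

    return written > capacity ? written - capacity : 0;
}
```

Under these assumptions, restarting at index 0 for the 12-byte name "utf8::perl x" would write 6 bytes past the end of the allocation, while restarting at index 6, as the patch does via i_zero, stays within bounds.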
[ Other info ]
I'm uploading right away as I don't expect any of this to be
controversial. Hope that's fine by you.
Thanks for your work on Debian.
-------------- next part --------------
diff -Nru perl-5.36.0/debian/changelog perl-5.36.0/debian/changelog
--- perl-5.36.0/debian/changelog 2023-01-08 23:28:47.000000000 +0200
+++ perl-5.36.0/debian/changelog 2023-11-25 22:59:54.000000000 +0200
@@ -1,3 +1,10 @@
+perl (5.36.0-7+deb12u1) bookworm; urgency=medium
+
+ * [SECURITY] CVE-2023-47038: Write past buffer end via illegal
+ user-defined Unicode property. (Closes: #1056746)
+
+ -- Niko Tyni <ntyni at debian.org> Sat, 25 Nov 2023 22:59:54 +0200
+
perl (5.36.0-7) unstable; urgency=medium
* Break backuppc (<< 4.4.0-7~) due to Data::Dumper changes in 5.36
diff -Nru perl-5.36.0/debian/patches/fixes/CVE-2023-47038.diff perl-5.36.0/debian/patches/fixes/CVE-2023-47038.diff
--- perl-5.36.0/debian/patches/fixes/CVE-2023-47038.diff 1970-01-01 02:00:00.000000000 +0200
+++ perl-5.36.0/debian/patches/fixes/CVE-2023-47038.diff 2023-11-25 22:59:54.000000000 +0200
@@ -0,0 +1,119 @@
+From: Karl Williamson <khw at cpan.org>
+Date: Sat, 9 Sep 2023 11:59:09 -0600
+Subject: Fix read/write past buffer end: perl-security#140
+
+A package name may be specified in a \p{...} regular expression
+construct. If unspecified, "utf8::" is assumed, which is the package
+all official Unicode properties are in. By specifying a different
+package, one can create a user-defined property with the same
+unqualified name as a Unicode one. Such a property is defined by a sub
+whose name begins with "Is" or "In", and if the sub wishes to refer to
+an official Unicode property, it must explicitly specify the "utf8::".
+S_parse_uniprop_string() is used to parse the interior of both \p{} and
+the user-defined sub lines.
+
+In S_parse_uniprop_string(), it parses the input "name" parameter,
+creating a modified copy, "lookup_name", malloc'ed with the same size as
+"name". The modifications are essentially to create a canonicalized
+version of the input, with such things as extraneous white-space
+stripped off. I found it convenient to strip off the package specifier
+"utf8::". To do so, the code simply pretends "lookup_name" begins just
+after the "utf8::", and adjusts various other values to compensate.
+However, it missed one required adjustment.
+
+This is only a problem when the property name begins with "perl" and
+isn't "perlspace" nor "perlword". All such ones are undocumented
+internal properties.
+
+What happens in this case is that the input is reparsed with slightly
+different rules in effect as to what is legal versus illegal. The
+problem is that "lookup_name" no longer is pointing to its initial
+value, but "name" is. Thus the space allocated for filling "lookup_name"
+is now shorter than "name", and as this shortened "lookup_name" is
+filled by copying suitable portions of "name", the write can be to
+unallocated space.
+
+The solution is to skip the "utf8::" when reparsing "name". Then both
+"lookup_name" and "name" are effectively shortened by the same amount,
+and there is no going off the end.
+
+This commit also does white-space adjustment so that things align
+vertically for readability.
+
+This can be easily backported to earlier Perl releases.
+
+Bug-Debian: https://bugs.debian.org/1056746
+Origin: backport, https://github.com/Perl/perl5/commit/7047915eef37fccd93e7cd985c29fe6be54650b6
+---
+ regcomp.c | 17 +++++++++++------
+ t/re/pat_advanced.t | 8 ++++++++
+ 2 files changed, 19 insertions(+), 6 deletions(-)
+
+diff --git a/regcomp.c b/regcomp.c
+index 4051333..9c0338c 100644
+--- a/regcomp.c
++++ b/regcomp.c
+@@ -24178,7 +24178,7 @@ S_parse_uniprop_string(pTHX_
+ * compile perl to know about them) */
+ bool is_nv_type = FALSE;
+
+- unsigned int i, j = 0;
++ unsigned int i = 0, i_zero = 0, j = 0;
+ int equals_pos = -1; /* Where the '=' is found, or negative if none */
+ int slash_pos = -1; /* Where the '/' is found, or negative if none */
+ int table_index = 0; /* The entry number for this property in the table
+@@ -24312,9 +24312,13 @@ S_parse_uniprop_string(pTHX_
+ * all of them are considered to be for that package. For the purposes of
+ * parsing the rest of the property, strip it off */
+ if (non_pkg_begin == STRLENs("utf8::") && memBEGINPs(name, name_len, "utf8::")) {
+- lookup_name += STRLENs("utf8::");
+- j -= STRLENs("utf8::");
+- equals_pos -= STRLENs("utf8::");
++ lookup_name += STRLENs("utf8::");
++ j -= STRLENs("utf8::");
++ equals_pos -= STRLENs("utf8::");
++ i_zero = STRLENs("utf8::"); /* When resetting 'i' to reparse
++ from the beginning, it has to be
++ set past what we're stripping
++ off */
+ stripped_utf8_pkg = TRUE;
+ }
+
+@@ -24728,7 +24732,8 @@ S_parse_uniprop_string(pTHX_
+
+ /* We set the inputs back to 0 and the code below will reparse,
+ * using strict */
+- i = j = 0;
++ i = i_zero;
++ j = 0;
+ }
+ }
+
+@@ -24749,7 +24754,7 @@ S_parse_uniprop_string(pTHX_
+ * separates two digits */
+ if (cur == '_') {
+ if ( stricter
+- && ( i == 0 || (int) i == equals_pos || i == name_len- 1
++ && ( i == i_zero || (int) i == equals_pos || i == name_len- 1
+ || ! isDIGIT_A(name[i-1]) || ! isDIGIT_A(name[i+1])))
+ {
+ lookup_name[j++] = '_';
+diff --git a/t/re/pat_advanced.t b/t/re/pat_advanced.t
+index 2a25411..088efed 100644
+--- a/t/re/pat_advanced.t
++++ b/t/re/pat_advanced.t
+@@ -2688,6 +2688,14 @@ EOF_DEBUG_OUT
+ {}, "Related to Github Issue #19350, forward \\g{x} pattern segv under use re Debug => 'PARSE'");
+ }
+
++ { # perl-security#140, read/write past buffer end
++ fresh_perl_like('qr/\p{utf8::perl x}/',
++ qr/Illegal user-defined property name "utf8::perl x" in regex/,
++ {}, "perl-security#140");
++ fresh_perl_is('qr/\p{utf8::_perl_surrogate}/', "",
++ {}, "perl-security#140");
++ }
++
+
+ # !!! NOTE that tests that aren't at all likely to crash perl should go
+ # a ways above, above these last ones. There's a comment there that, like
diff -Nru perl-5.36.0/debian/patches/series perl-5.36.0/debian/patches/series
--- perl-5.36.0/debian/patches/series 2022-12-03 20:56:48.000000000 +0200
+++ perl-5.36.0/debian/patches/series 2023-11-25 22:59:54.000000000 +0200
@@ -50,3 +50,4 @@
fixes/readline-stream-errors.diff
fixes/readline-stream-errors-test.diff
fixes/lto-test-fix.diff
+fixes/CVE-2023-47038.diff