[Raspbian-devel] [PATCH] debian, raspbian, openssl and performance

Yuriy M. Kaminskiy yumkam+debian at gmail.com
Sat Jan 12 16:16:31 GMT 2019


I recently bought a Raspberry Pi 3B+, ran `openssl speed`, and noticed
that performance was lower than I expected. Some time ago I looked at
the openssl sources and remembered that they contain hand-optimized
neon assembler (bitsliced AES, chacha20, etc.), and it looks like
raspbian's openssl is somehow not using it.

When I installed openssl from debian/stretch/armhf, I got "correct" 
performance (up to 2x better).

Now, what is wrong and how it can be solved.

The openssl package in raspbian seems to have been simply recompiled
from the debian source package, without source changes. The only thing
that differs is the default compiler flags.

The Debian package uses the debian-armhf configuration:
debian/rules:
...
	../Configure shared $(CONFARGS) debian-$(DEB_HOST_ARCH)-$$opt;
...
That target is defined in Configurations/20-debian.conf:
...
	"debian-armhf" => {
		inherit_from => [ "linux-armv4", "debian" ],
	},
...
And linux-armv4 is defined in Configurations/10-main.conf:
...
     "linux-armv4" => {
         ################################################################
         # Note that -march is not among compiler options in linux-armv4
         # target description. Not specifying one is intentional to give
         # you choice to:
         #
         # a) rely on your compiler default by not specifying one;
         # b) specify your target platform explicitly for optimal
         # performance, e.g. -march=armv6 or -march=armv7-a;
         # c) build "universal" binary that targets *range* of platforms
         # by specifying minimum and maximum supported architecture;
         #
         # As for c) option. It actually makes no sense to specify
         # maximum to be less than ARMv7, because it's the least
         # requirement for run-time switch between platform-specific
         # code paths. And without run-time switch performance would be
         # equivalent to one for minimum. Secondly, there are some
         # natural limitations that you'd have to accept and respect.
         # Most notably you can *not* build "universal" binary for
         # big-endian platform. This is because ARMv7 processor always
         # picks instructions in little-endian order. Another similar
         # limitation is that -mthumb can't "cross" -march=armv6t2
         # boundary, because that's where it became Thumb-2. Well, this
         # limitation is a bit artificial, because it's not really
         # impossible, but it's deemed too tricky to support. And of
         # course you have to be sure that your binutils are actually
         # up to the task of handling maximum target platform. With all
         # this in mind here is an example of how to configure
         # "universal" build:
         #
         # ./Configure linux-armv4 -march=armv6 -D__ARM_MAX_ARCH__=8
         #
         inherit_from     => [ "linux-generic32", asm("armv4_asm") ],
         perlasm_scheme   => "linux32",
     },
...

In the debian packaging __ARM_MAX_ARCH__ is not set, so it is the same as
__ARM_ARCH__, which in turn is implied by the compiler flags (option (a)).

On debian/armhf, the default compiler flags imply that __ARM_ARCH__ is 7,
so everything is happy (neon-optimized code is compiled and cpu
autodetection is enabled).

But on raspbian __ARM_ARCH__ is 6, and everything is sad (neon code and
cpu autodetection are disabled).
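
To make the three cases concrete, here is a small Python model of the
rule described above. It is only a sketch of the behaviour, not OpenSSL
code: the function names are hypothetical, and the threshold of 7 is
taken from the quoted comment (ARMv7 is the least requirement for the
run-time switch).

```python
from typing import Optional


def effective_max_arch(arm_arch: int, arm_max_arch: Optional[int] = None) -> int:
    """Model of the value __ARM_MAX_ARCH__ ends up with.

    If it is not defined explicitly (None here), it is the same as
    __ARM_ARCH__, which the compiler flags imply.
    """
    return arm_arch if arm_max_arch is None else arm_max_arch


def neon_and_autodetect_enabled(arm_arch: int,
                                arm_max_arch: Optional[int] = None) -> bool:
    """Assumption: neon code and cpu autodetection need a maximum of >= 7."""
    return effective_max_arch(arm_arch, arm_max_arch) >= 7


# (label, __ARM_ARCH__ implied by compiler, explicit __ARM_MAX_ARCH__ or None)
cases = [
    ("debian/armhf, stock packaging", 7, None),
    ("raspbian, stock packaging", 6, None),
    ("raspbian, -D__ARM_MAX_ARCH__=8", 6, 8),
]

for label, arch, max_arch in cases:
    state = "enabled" if neon_and_autodetect_enabled(arch, max_arch) else "disabled"
    print(f"{label}: neon + autodetection {state}")
```

The third case is the "universal build" of option (c): the minimum
stays at 6, only the maximum is raised.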

The solution is trivial: keep -march unset (compiler default), but
explicitly define __ARM_MAX_ARCH__ as 8 to produce a "universal build",
as recommended in the comment above (option (c)).

	"debian-armhf" => {
		inherit_from => [ "linux-armv4", "debian" ],
+		cflags => add("-D__ARM_MAX_ARCH__=8"),
	},

The resulting binary should still run on armv6/rpi1, but use
neon-optimized code on anything newer.
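
To see which of the two paths a given board can be expected to take,
one can look at what the kernel reports. The sketch below is an
illustration only: it parses the "Features" line of /proc/cpuinfo as
printed by a 32-bit ARM Linux kernel (an assumption; 64-bit kernels name
the flag differently), and the function names are hypothetical. OpenSSL
does its own detection inside the library, so this shows only what the
hardware offers, not what libcrypto actually selected.

```python
def cpu_features(cpuinfo_text: str) -> set:
    """Collect the flags from every 'Features' line of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_neon(cpuinfo_text: str) -> bool:
    """True if the kernel lists 'neon' among the cpu features."""
    return "neon" in cpu_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print("neon reported by kernel:", has_neon(text))
```

With a universal build, a board where this reports neon should end up
on the neon code, and an armv6 board without it should fall back to the
plain armv4 assembler.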

The same applies to the libssl1.0.2 package (source package openssl1.0),
which is still used in {debian,raspbian}-stretch (notably by curl).

Patches are attached (for stretch: openssl-1.0.2q^1 and openssl-1.1.0j;
for buster: 1.1.1a). They are probably acceptable for inclusion in the
upstream (debian) packaging; however, as they bring practically nothing
to "debian proper", I'm not sure they will be accepted.

^1 openssl 1.0.2 required backporting commit
cfe670732b63b875054aabd965a7bcecc6508657; beware: the backport is not
straightforward. Hopefully I have not messed anything up, but I don't
have the hardware to test it: the affected code only runs on armv8 with
the sha256 extension.

The patches passed limited testing on an rpi3b+; I don't have hardware
to test on earlier versions (or on other fruit-pi boards).

Another option is to build and install a separate
`-march=armv7-a -mfpu=neon` optimized binary in {/usr,}/lib/neon, but
the performance difference from the "universal build" is very minor.
-------------- next part --------------
diff -Nru openssl1.0-1.0.2q/debian/changelog openssl1.0-1.0.2q/debian/changelog
--- openssl1.0-1.0.2q/debian/changelog	2018-12-16 23:07:51.000000000 +0300
+++ openssl1.0-1.0.2q/debian/changelog	2019-01-12 13:57:55.000000000 +0300
@@ -1,3 +1,10 @@
+openssl1.0 (1.0.2q-1~deb9u1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Build universal binary for armhf.
+
+ -- Yuriy M. Kaminskiy <yumkam+debian at gmail.com>  Sat, 12 Jan 2019 13:57:55 +0300
+
 openssl1.0 (1.0.2q-1~deb9u1) stretch-security; urgency=medium
 
   * use signing-key.asc and a https links for downloads
diff -Nru openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch
--- openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch	1970-01-01 03:00:00.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch	2019-01-12 13:57:55.000000000 +0300
@@ -0,0 +1,43 @@
+From cfe670732b63b875054aabd965a7bcecc6508657 Mon Sep 17 00:00:00 2001
+From: Andy Polyakov <appro at openssl.org>
+Date: Tue, 15 Dec 2015 21:43:56 +0100
+Subject: [PATCH] sha/asm/sha256-armv4.pl: one of "universal" flags combination
+ didn't compile. (and unify table address calculation in ARMv8 code path).
+
+Reviewed-by: Tim Hudson <tjh at openssl.org>
+---
+ crypto/sha/asm/sha256-armv4.pl | 13 ++++---------
+ 1 file changed, 4 insertions(+), 9 deletions(-)
+
+Index: openssl1.0-1.0.2q/crypto/sha/asm/sha256-armv4.pl
+===================================================================
+--- openssl1.0-1.0.2q.orig/crypto/sha/asm/sha256-armv4.pl
++++ openssl1.0-1.0.2q/crypto/sha/asm/sha256-armv4.pl
+@@ -454,7 +454,8 @@ $code.=<<___;
+ 
+ .global	sha256_block_data_order_neon
+ .type	sha256_block_data_order_neon,%function
+-.align	4
++.align	5
++.skip	16
+ sha256_block_data_order_neon:
+ .LNEON:
+ 	stmdb	sp!,{r4-r12,lr}
+@@ -591,14 +592,11 @@ $code.=<<___;
+ sha256_block_data_order_armv8:
+ .LARMv8:
+ 	vld1.32	{$ABCD,$EFGH},[$ctx]
+-# ifdef __thumb2__
+-	adr	$Ktbl,.LARMv8
+-	sub	$Ktbl,$Ktbl,#.LARMv8-K256
+-# else
+-	adrl	$Ktbl,K256
+-# endif
++	sub	$Ktbl,$Ktbl,#256+32
+ 	add	$len,$inp,$len,lsl#6	@ len to point at the end of inp
++	b	.Loop_v8
+ 
++.align	4
+ .Loop_v8:
+ 	vld1.8		{@MSG[0]-@MSG[1]},[$inp]!
+ 	vld1.8		{@MSG[2]-@MSG[3]},[$inp]!
diff -Nru openssl1.0-1.0.2q/debian/patches/armhf-universal.patch openssl1.0-1.0.2q/debian/patches/armhf-universal.patch
--- openssl1.0-1.0.2q/debian/patches/armhf-universal.patch	1970-01-01 03:00:00.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/armhf-universal.patch	2019-01-12 13:57:55.000000000 +0300
@@ -0,0 +1,13 @@
+Index: openssl1.0-1.0.2q/Configure
+===================================================================
+--- openssl1.0-1.0.2q.orig/Configure
++++ openssl1.0-1.0.2q/Configure
+@@ -379,7 +379,7 @@ my %table=(
+ "debian-alpha-ev5","gcc:${debian_cflags} -mcpu=ev5::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_RISC1 DES_UNROLL:${alpha_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-arm64","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${aarch64_asm}:linux64:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-armel","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+-"debian-armhf","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
++"debian-armhf","gcc:-DL_ENDIAN -D__ARM_MAX_ARCH__=8 ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-amd64", "gcc:-m64 -DL_ENDIAN ${debian_cflags} -DMD32_REG_T=int::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::",
+ "debian-avr32", "gcc:-DB_ENDIAN ${debian_cflags} -fomit-frame-pointer::-D_REENTRANT::-ldl:BN_LLONG_BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-kfreebsd-amd64","gcc:-m64 -DL_ENDIAN ${debian_cflags} -DMD32_REG_T=int::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
diff -Nru openssl1.0-1.0.2q/debian/patches/series openssl1.0-1.0.2q/debian/patches/series
--- openssl1.0-1.0.2q/debian/patches/series	2018-12-14 01:28:18.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/series	2019-01-12 13:57:55.000000000 +0300
@@ -17,3 +17,5 @@
 disable_sslv3_test.patch
 libdoc-manpgs-pod-spell.patch
 Mark-3DES-and-RC4-ciphers-as-weak.patch
+armhf-universal.patch
+0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch
-------------- next part --------------
diff -Nru openssl-1.1.1a/debian/patches/armhf-universal.patch openssl-1.1.1a/debian/patches/armhf-universal.patch
--- openssl-1.1.1a/debian/patches/armhf-universal.patch	1970-01-01 03:00:00.000000000 +0300
+++ openssl-1.1.1a/debian/patches/armhf-universal.patch	2019-01-12 14:03:28.000000000 +0300
@@ -0,0 +1,12 @@
+Index: openssl-1.1.1a/Configurations/20-debian.conf
+===================================================================
+--- openssl-1.1.1a.orig/Configurations/20-debian.conf
++++ openssl-1.1.1a/Configurations/20-debian.conf
+@@ -24,6 +24,7 @@ my %targets = (
+ 	},
+ 	"debian-armhf" => {
+ 		inherit_from => [ "linux-armv4", "debian" ],
++		cflags => add("-D__ARM_MAX_ARCH__=8"),
+ 	},
+ 	"debian-amd64" => {
+ 		inherit_from => [ "linux-x86_64", "debian" ],
diff -Nru openssl-1.1.1a/debian/patches/series openssl-1.1.1a/debian/patches/series
--- openssl-1.1.1a/debian/patches/series	2018-11-22 01:42:25.000000000 +0300
+++ openssl-1.1.1a/debian/patches/series	2019-01-12 14:03:28.000000000 +0300
@@ -4,3 +4,4 @@
 pic.patch
 c_rehash-compat.patch
 Set-systemwide-default-settings-for-libssl-users.patch
+armhf-universal.patch
-------------- next part --------------
diff -Nru openssl-1.1.0j/debian/changelog openssl-1.1.0j/debian/changelog
--- openssl-1.1.0j/debian/changelog	2018-11-29 01:43:08.000000000 +0300
+++ openssl-1.1.0j/debian/changelog	2019-01-12 13:55:19.000000000 +0300
@@ -1,3 +1,10 @@
+openssl (1.1.0j-1~deb9u1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Build universal binary for armhf.
+
+ -- Yuriy M. Kaminskiy <yumkam+debian at gmail.com>  Sat, 12 Jan 2019 13:55:19 +0300
+
 openssl (1.1.0j-1~deb9u1) stretch-security; urgency=medium
 
   * Import 1.1.0j
diff -Nru openssl-1.1.0j/debian/patches/armhf-universal.patch openssl-1.1.0j/debian/patches/armhf-universal.patch
--- openssl-1.1.0j/debian/patches/armhf-universal.patch	1970-01-01 03:00:00.000000000 +0300
+++ openssl-1.1.0j/debian/patches/armhf-universal.patch	2019-01-12 13:55:19.000000000 +0300
@@ -0,0 +1,12 @@
+Index: openssl-1.1.0j/Configurations/20-debian.conf
+===================================================================
+--- openssl-1.1.0j.orig/Configurations/20-debian.conf
++++ openssl-1.1.0j/Configurations/20-debian.conf
+@@ -28,6 +28,7 @@ $debian_ldflags =~ s/\n/ /g;
+ 	},
+ 	"debian-armhf" => {
+ 		inherit_from => [ "linux-armv4", "debian" ],
++		cflags => add("-D__ARM_MAX_ARCH__=8"),
+ 	},
+ 	"debian-amd64" => {
+ 		inherit_from => [ "linux-x86_64", "debian" ],
diff -Nru openssl-1.1.0j/debian/patches/series openssl-1.1.0j/debian/patches/series
--- openssl-1.1.0j/debian/patches/series	2018-11-24 23:58:01.000000000 +0300
+++ openssl-1.1.0j/debian/patches/series	2019-01-12 13:55:19.000000000 +0300
@@ -3,3 +3,4 @@
 no-symbolic.patch
 pic.patch
 c_rehash-compat.patch
+armhf-universal.patch

