[Raspbian-devel] [PATCH] debian, raspbian, openssl and performance
Yuriy M. Kaminskiy
yumkam+debian at gmail.com
Sat Jan 12 16:16:31 GMT 2019
I recently bought a Raspberry Pi 3B+, ran `openssl speed`, and noticed
that performance was not what I expected. Some time ago I looked at the
openssl sources and remembered that they include hand-optimized neon
assembler (bitsliced AES, chacha20, etc.), and it looks like raspbian's
openssl was somehow not using it.
When I installed openssl from debian/stretch/armhf, I got "correct"
performance (up to 2x better).
Now, what is wrong and how it can be fixed.
The openssl package in raspbian seems to have been simply recompiled from
the debian source package, without source changes. The only thing that
differs is the default compiler flags.
The Debian package uses the debian-armhf configuration. From
debian/rules:
...
../Configure shared $(CONFARGS) debian-$(DEB_HOST_ARCH)-$$opt;
...
That target is defined in Configurations/20-debian.conf:
...
"debian-armhf" => {
inherit_from => [ "linux-armv4", "debian" ],
},
...
And linux-armv4 is defined in Configurations/10-main.conf:
...
"linux-armv4" => {
################################################################
# Note that -march is not among compiler options in linux-armv4
# target description. Not specifying one is intentional to give
# you choice to:
#
# a) rely on your compiler default by not specifying one;
# b) specify your target platform explicitly for optimal
# performance, e.g. -march=armv6 or -march=armv7-a;
# c) build "universal" binary that targets *range* of platforms
# by specifying minimum and maximum supported architecture;
#
# As for c) option. It actually makes no sense to specify
# maximum to be less than ARMv7, because it's the least
# requirement for run-time switch between platform-specific
# code paths. And without run-time switch performance would be
# equivalent to one for minimum. Secondly, there are some
# natural limitations that you'd have to accept and respect.
# Most notably you can *not* build "universal" binary for
# big-endian platform. This is because ARMv7 processor always
# picks instructions in little-endian order. Another similar
# limitation is that -mthumb can't "cross" -march=armv6t2
# boundary, because that's where it became Thumb-2. Well, this
# limitation is a bit artificial, because it's not really
# impossible, but it's deemed too tricky to support. And of
# course you have to be sure that your binutils are actually
# up to the task of handling maximum target platform. With all
# this in mind here is an example of how to configure
# "universal" build:
#
# ./Configure linux-armv4 -march=armv6 -D__ARM_MAX_ARCH__=8
#
inherit_from => [ "linux-generic32", asm("armv4_asm") ],
perlasm_scheme => "linux32",
},
...
In the debian packaging __ARM_MAX_ARCH__ is not set, so it is the same as
__ARM_ARCH__, which is in turn derived from the compiler flags (option (a)).
On debian/armhf, the default compiler flags imply that __ARM_ARCH__ is 7,
so everything is happy (neon-optimized code is compiled and cpu
autodetection is enabled).
But on raspbian __ARM_ARCH__ is 6, and everything is sad (neon code and
cpu autodetection are disabled).
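To make the interaction of the two macros concrete, here is a minimal
Python sketch of the logic described above. It is only a model: the
function name and the returned fields are made up for illustration, and
the real decision is made by the C preprocessor while building openssl,
not by code like this.

```python
from typing import Optional


def neon_build_status(arm_arch: int, arm_max_arch: Optional[int] = None) -> dict:
    """Model which code paths end up in the build (illustrative only).

    arm_arch     -- value of __ARM_ARCH__ implied by the compiler flags
    arm_max_arch -- value passed as -D__ARM_MAX_ARCH__=N, or None if unset
    """
    # When __ARM_MAX_ARCH__ is not given, it is the same as __ARM_ARCH__.
    effective_max = arm_arch if arm_max_arch is None else arm_max_arch
    # Assumption: a maximum below the minimum is a configuration error.
    if effective_max < arm_arch:
        raise ValueError("maximum architecture cannot be below the minimum")
    # The run-time switch between code paths needs a maximum of at least ARMv7.
    universal = effective_max >= 7
    return {
        "min_arch": arm_arch,
        "max_arch": effective_max,
        "neon_compiled": universal,
        "cpu_autodetect": universal,
    }


if __name__ == "__main__":
    print("debian armhf:     ", neon_build_status(7))
    print("raspbian:         ", neon_build_status(6))
    print("raspbian, patched:", neon_build_status(6, 8))
```

The three calls correspond to the stock debian build, the stock raspbian
build, and a raspbian build with __ARM_MAX_ARCH__ defined as 8.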
The solution is trivial: keep -march unset (compiler default), but
explicitly define __ARM_MAX_ARCH__ as 8 to produce a "universal" build, as
recommended in the comment above (option (c)):
"debian-armhf" => {
inherit_from => [ "linux-armv4", "debian" ],
+ cflags => add("-D__ARM_MAX_ARCH__=8"),
},
The resulting binary should still run on armv6/rpi1, but use
neon-optimized code on anything newer.
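If you want to check which path a given board should take, a rough
Python sketch follows. It assumes a 32-bit ARM Linux kernel that lists
`neon` on the Features line of /proc/cpuinfo; both function names are
hypothetical, and openssl does its own capability detection at run time
rather than reading this file.

```python
def cpu_has_neon(cpuinfo_text: str) -> bool:
    """Return True if any 'Features' line of the cpuinfo text lists neon."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if "neon" in value.split():
                return True
    return False


def expected_code_path(cpuinfo_text: str, universal_build: bool) -> str:
    """Guess which code path a build would select on this CPU (a model only)."""
    # Without a "universal" build there is no run-time switch at all.
    if universal_build and cpu_has_neon(cpuinfo_text):
        return "neon"
    return "generic"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or the file is unreadable
    print("neon reported by kernel:", cpu_has_neon(text))
    print("universal build would use:", expected_code_path(text, True))
```

Comparing `openssl speed` results before and after installing the
patched package is still the real test.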
The same applies to the libssl1.0.2 package (source package openssl1.0)
that is still used in {debian,raspbian}-stretch (notably, by curl).
Patches are attached (for stretch: openssl-1.0.2q^1 and openssl-1.1.0j;
for buster: 1.1.1a). They are probably acceptable for inclusion in the
upstream (debian) packaging, but as they bring practically nothing to
"debian proper", I'm not sure they will be accepted.
^1 openssl 1.0.2 required backporting commit
cfe670732b63b875054aabd965a7bcecc6508657. Beware: the backport is not
straightforward. Hopefully I have not messed anything up, but I don't
have hardware to test it, since the affected code only runs on armv8
with the sha256 extension.
The patches passed limited testing on rpi3b+; I don't have hardware to
test on earlier versions (or other fruit-pi boards).
Another option is to build and install a separate `-march=armv7-a
-mfpu=neon` optimized binary in {/usr,}/lib/neon, but the performance
difference from the "universal" build is very minor.
-------------- next part --------------
diff -Nru openssl1.0-1.0.2q/debian/changelog openssl1.0-1.0.2q/debian/changelog
--- openssl1.0-1.0.2q/debian/changelog 2018-12-16 23:07:51.000000000 +0300
+++ openssl1.0-1.0.2q/debian/changelog 2019-01-12 13:57:55.000000000 +0300
@@ -1,3 +1,10 @@
+openssl1.0 (1.0.2q-1~deb9u1.1) UNRELEASED; urgency=medium
+
+ * Non-maintainer upload.
+ * Build universal binary for armhf.
+
+ -- Yuriy M. Kaminskiy <yumkam+debian at gmail.com> Sat, 12 Jan 2019 13:57:55 +0300
+
openssl1.0 (1.0.2q-1~deb9u1) stretch-security; urgency=medium
* use signing-key.asc and a https links for downloads
diff -Nru openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch
--- openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch 1970-01-01 03:00:00.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch 2019-01-12 13:57:55.000000000 +0300
@@ -0,0 +1,43 @@
+From cfe670732b63b875054aabd965a7bcecc6508657 Mon Sep 17 00:00:00 2001
+From: Andy Polyakov <appro at openssl.org>
+Date: Tue, 15 Dec 2015 21:43:56 +0100
+Subject: [PATCH] sha/asm/sha256-armv4.pl: one of "universal" flags combination
+ didn't compile. (and unify table address calculation in ARMv8 code path).
+
+Reviewed-by: Tim Hudson <tjh at openssl.org>
+---
+ crypto/sha/asm/sha256-armv4.pl | 13 ++++---------
+ 1 file changed, 4 insertions(+), 9 deletions(-)
+
+Index: openssl1.0-1.0.2q/crypto/sha/asm/sha256-armv4.pl
+===================================================================
+--- openssl1.0-1.0.2q.orig/crypto/sha/asm/sha256-armv4.pl
++++ openssl1.0-1.0.2q/crypto/sha/asm/sha256-armv4.pl
+@@ -454,7 +454,8 @@ $code.=<<___;
+
+ .global sha256_block_data_order_neon
+ .type sha256_block_data_order_neon,%function
+-.align 4
++.align 5
++.skip 16
+ sha256_block_data_order_neon:
+ .LNEON:
+ stmdb sp!,{r4-r12,lr}
+@@ -591,14 +592,11 @@ $code.=<<___;
+ sha256_block_data_order_armv8:
+ .LARMv8:
+ vld1.32 {$ABCD,$EFGH},[$ctx]
+-# ifdef __thumb2__
+- adr $Ktbl,.LARMv8
+- sub $Ktbl,$Ktbl,#.LARMv8-K256
+-# else
+- adrl $Ktbl,K256
+-# endif
++ sub $Ktbl,$Ktbl,#256+32
+ add $len,$inp,$len,lsl#6 @ len to point at the end of inp
++ b .Loop_v8
+
++.align 4
+ .Loop_v8:
+ vld1.8 {@MSG[0]-@MSG[1]},[$inp]!
+ vld1.8 {@MSG[2]-@MSG[3]},[$inp]!
diff -Nru openssl1.0-1.0.2q/debian/patches/armhf-universal.patch openssl1.0-1.0.2q/debian/patches/armhf-universal.patch
--- openssl1.0-1.0.2q/debian/patches/armhf-universal.patch 1970-01-01 03:00:00.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/armhf-universal.patch 2019-01-12 13:57:55.000000000 +0300
@@ -0,0 +1,13 @@
+Index: openssl1.0-1.0.2q/Configure
+===================================================================
+--- openssl1.0-1.0.2q.orig/Configure
++++ openssl1.0-1.0.2q/Configure
+@@ -379,7 +379,7 @@ my %table=(
+ "debian-alpha-ev5","gcc:${debian_cflags} -mcpu=ev5::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_RISC1 DES_UNROLL:${alpha_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-arm64","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${aarch64_asm}:linux64:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-armel","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+-"debian-armhf","gcc:-DL_ENDIAN ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
++"debian-armhf","gcc:-DL_ENDIAN -D__ARM_MAX_ARCH__=8 ${debian_cflags}::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-amd64", "gcc:-m64 -DL_ENDIAN ${debian_cflags} -DMD32_REG_T=int::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::",
+ "debian-avr32", "gcc:-DB_ENDIAN ${debian_cflags} -fomit-frame-pointer::-D_REENTRANT::-ldl:BN_LLONG_BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+ "debian-kfreebsd-amd64","gcc:-m64 -DL_ENDIAN ${debian_cflags} -DMD32_REG_T=int::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
diff -Nru openssl1.0-1.0.2q/debian/patches/series openssl1.0-1.0.2q/debian/patches/series
--- openssl1.0-1.0.2q/debian/patches/series 2018-12-14 01:28:18.000000000 +0300
+++ openssl1.0-1.0.2q/debian/patches/series 2019-01-12 13:57:55.000000000 +0300
@@ -17,3 +17,5 @@
disable_sslv3_test.patch
libdoc-manpgs-pod-spell.patch
Mark-3DES-and-RC4-ciphers-as-weak.patch
+armhf-universal.patch
+0001-sha-asm-sha256-armv4.pl-one-of-universal-flags-combi.patch
-------------- next part --------------
diff -Nru openssl-1.1.1a/debian/patches/armhf-universal.patch openssl-1.1.1a/debian/patches/armhf-universal.patch
--- openssl-1.1.1a/debian/patches/armhf-universal.patch 1970-01-01 03:00:00.000000000 +0300
+++ openssl-1.1.1a/debian/patches/armhf-universal.patch 2019-01-12 14:03:28.000000000 +0300
@@ -0,0 +1,12 @@
+Index: openssl-1.1.1a/Configurations/20-debian.conf
+===================================================================
+--- openssl-1.1.1a.orig/Configurations/20-debian.conf
++++ openssl-1.1.1a/Configurations/20-debian.conf
+@@ -24,6 +24,7 @@ my %targets = (
+ },
+ "debian-armhf" => {
+ inherit_from => [ "linux-armv4", "debian" ],
++ cflags => add("-D__ARM_MAX_ARCH__=8"),
+ },
+ "debian-amd64" => {
+ inherit_from => [ "linux-x86_64", "debian" ],
diff -Nru openssl-1.1.1a/debian/patches/series openssl-1.1.1a/debian/patches/series
--- openssl-1.1.1a/debian/patches/series 2018-11-22 01:42:25.000000000 +0300
+++ openssl-1.1.1a/debian/patches/series 2019-01-12 14:03:28.000000000 +0300
@@ -4,3 +4,4 @@
pic.patch
c_rehash-compat.patch
Set-systemwide-default-settings-for-libssl-users.patch
+armhf-universal.patch
-------------- next part --------------
diff -Nru openssl-1.1.0j/debian/changelog openssl-1.1.0j/debian/changelog
--- openssl-1.1.0j/debian/changelog 2018-11-29 01:43:08.000000000 +0300
+++ openssl-1.1.0j/debian/changelog 2019-01-12 13:55:19.000000000 +0300
@@ -1,3 +1,10 @@
+openssl (1.1.0j-1~deb9u1.1) UNRELEASED; urgency=medium
+
+ * Non-maintainer upload.
+ * Build universal binary for armhf.
+
+ -- Yuriy M. Kaminskiy <yumkam+debian at gmail.com> Sat, 12 Jan 2019 13:55:19 +0300
+
openssl (1.1.0j-1~deb9u1) stretch-security; urgency=medium
* Import 1.1.0j
diff -Nru openssl-1.1.0j/debian/patches/armhf-universal.patch openssl-1.1.0j/debian/patches/armhf-universal.patch
--- openssl-1.1.0j/debian/patches/armhf-universal.patch 1970-01-01 03:00:00.000000000 +0300
+++ openssl-1.1.0j/debian/patches/armhf-universal.patch 2019-01-12 13:55:19.000000000 +0300
@@ -0,0 +1,12 @@
+Index: openssl-1.1.0j/Configurations/20-debian.conf
+===================================================================
+--- openssl-1.1.0j.orig/Configurations/20-debian.conf
++++ openssl-1.1.0j/Configurations/20-debian.conf
+@@ -28,6 +28,7 @@ $debian_ldflags =~ s/\n/ /g;
+ },
+ "debian-armhf" => {
+ inherit_from => [ "linux-armv4", "debian" ],
++ cflags => add("-D__ARM_MAX_ARCH__=8"),
+ },
+ "debian-amd64" => {
+ inherit_from => [ "linux-x86_64", "debian" ],
diff -Nru openssl-1.1.0j/debian/patches/series openssl-1.1.0j/debian/patches/series
--- openssl-1.1.0j/debian/patches/series 2018-11-24 23:58:01.000000000 +0300
+++ openssl-1.1.0j/debian/patches/series 2019-01-12 13:55:19.000000000 +0300
@@ -3,3 +3,4 @@
no-symbolic.patch
pic.patch
c_rehash-compat.patch
+armhf-universal.patch