Bug#766695: atlas: add ppc64el support

Mauricio Faria de Oliveira mauricfo at linux.vnet.ibm.com
Fri Oct 24 22:40:18 UTC 2014


Package: src:atlas
Version: 3.10.2-4
Tags: patch
User: debian-powerpc at lists.debian.org
Usertags: ppc64el

Hi atlas maintainers,

This patch adds support for the ppc64el port.

It contains:
1) the patch-set authored by Michel Normand (submitted upstream, also
    documented in [1])
2) a packaging change that restricts the patch-set to ppc64el builds only;
    the patch-set touches common powerpc code, which is not desirable for
    the other powerpc-based ports right now (risk of bugs during the
    jessie freeze).
3) an archdef tarball (attached separately)
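
As a rough illustration of item 3, the tarball name follows the ARCH string
that the patched CONFIG/src/SpewMakeInc.c writes: machine name, pointer
width, then 'LE' on little-endian PowerPC64. This is a minimal sketch only;
archdef_name() is a hypothetical helper (not an ATLAS function), and it takes
endianness as a run-time flag so it can be tried on any host, whereas the
patch itself tests __powerpc64__ and __BYTE_ORDER__ at compile time:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ATLAS: writes "<mach><bits>[LE]" into buf.
 * Returns what snprintf returns (the length the full string would need). */
int archdef_name(char *buf, size_t len, const char *mach,
                 int ptrbits, int little_endian_ppc64)
{
    return snprintf(buf, len, "%s%d%s", mach, ptrbits,
                    little_endian_ppc64 ? "LE" : "");
}
```

With mach "GENERIC", 64-bit pointers and the little-endian flag set, this
gives GENERIC64LE, the name of the attached tarball. The ISA-extension
suffix that ATLAS can append afterwards is left out of the sketch.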

Could you please consider it for an upload (especially in time for jessie)?

Thank you,

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GENERIC64LE.tar.bz2
Type: application/x-bzip
Size: 7860 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/debian-science-maintainers/attachments/20141024/400f3f0c/attachment-0001.bin>
-------------- next part --------------
diff -Nru atlas-3.10.2/debian/archdefs/README atlas-3.10.2/debian/archdefs/README
--- atlas-3.10.2/debian/archdefs/README	2014-07-12 07:23:26.000000000 -0300
+++ atlas-3.10.2/debian/archdefs/README	2014-10-24 19:45:37.000000000 -0200
@@ -16,5 +16,6 @@
 - mips: ATLAS 3.10.1 / gabrielli.debian.org / sid / 2013-07-27
 - mipsel: ATLAS 3.10.1 / eder.debian.org / sid / 2013-06-07
 - powerpc: ATLAS 3.10.1 / partch.debian.org / sid / 2013-06-06
+- ppc64el: ATLAS 3.10.2 / pastel.debian.net / sid / 2014-10-24
 - s390x: ATLAS 3.10.1 / zelenka.debian.org / sid / 2013-06-06
 - sparc: ATLAS 3.10.1 / smetana.debian.org / wheezy / 2013-06-06
diff -Nru atlas-3.10.2/debian/changelog atlas-3.10.2/debian/changelog
--- atlas-3.10.2/debian/changelog	2014-10-15 16:35:41.000000000 -0300
+++ atlas-3.10.2/debian/changelog	2014-10-24 19:45:37.000000000 -0200
@@ -1,3 +1,15 @@
+atlas (3.10.2-4ppc64el1) UNRELEASED; urgency=medium
+
+  * Add ppc64el support (work in progress)
+    - debian/patches/ppc64el/ (thanks, Michel Normand et al).
+    - debian/rules:  restrict ppc64el patches to ppc64el builds.
+    - debian/rules:  different 'GENERIC' first number in ARCHs due to POWER8.
+    - debian/archdefs/ppc64el/GENERIC64LE.tar.bz2: archdefs/timings,
+      currently the same file for POWER7, POWER7+ and POWER8 systems.
+    - debian/archdefs/README: updated accordingly.
+
+ -- Mauricio Faria de Oliveira <mauricfo at linux.vnet.ibm.com>  Fri, 24 Oct 2014 20:02:00 -0200
+
 atlas (3.10.2-4) unstable; urgency=medium
 
   [ Alastair McKinstry ]
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,38 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c43
+Forwarded: http://sourceforge.net/p/math-atlas/patches/66/
+Description: Append 'LE' to archdef on little-endian PowerPC64
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+Subject: atlas new archdef for ppc64le
+From: Michel Normand <normand at linux.vnet.ibm.com>
+Date: Sun, 13 Jun 2014 18:02:47 +0200
+
+Need to define different archdef names
+for ppc64 (that is Big Endian) and ppc64le (that is Little Endian).
+This is already done upstream in atlas 3.11.30 with issue
+https://sourceforge.net/p/math-atlas/patches/66/
+
+Required at least as long as I need the bypass of
+atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
+
+Signed-off-by: Michel Normand <normand at linux.vnet.ibm.com>
+---
+ CONFIG/src/SpewMakeInc.c |    4 ++++
+ 1 file changed, 4 insertions(+)
+
+Index: ATLAS/CONFIG/src/SpewMakeInc.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
++++ ATLAS/CONFIG/src/SpewMakeInc.c
+@@ -542,6 +542,10 @@ int main(int nargs, char **args)
+    fprintf(fpout, "#  -------------------------------------------------\n");
+    fprintf(fpout, "   ARCH = %s", machnam[mach]);
+    fprintf(fpout, "%d", ptrbits);
++   /* for ppc64le archi add 'LE' characters */
++   #if defined(__powerpc64__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
++      fprintf(fpout, "%s", "LE");
++   #endif
+    if (ISAX)
+       fprintf(fpout, "%s", ISAXNAM[ISAX]);
+    if (!USEIEEE)
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,138 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c37
+Forwarded: http://sourceforge.net/p/math-atlas/patches/67/
+Description: Add IBM POWER8 pieces
+ The original patch for 3.10.2 was backported to apply on top
+ of 'debian/patches/generic.diff' - trivial changes to hunks
+ of 'ATLAS/CONFIG/include/atlconf.h'.
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <normand at linux.vnet.ibm.com>
+Subject: atlas.3.10.2 add power8 cpu
+Date: Thu, 18 Sep 2014 15:13:24 +0200
+
+atlas.3.10.2 add Power8 cpu
+
+Signed-off-by: Michel Normand <normand at linux.vnet.ibm.com>
+---
+ CONFIG/ARCHS/Make.ext               |    7 +++++++
+ CONFIG/include/atlconf.h            |    6 +++---
+ CONFIG/src/atlcomp.txt              |    6 ++++++
+ CONFIG/src/backend/archinfo_aix.c   |    2 ++
+ CONFIG/src/backend/archinfo_linux.c |    1 +
+ include/atlas_pca.h                 |    2 +-
+ 6 files changed, 20 insertions(+), 4 deletions(-)
+
+Index: ATLAS/CONFIG/ARCHS/Make.ext
+===================================================================
+--- ATLAS.orig/CONFIG/ARCHS/Make.ext
++++ ATLAS/CONFIG/ARCHS/Make.ext
+@@ -33,6 +33,7 @@ files = AMD64K10h32SSE3.tar.bz2 AMD64K10
+         MIPSR1xK64.tar.bz2 Makefile P432SSE2.tar.bz2 P4E32SSE3.tar.bz2 \
+         P4E64SSE3.tar.bz2 PIII32SSE1.tar.bz2 POWER432.tar.bz2 \
+         POWER464.tar.bz2 POWER564.tar.bz2 POWER764VSX.tar.bz2 \
++        POWER864VSX.tar.bz2 \
+         PPCG432AltiVec.tar.bz2 PPCG532AltiVec.tar.bz2 PPCG564AltiVec.tar.bz2 \
+         PPRO32.tar.bz2 USIII32.tar.bz2 USIII64.tar.bz2 USIV32.tar.bz2 \
+         USIV64.tar.bz2 UST232.tar.bz2 UST264.tar.bz2 atlas_test1.1.3.tar.bz2 \
+@@ -308,6 +309,12 @@ POWER764VSX.tar.bz2 : $(basdr)/POWER764V
+            /tmp/POWER764VSX.tar POWER764VSX
+ 	bzip2 /tmp/POWER764VSX.tar
+ 	mv /tmp/POWER764VSX.tar.bz2 ./.
++POWER864VSX.tar.bz2 : $(basdr)/POWER864VSX
++	- rm -f /tmp/POWER864VSX.tar /tmp/POWER864VSX.tar.bz2
++	cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
++           /tmp/POWER864VSX.tar POWER864VSX
++	bzip2 /tmp/POWER864VSX.tar
++	mv /tmp/POWER864VSX.tar.bz2 ./.
+ IBMz1032.tar.bz2 : $(basdr)/IBMz1032
+ 	- rm -f /tmp/IBMz1032.tar /tmp/IBMz1032.tar.bz2
+ 	cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
+Index: ATLAS/CONFIG/include/atlconf.h
+===================================================================
+--- ATLAS.orig/CONFIG/include/atlconf.h
++++ ATLAS/CONFIG/include/atlconf.h
+@@ -18,10 +18,10 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
+ enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
+               AFARM, AFS390};
+ 
+-#define NMACH 53
++#define NMACH 54
+ static char *machnam[NMACH] =
+    {"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
+-    "POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
++    "POWER6", "POWER7", "POWER8", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
+     "x86x87", "x86SSE1", "x86SSE2", "x86SSE3",
+     "P5", "P5MMX", "PPRO", "PII", "PIII", "PM", "CoreSolo",
+     "CoreDuo", "Core2Solo", "Core2", "Corei1", "Corei2", "Corei3",
+@@ -31,7 +31,7 @@ static char *machnam[NMACH] =
+     "USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
+     "MIPSR1xK", "MIPSICE9", "ARMv7", "GENERIC"};
+ enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
+-               IbmPwr6, IbmPwr7, Pwre6500,
++               IbmPwr6, IbmPwr7, IbmPwr8, Pwre6500,
+                IbmZ9, IbmZ10, IbmZ196,  /* s390(x) in Linux */
+                x86x87, x86SSE1, x86SSE2, x86SSE3, /* generic targets */
+                IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
+Index: ATLAS/CONFIG/src/atlcomp.txt
+===================================================================
+--- ATLAS.orig/CONFIG/src/atlcomp.txt
++++ ATLAS/CONFIG/src/atlcomp.txt
+@@ -190,6 +190,10 @@ MACH=PPCG5 OS=ALL LVL=1000 COMPS=dmc,icc
+    'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2'
+ MACH=PPCG5 OS=ALL LVL=1000 COMPS=skc
+    'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2 -mvrsave'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
++   'gcc' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
++   'gfortran' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
+ MACH=POWER7 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
+    'gcc' '-O2 -mvsx -mcpu=power7 -mtune=power7 -m64 -mvrsave -funroll-all-loops'
+ MACH=POWER7 OS=ALL LVL=1010 COMPS=f77
+@@ -210,6 +214,8 @@ MACH=POWER4 OS=ALL LVL=1010 COMPS=icc,dm
+    'gcc' '-mcpu=power4 -mtune=power4 -O3 -fno-schedule-insns -fno-rerun-loop-opt'
+ MACH=POWER4 OS=ALL LVL=1010 COMPS=f77
+    'xlf' '-qtune=pwr4 -qarch=pwr4 -O3 -qmaxmem=-1 -qfloat=hsflt'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
++   'xlf' '-qtune=pwr8 -qarch=pwr8 -O3 -qmaxmem=-1 -qfloat=hsflt'
+ #
+ # IBM System z or zEnterprise.
+ # These compiler flags given by IBM; -O3 -funroll-loops are chosen because
+Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
++++ ATLAS/CONFIG/src/backend/archinfo_linux.c
+@@ -77,6 +77,7 @@ enum MACHTYPE ProbeArch()
+          else if (strstr(res, "7455")) mach = PPCG4;
+          else if (strstr(res, "PPC970FX")) mach = PPCG5;
+          else if (strstr(res, "PPC970MP")) mach = PPCG5;
++         else if (strstr(res, "POWER8")) mach = IbmPwr8;
+          else if (strstr(res, "POWER7")) mach = IbmPwr7;
+          else if (strstr(res, "POWER6")) mach = IbmPwr6;
+          else if (strstr(res, "POWER5")) mach = IbmPwr5;
+Index: ATLAS/include/atlas_pca.h
+===================================================================
+--- ATLAS.orig/include/atlas_pca.h
++++ ATLAS/include/atlas_pca.h
+@@ -26,7 +26,7 @@
+    #endif
+ #elif defined(ATL_ARCH_POWER3) || defined(ATL_ARCH_POWER4) || \
+       defined(ATL_ARCH_POWER5) || defined(ATL_ARCH_POWER6) || \
+-      defined(ATL_ARCH_POWER7)
++      defined(ATL_ARCH_POWER7) || defined(ATL_ARCH_POWER8)
+    #ifdef __GNUC__
+       #define ATL_membarrier __asm__ __volatile__ ("dcs")
+ /*      #define ATL_USEPCA 1 */
+Index: ATLAS/CONFIG/src/backend/archinfo_aix.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/backend/archinfo_aix.c
++++ ATLAS/CONFIG/src/backend/archinfo_aix.c
+@@ -67,6 +67,8 @@ enum MACHTYPE ProbeArch()
+       {
+          if (strstr(res, "PowerPC_POWER5"))
+             mach = IbmPwr5;
++         else if (strstr(res, "PowerPC_POWER8"))
++            mach = IbmPwr8;
+          else if (strstr(res, "PowerPC_POWER7"))
+             mach = IbmPwr7;
+          else if (strstr(res, "PowerPC_POWER6"))
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,70 @@
+Origin: http://sourceforge.net/p/math-atlas/patches/65/#3cb1
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: ELFv2 ABI changes (2/3)
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <normand at linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le abiv2 step2 patch
+Date: Mon, 28 Jul 2014 04:29:05 -0400
+
+atlas.ppc64le abiv2 step2 completes the changes already present in atlas 3.10.2
+* some files still use the ABI V1 opd section, which must be disabled for ABI V2:
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+
+Signed-off-by: Michel Normand <normand at linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c |    2 +-
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c |    3 ++-
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c |    2 +-
+ 3 files changed, 4 insertions(+), 3 deletions(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+@@ -268,7 +268,7 @@ Mjoin(.,ATL_USERMM):
+ 	.globl  Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+    #else
+-      #if defined(ATL_USE64BITS)
++      #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+  *      Official Program Descripter section, seg fault w/o it on Linux/PPC64
+  */
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+@@ -202,7 +202,7 @@ Mjoin(.,ATL_USERMM):
+ 	.globl  Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+    #else
+-      #if defined(ATL_USE64BITS)
++      #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+  *      Official Program Descripter section, seg fault w/o it on Linux/PPC64
+  */
+@@ -257,6 +257,7 @@ ATL_USERMM:
+    #endif
+ #endif
+ 
++
+ #if defined (ATL_USE64BITS)
+         ld      pC0, 120(r1)
+         ld      incCn, 128(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+@@ -196,7 +196,7 @@ void ATL_USERMM(const int M, const int N
+ 	.globl  Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+ #else
+-   #if defined(ATL_USE64BITS)
++   #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+  *      Official Program Descripter section, seg fault w/o it on Linux/PPC64
+  */
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,191 @@
+Origin: http://sourceforge.net/p/math-atlas/patches/65/#3cb1
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: ELFv2 ABI changes (3/3)
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <normand at linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le abiv2 step3
+Date: Tue, 29 Jul 2014 15:33:18 +0200
+
+atlas.3.10.2 ppc64le abiv2 step3
+* change offsets of parameters read from stack to avoid some segfaults.
+  (values change 120 => 104 and 128 => 112, identified by gdb investigation)
+
+Despite this step3 patch, two problems remain on the ppc64le architecture:
+* TODO: seg-faults still show up on the console during build/check;
+this is not critical (without make check) and the rpms are generated on Fedora.
+Unable to investigate because of the problem tracked by issue 950:
+https://sourceforge.net/p/math-atlas/support-requests/950/
+
+* TODO: make check fails because xsslvtst fails to execute;
+this is related to vector assembly code that assumes a big-endian environment,
+as written in ATL_cmm4x4x128_av.c and ATL_smm4x4x128_av.c.
+It would need significant work to support little-endian, as per the
+endianness comments on all PowerPC vector instructions in:
+https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D/$file/vector_simd_pem.ppc.2005AUG23.pdf
+
+Signed-off-by: Michel Normand <normand at linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c |    7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c |    7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c |    7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c |   17 ++++++++++++++++-
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c |   21 +++++++++++++++++++++
+ 5 files changed, 58 insertions(+), 1 deletion(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
+@@ -405,8 +405,15 @@ Mjoin(_,ATL_USERMM):
+  */
+ #ifdef ATL_GAS_LINUX_PPC
+    #ifdef ATL_USE64BITS
++      #if _CALL_ELF == 2
++      /* ABIv2 */
++        ld      pC0, 104(r1)
++        ld      incCn, 112(r1)
++      #else
++      /* ABIv1 */
+ 	ld 	pC0, 120(r1)
+ 	ld 	incCn, 128(r1)
++      #endif
+    #else
+ 	lwz	incCn, FSIZE+8(r1)
+    #endif
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+@@ -324,8 +324,15 @@ ATL_USERMM:
+ #endif
+ 
+ #ifdef ATL_USE64BITS
++#if _CALL_ELF == 2
++/* ABIv2 */
++        ld      pC0, 104(r1)
++        ld      incCn, 112(r1)
++#else
++/* ABIv1 */
+         ld      pC0, 120(r1)
+         ld      incCn, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
+         lwz     pC0, 68(r1)
+         lwz     incCn,  72(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+@@ -170,13 +170,21 @@ void ATL_USERMM(const int M, const int N
+                 const TYPE beta, TYPE *C, const int ldc)
+                                   (r10)    8(r1)
+ *******************************************************************************
+-64 bit ABIs:
++64 bit ABIv1s:
+                          r3           r4           r5             r6/f1
+ void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
+                            r7             r8             r9            r10
+                 const TYPE *A, const int lda, const TYPE *B, const int ldb,
+                              f2   120(r1)        128(r1)
+                 const TYPE beta, TYPE *C, const int ldc)
++
++64 bit ABIv2s:
++                         r3           r4           r5             r6/f1
++void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
++                           r7             r8             r9            r10
++                const TYPE *A, const int lda, const TYPE *B, const int ldb,
++                             f2   104(r1)        112(r1)
++                const TYPE beta, TYPE *C, const int ldc)
+ #endif
+ #ifdef ATL_AS_AIX_PPC
+         .csect .text[PR]
+@@ -259,8 +267,15 @@ ATL_USERMM:
+ 
+ 
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++        ld      pC0, 104(r1)
++        ld      incCn, 112(r1)
++#else
++/* ABIv1 */
+         ld      pC0, 120(r1)
+         ld      incCn, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
+         lwz     pC0, 68(r1)
+         lwz     incCn,  72(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+@@ -221,8 +221,15 @@ ATL_USERMM:
+  *      kernel instead
+  */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++        ld      r10, 104(r1)
++        ld      r5, 112(r1)
++#else
++/* ABIv1 */
+         ld      r10, 120(r1)
+         ld      r5, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+         lwz     r10, 60(r1)
+         lwz     r5,  64(r1)
+@@ -285,8 +292,15 @@ ATL_USERMM:
+         eqv     r0, r0, r0      /* all 1s */
+         ATL_WriteVRSAVE(r0)     /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++        /* ABIv2 */
++        ld      pC0, FSIZE+104(r1)
++        ld      ldc, FSIZE+112(r1)
++#else
++        /* ABIv1 */
+         ld      pC0, FSIZE+120(r1)
+         ld      ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+         lwz     pC0, FSIZE+60(r1)
+         lwz     ldc,  FSIZE+64(r1)
+@@ -4258,8 +4272,15 @@ UNALIGNED_C:
+         eqv     r0, r0, r0      /* all 1s */
+         ATL_WriteVRSAVE(r0)     /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++        /* ABIv2 */
++        ld      pC0, FSIZE+104(r1)
++        ld      ldc, FSIZE+112(r1)
++#else
++        /* ABIv1 */
+         ld      pC0, FSIZE+120(r1)
+         ld      ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+         lwz     pC0, FSIZE+60(r1)
+         lwz     ldc,  FSIZE+64(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
+@@ -258,8 +258,15 @@ ATL_USERMM:
+         eqv     r0, r0, r0      /* all 1s */
+         ATL_WriteVRSAVE(r0)     /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++        ld      pC0, FSIZE+104(r1)
++        ld      ldc, FSIZE+112(r1)
++#else
++/* ABIv1 */
+         ld      pC0, FSIZE+120(r1)
+         ld      ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+         lwz     pC0, FSIZE+60(r1)
+         lwz     ldc,  FSIZE+64(r1)
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,163 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c42
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: Skip optimizations for big-endian PowerPC.
+ Some of the existing optimized files/cases for PowerPC
+ contain assembly instructions with implicit big-endian
+ behavior - thus incorrect for the little-endian mode -
+ causing test failures during the build (i.e., FTBFS).
+ This is being worked on; this is the workaround for now.
+ Author's comments in patch 'abiv2 step3'.
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <normand at linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le do not use files with lvx
+Date: Tue, 12 Aug 2014 16:07:06 +0200
+
+ppc64le do not use files with lvx
+This is a temporary patch as long as the related files
+are not ported yet to ppc64 little-endian.
+
+Warning: patch to be applied only for ppc64le architecture
+and will also need atlas-new_archdef_for_ppc64le.patch
+
+Signed-off-by: Michel Normand <normand at linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ccases.flg |    6 +-----
+ tune/blas/gemm/CASES/dcases.flg |    8 +-------
+ tune/blas/gemm/CASES/dcases.vnb |    4 ----
+ tune/blas/gemm/CASES/scases.flg |    9 +--------
+ tune/blas/gemm/CASES/scases.vnb |    3 ---
+ tune/blas/gemm/CASES/zcases.flg |    8 +-------
+ 6 files changed, 4 insertions(+), 34 deletions(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ccases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ccases.flg
++++ ATLAS/tune/blas/gemm/CASES/ccases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-24
++22
+ 304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c     "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
+@@ -48,13 +48,9 @@ gcc
+ 328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c  "R. Clint Whaley" \
+ gcc
+ -fomit-frame-pointer -O2 -fno-tree-loop-optimize
+-329 192 4 4 4 1 16 4 4 4 ATL_cmm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+ 331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c  "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c  "IBM"
+ 333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/scases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/scases.flg
++++ ATLAS/tune/blas/gemm/CASES/scases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-25
++22
+ 304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c     "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
+@@ -48,16 +48,9 @@ gcc
+ 328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c  "R. Clint Whaley" \
+ gcc
+ -fomit-frame-pointer -O2 -fno-tree-loop-optimize
+-329 192 4 4 4 1 16 4 4 4 ATL_smm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+-330 200 92 92 92 1 16 92 92 92 ATL_smm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+ 331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c  "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c  "IBM"
+ 333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/scases.vnb
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/scases.vnb
++++ ATLAS/tune/blas/gemm/CASES/scases.vnb
+@@ -31,9 +31,6 @@
+ # Defaults: TA='t', TB='n', SSE=0, X87=0, LDBOT=1, RTKU=0, AOUTER=0,
+ #           KBMAX=KU, KBMIN=KU, BETAN1=0, RTMN=1
+ #
+-ID=1  ROUT='ATL_smm4x4x128_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=4 \
+-      LDKB=1 LDBOT=1 KBMIN=4 KBMAX=128 ASM=GAS_PPC \
+-      COMP='gcc' FLAGS='-x assembler-with-cpp'
+ ID=2  ROUT='ATL_smm4x4x16_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=16 \
+       LDKB=1 LDBOT=0 KBMIN=16 KBMAX=2048 ASM=GAS_SPARC \
+       COMP='gcc' FLAGS='-x assembler-with-cpp'
+Index: ATLAS/tune/blas/gemm/CASES/dcases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/dcases.flg
++++ ATLAS/tune/blas/gemm/CASES/dcases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-32
++30
+ 306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c     "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
+@@ -79,12 +79,6 @@ gcc
+ 336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c  "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
+-gcc
+--x assembler-with-cpp
+-338 192 8 4 2 1 0 8 4 2  ATL_dmm8x4x2_vsx.c  "IBM" \
+-gcc
+--O3 -mvsx
+ 339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/dcases.vnb
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/dcases.vnb
++++ ATLAS/tune/blas/gemm/CASES/dcases.vnb
+@@ -53,10 +53,6 @@ ID=6  ROUT='ATL_dmm4x1x90_x87.c' AUTH='R
+ ID=7  ROUT='ATL_dmm8x1x120_sse2.c' AUTH='R. Clint Whaley' \
+       MU=8 NU=1 KU=1 KBMAX=512 ASM=GAS_x8664 BETAN1=1 \
+       COMP='gcc' FLAGS='-m64 -x assembler-with-cpp'
+-ID=70 ROUT='ATL_dmm4x4x80_ppc.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
+-      MU=4 NU=4 KU=1 KBMIN=1 KBMAX=80 ASM=GAS_PPC BETAN1=0 LDBOT=0 \
+-      LDAB=0 LDISKB=1 RTN=1 RTM=1 RTK=0 \
+-      COMP='gcc' FLAGS='-x assembler-with-cpp'
+ ID=80 ROUT='ATL_dmm4x4x16r8_US.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
+       MU=4 NU=4 KU=24 KBMIN=24 KBMAX=512 ASM=GAS_SPARC BETAN1=0 \
+       LDAB=0 RTK=1 RTN=1 RTM=1 LDBOT=0 LDISKB=1 LDAB=1 \
+Index: ATLAS/tune/blas/gemm/CASES/zcases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/zcases.flg
++++ ATLAS/tune/blas/gemm/CASES/zcases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-31
++29
+ 306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c     "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
+@@ -76,12 +76,6 @@ gcc
+ 336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c  "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
+-gcc
+--x assembler-with-cpp
+-338 192 8 4 2 1 0 8 4 2  ATL_dmm8x4x2_vsx.c  "IBM" \
+-gcc
+--O3 -mvsx
+ 339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
diff -Nru atlas-3.10.2/debian/patches/series.ppc64el atlas-3.10.2/debian/patches/series.ppc64el
--- atlas-3.10.2/debian/patches/series.ppc64el	1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/series.ppc64el	2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,5 @@
+ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch
+ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch
+ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
+ppc64el/atlas-new_archdef_for_ppc64le.patch
+ppc64el/atlas.3.10.2-add_power8_cpu.patch
diff -Nru atlas-3.10.2/debian/rules atlas-3.10.2/debian/rules
--- atlas-3.10.2/debian/rules	2014-10-15 16:36:14.000000000 -0300
+++ atlas-3.10.2/debian/rules	2014-10-24 19:45:37.000000000 -0200
@@ -17,6 +17,8 @@
 # - 51 means ARMv7: for armhf (but not for armel, which is ARM >= v4)
 # - 52 means GENERIC: the same than 0 (UNKNOWN), except that it does not try autodetection
 #   See debian/patches/generic.diff
+#   [ppc64el] *53* means GENERIC, because the POWER8 patch adds a machine entry.
+#   See debian/patches/ppc64el/
 # Second number in ARCHS:
 # - 1 means no instruction set extension
 # - 384 means SSE1+SSE2 (always available on amd64)
@@ -31,8 +33,12 @@
 else ifeq ($(DEB_HOST_ARCH),armhf)
 ARCHS=base_51_1
 else
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+ARCHS=base_53_1
+else
 ARCHS=base_52_1
 endif
+endif
 
 # Pointer bitwidth
 MODE_BITWIDTH = $(shell dpkg-architecture -qDEB_HOST_ARCH_BITS)
@@ -86,6 +92,37 @@
 		(test -f CONFIG/ARCHS/ARMv732.tar.bz2.old && mv CONFIG/ARCHS/ARMv732.tar.bz2.old CONFIG/ARCHS/ARMv732.tar.bz2) || true
 		(test -f CONFIG/ARCHS/ARMv732NEON.tar.bz2.old && mv CONFIG/ARCHS/ARMv732NEON.tar.bz2.old CONFIG/ARCHS/ARMv732NEON.tar.bz2) || true
 
+# The ppc64el patches affect general powerpc files, and are work in progress.
+# So, for now, restrict them to the ppc64el builds. (debian/patches/ppc64el/)
+PATCH_SERIES_CURRENT := debian/patches/series
+PATCH_SERIES_PPC64EL := debian/patches/series.ppc64el
+PATCH_SERIES_ORIG    := debian/patches/series.ppc64el-orig
+
+patch-ppc64el:
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+	dh_testdir
+	# Store original patch series, append ppc64el patches, and apply them.
+	if ! test -f $(PATCH_SERIES_ORIG); then \
+		cp -a $(PATCH_SERIES_CURRENT) $(PATCH_SERIES_ORIG); \
+		cat $(PATCH_SERIES_PPC64EL) >> $(PATCH_SERIES_CURRENT); \
+		while quilt --quiltrc /dev/null next | grep '^ppc64el/'; do \
+			quilt --quiltrc /dev/null push; \
+		done; \
+	fi
+endif
+
+unpatch-ppc64el:
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+	dh_testdir
+	# Unapply ppc64el patches, and restore original patch series.
+	if test -f $(PATCH_SERIES_ORIG); then \
+		while quilt --quiltrc /dev/null top | grep '^ppc64el/'; do \
+			quilt --quiltrc /dev/null pop -R; \
+		done; \
+		mv $(PATCH_SERIES_ORIG) $(PATCH_SERIES_CURRENT); \
+	fi
+endif
+
 # Build a custom package optimized for the current arch
 custom: custom-stamp
 .PHONY: custom
@@ -122,7 +159,7 @@
 		touch $@
 
 common-configure-arch common-configure-indep:: configure-stamp
-configure-stamp:
+configure-stamp: patch-ppc64el
 		dh_testdir
 
 		set -e;											\
@@ -167,7 +204,7 @@
 		touch $@
 
 clean:: clean-work
-clean-work: restore-armhf-archdef
+clean-work: restore-armhf-archdef unpatch-ppc64el
 		dh_testdir
 		dh_testroot
 		rm -rf build check
diff -Nru atlas-3.10.2/debian/source/include-binaries atlas-3.10.2/debian/source/include-binaries
--- atlas-3.10.2/debian/source/include-binaries	2014-07-12 07:23:26.000000000 -0300
+++ atlas-3.10.2/debian/source/include-binaries	2014-10-24 19:45:37.000000000 -0200
@@ -6,5 +6,6 @@
 debian/archdefs/mips/GENERIC32.tar.bz2
 debian/archdefs/mipsel/GENERIC32.tar.bz2
 debian/archdefs/powerpc/GENERIC32.tar.bz2
+debian/archdefs/ppc64el/GENERIC64LE.tar.bz2
 debian/archdefs/s390x/IBMz964.tar.bz2
 debian/archdefs/sparc/USI32.tar.bz2
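
A note on the offsets in atlas.3.10.2-ppc64le_abiv2_step3.patch above: the
change from 120/128 to 104/112 is consistent with the smaller stack frame
header in the ELFv2 ABI (32 bytes, versus 48 bytes in ELFv1), with the
parameter save area starting right after that header. This is a minimal
sketch only; param_save_offset() is a hypothetical helper for illustration
(not ATLAS code), and parameters are counted as 8-byte doublewords from
zero, so that in ATL_USERMM the pointer C is doubleword 9 and ldc is
doubleword 10:

```c
#include <stddef.h>

/* Stack frame header sizes in bytes; the parameter save area follows. */
enum { ELFV1_HEADER_BYTES = 48, ELFV2_HEADER_BYTES = 32 };

/* Hypothetical helper: offset from r1 (at function entry) of the save-area
 * slot for parameter doubleword `index`. `call_elf` mirrors the compiler's
 * _CALL_ELF macro: 2 selects ELFv2, anything else is treated as ELFv1. */
size_t param_save_offset(int call_elf, size_t index)
{
    size_t header = (call_elf == 2) ? ELFV2_HEADER_BYTES : ELFV1_HEADER_BYTES;
    return header + 8 * index;
}
```

For doublewords 9 and 10 this gives 120 and 128 under ELFv1, and 104 and
112 under ELFv2, which are the values used in the #if _CALL_ELF == 2
branches of the patch.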

