Bug#767138: fftw3: runtime detection of NEON is perhaps broken
Edmund Grimley Evans
edmund.grimley.evans at gmail.com
Tue Oct 28 17:13:37 UTC 2014
Source: fftw3
Version: 3.3.4-1.1
In simd-support/neon.c I found:
static int really_have_neon(void)
{
     void (*oldsig)(int);
     oldsig = signal(SIGILL, sighandler);
     if (setjmp(jb)) {
          signal(SIGILL, oldsig);
          return 0;
     } else {
          /* paranoia: encode the instruction in binary because the
             assembler may not recognize it without -mfpu=neon */
          /*asm volatile ("vand q0, q0, q0");*/
          asm volatile (".long 0xf2000150");
          signal(SIGILL, oldsig);
          return 1;
     }
}
Of course, that binary encoding of the VAND instruction is only valid
for ARM mode, not Thumb, and the library is mostly compiled for Thumb,
I think.
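To illustrate why the same 32-bit word cannot serve both instruction sets, here is a minimal sketch of how, as far as I understand the ARM architecture manual, an ARM-mode Advanced SIMD data-processing encoding (top bits 1111 001U) relates to its Thumb-2 counterpart (top bits 111U 1111). The helper name arm_to_thumb_simd_dp is hypothetical and is not part of fftw3.

```c
#include <stdint.h>

/* Hypothetical helper, not from fftw3: map an ARM-mode Advanced SIMD
   data-processing encoding (1111 001U ...) to the Thumb-2 form
   (111U 1111 ...). Returns 0 if the word is not in that class. */
static uint32_t arm_to_thumb_simd_dp(uint32_t arm)
{
    uint32_t u;

    /* Bits 31..25 must be 1111001 for this encoding class. */
    if ((arm & 0xfe000000u) != 0xf2000000u)
        return 0;

    u = (arm >> 24) & 1u;                 /* the U bit moves to bit 28 */
    return 0xef000000u | (u << 28) | (arm & 0x00ffffffu);
}
```

Under that assumption, 0xf2000150 maps to 0xef000150, so a Thumb build would need a different constant, emitted as two halfwords with the high one first.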
In fact, I think I have tracked down where this code appears in the
binary. In libfftw3f.so.3.4.4 I found:
a9f84: 490f ldr r1, [pc, #60] ; (a9fc4 <fftwf_guru64_kosherp+0xa4>)
a9f86: 2004 movs r0, #4
a9f88: b500 push {lr}
a9f8a: 4479 add r1, pc
a9f8c: b083 sub sp, #12
a9f8e: f765 ed0c blx f9a8 <_init+0x33c>
a9f92: 9001 str r0, [sp, #4]
a9f94: 480c ldr r0, [pc, #48] ; (a9fc8 <fftwf_guru64_kosherp+0xa8>)
a9f96: 4478 add r0, pc
a9f98: f765 ec66 blx f868 <_init+0x1fc>
a9f9c: b948 cbnz r0, a9fb2 <fftwf_guru64_kosherp+0x92>
! a9f9e: 0150 lsls r0, r2, #5
! a9fa0: f200 2004 addw r0, r0, #516 ; 0x204
a9fa4: 9901 ldr r1, [sp, #4]
a9fa6: f765 ed00 blx f9a8 <_init+0x33c>
a9faa: 2001 movs r0, #1
a9fac: b003 add sp, #12
a9fae: f85d fb04 ldr.w pc, [sp], #4
a9fb2: 9901 ldr r1, [sp, #4]
a9fb4: 2004 movs r0, #4
a9fb6: f765 ecf8 blx f9a8 <_init+0x33c>
a9fba: 2000 movs r0, #0
a9fbc: b003 add sp, #12
a9fbe: f85d fb04 ldr.w pc, [sp], #4
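The two lines marked with "!" are what one would expect if the little-endian word 0xf2000150 is read as Thumb halfwords: 0150 decodes as a 16-bit lsls, and f200 starts a 32-bit instruction that absorbs the following halfword 2004, which on its own would be the movs r0, #4 seen at a9fb4. A small sketch of that split follows; the names thumb_view, split_le_word and starts_32bit_thumb are hypothetical and only for illustration.

```c
#include <stdint.h>

/* The two halfwords a Thumb decoder sees for one little-endian word. */
struct thumb_view {
    uint16_t first;   /* halfword at the lower address */
    uint16_t second;  /* halfword at the higher address */
};

static struct thumb_view split_le_word(uint32_t word)
{
    struct thumb_view v;
    v.first = (uint16_t)(word & 0xffffu);
    v.second = (uint16_t)(word >> 16);
    return v;
}

/* A halfword whose top five bits are 11101, 11110 or 11111 is the
   first half of a 32-bit Thumb-2 instruction. */
static int starts_32bit_thumb(uint16_t hw)
{
    return (hw >> 11) >= 0x1d;
}
```

For 0xf2000150 this gives 0x0150 followed by 0xf200, and only the second one starts a 32-bit instruction, which matches the disassembly above.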
This may explain some problems that people have experienced with
libfftw3:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=752514
http://lists.debian.org/debian-arm/2014/10/msg00051.html
Is this signal-handling approach the best way of detecting NEON? The
following blog post suggests using HWCAP, but I don't know if that would
work with the FreeBSD kernels:
http://community.arm.com/groups/android-community/blog/2014/10/10/runtime-detection-of-cpu-features-on-an-armv8-a-cpu
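For reference, a minimal sketch of what a HWCAP-based check might look like on 32-bit ARM Linux, assuming a C library that provides getauxval() (glibc 2.16 or later) and kernel headers that define HWCAP_NEON. The function name neon_from_hwcap is hypothetical. On any other platform, including the FreeBSD kernels, the sketch simply reports that it cannot tell.

```c
#if defined(__linux__)
#include <sys/auxv.h>    /* getauxval(), AT_HWCAP */
#endif
#if defined(__linux__) && defined(__arm__)
#include <asm/hwcap.h>   /* HWCAP_NEON on 32-bit ARM */
#endif

/* Hypothetical helper: returns 1 if the kernel reports NEON,
   0 if it reports no NEON, and -1 if this platform offers no
   HWCAP answer (assumption: only 32-bit ARM Linux is handled). */
static int neon_from_hwcap(void)
{
#if defined(__linux__) && defined(__arm__) && defined(HWCAP_NEON)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) ? 1 : 0;
#else
    return -1;
#endif
}
```

A caller could fall back to another detection method, or to the non-SIMD code path, whenever the result is -1.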