[Debian-astro-maintainers] Bug#902990: FTBFS: bus error, alignment problems

Steve McIntyre steve at einval.com
Wed Jul 4 16:56:30 BST 2018


Source: python-fitsio
Version: 0.9.11+dfsg-1
Severity: serious
Justification: fails to build from source (but built successfully in the past)

Hi!

First seen on the buildd arm-arm-01, which is an arm64 machine
configured to build for armhf, build log at

https://buildd.debian.org/status/fetch.php?pkg=python-fitsio&arch=armhf&ver=0.9.11%2Bdfsg-1%2Bb2&stamp=1530604224&raw=0

I can similarly reproduce this on other arm64 hardware. Backtrace
shows:

(gdb) bt
#0  ffi8fi8 (input=input@entry=0x16568df, ntodo=ntodo@entry=1, scale=1, zero=0, output=output@entry=0xfff3d1a0, status=status@entry=0xfff44384) at putcolj.c:1871
#1  0xf6cc5a8a in ffpcljj (fptr=0x15d3660, colnum=8, firstrow=<optimized out>, firstelem=1, nelem=1, array=0x16568df, status=0xfff44384) at putcolj.c:1428
#2  0xf6cc5c3c in ffpcljj (fptr=<optimized out>, colnum=<optimized out>, firstrow=<optimized out>, firstelem=firstelem@entry=1, nelem=nelem@entry=1, array=array@entry=0x16568df,
    status=status@entry=0xfff44384) at putcolj.c:1538
#3  0xf6cbc2b4 in ffpcl (fptr=<optimized out>, datatype=<optimized out>, colnum=<optimized out>, firstrow=1, firstelem=1, nelem=1, array=0x16568df, status=0xfff44384) at putcol.c:701
#4  0xf6e02a26 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) p input
$2 = (LONGLONG *) 0x16568df

Disassembly of the problem area gives:

   0xf6cc54ce <+206>:   ldr.w   r0, [r10]
   0xf6cc54d2 <+210>:   ldmia.w sp!, {r4, r5, r6, r7, r8, r10, r11, pc}
   0xf6cc54d6 <+214>:   cmp     r1, #0
   0xf6cc54d8 <+216>:   ble.n   0xf6cc54ca <ffi8fi8+202>
   0xf6cc54da <+218>:   subs    r2, #8
   0xf6cc54dc <+220>:   add.w   r1, r0, r1, lsl #3
=> 0xf6cc54e0 <+224>:   ldrd    r4, r5, [r0], #8
   0xf6cc54e4 <+228>:   cmp     r1, r0
   0xf6cc54e6 <+230>:   strd    r4, r5, [r2, #8]!
   0xf6cc54ea <+234>:   bne.n   0xf6cc54e0 <ffi8fi8+224>
   0xf6cc54ec <+236>:   vpop    {d8-d12}
   0xf6cc54f0 <+240>:   ldr.w   r0, [r10]

The offending ldrd instruction here clearly matches up with the
source, which is the simple-case loop in ffi8fi8():

    if (scale == 1. && zero == 0.)
    {       
        for (ii = 0; ii < ntodo; ii++)
                output[ii] = input[ii];
    }

The ldrd instruction is the read from input[ii]. That address is
clearly unaligned: input is a LONGLONG * pointing at 0x16568df, which
is an odd address and so nowhere near aligned for a 64-bit load.

This doesn't look like an issue with the compiler coalescing unrelated
values (which was an initial guess at the problem when discussing it
in #debian-buildd). It's simply reading an unaligned 64-bit value and
that's failing.
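
To make the alignment point concrete, here is a minimal sketch in C of
how a caller could check an address and copy 64-bit values out of a
possibly unaligned buffer. The helpers is_aligned() and
copy_i64_unaligned() are hypothetical names used for illustration only.
They are not part of CFITSIO or python-fitsio, and this is not a
proposed patch, just one common way to avoid the faulting load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of CFITSIO: report whether the address
 * addr is a multiple of align (align must be a power of two). */
static int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: copy n 64-bit values from a byte address that
 * may be unaligned into a properly aligned output array. memcpy makes
 * no alignment assumption about src, so the compiler has to generate
 * code that is safe for any source address. */
static void copy_i64_unaligned(const void *src, int64_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, n * sizeof *dst);
}
```

With the pointer from the backtrace, is_aligned(0x16568df, 8) is false,
since 0x16568df leaves a remainder of 7 when divided by 8. A direct
input[ii] dereference through a LONGLONG * tells the compiler the
address is suitably aligned, which is what allows it to emit ldrd.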

-- System Information:
Debian Release: 9.4
  APT prefers stable-debug
  APT policy: (500, 'stable-debug'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.9.0-6-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)


