[Debian-med-packaging] Bug#1086108: scipy: endless loop in expm on riscv64 due to UB float to int conversion
Aurelien Jarno
aurel32 at debian.org
Sat Oct 26 20:21:20 BST 2024
Source: scipy
Version: 1.13.1-5
Severity: important
Tags: ftbfs upstream
X-Debbugs-Cc: debian-med-packaging at lists.alioth.debian.org, debian-riscv at lists.debian.org
User: debian-riscv at lists.debian.org
Usertags: riscv64
Control: affects -1 + src:nipy
Control: found -1 scipy/1.14.0-1exp5
Dear maintainer,
I have been debugging the cause of the FTBFS of nipy on riscv64. It
fails with [1]:
| nipy/core/reference/tests/test_matrix_groups.py .
| E: Build killed with signal TERM after 600 minutes of inactivity
I have been able to extract a minimal reproducer calling the SciPy expm
function:
| #!/usr/bin/python3
|
| import numpy as np
| from scipy.linalg import expm
|
| Z = np.array([[-0.83555296,  1.23536117, -0.54084919],
|               [ 0.48341885, -0.55882754, -0.53693891],
|               [-0.14802191, -0.43249490,  0.53730155]])
| # Z = np.random.standard_normal((3,3))
|
| orth = expm(Z - Z.T)
The original code uses np.random.standard_normal, but I changed that into
a static array so that the same code path is always taken. Debugging
further, it turns out that pick_pade_structure() in the Cython expm
support code relies on undefined behaviour (UB) when converting a
non-finite floating point value to int in a few places [2]. According to
the C standard this is undefined behaviour:
| When a finite value of real floating type is converted to an integer
| type other than _Bool, the fractional part is discarded (i.e., the
| value is truncated toward zero). If the value of the integral part
| cannot be represented by the integer type, the behavior is undefined.
On x86, the conversion returns 0, but on RISC-V it returns either
INT_MIN or INT_MAX depending on the sign of the floating point value. As
the result of this conversion is then used in a loop [3], the loop takes
an eternity to execute and simply appears endless.
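To make the effect concrete, here is a small Python model of the
behaviour described above. It is only a sketch: model_cast_to_int() and
squaring_iterations() are hypothetical helpers, the platform results are
the ones reported in this mail rather than the output of a real C cast,
int is assumed to be 32 bits wide, and the converted value is assumed to
end up as the iteration count of a "for _ in range(s)" style loop.

```python
import math

INT_MAX = 2**31 - 1   # assumption: 32-bit C int
INT_MIN = -2**31


def model_cast_to_int(x, platform):
    """Model the (double -> int) results described in this report.

    Hypothetical helper: it mimics the reported behaviour and does not
    execute a real C cast.
    """
    if math.isnan(x):
        raise ValueError("NaN is not covered by this model")
    if math.isfinite(x):
        # Well-defined case (value assumed to fit): truncate toward zero.
        return math.trunc(x)
    # Infinite input: undefined behaviour in C, result depends on the CPU.
    if platform == "x86":
        return 0                                  # as reported above
    if platform == "riscv64":
        return INT_MAX if x > 0 else INT_MIN      # as reported above
    raise ValueError(f"unknown platform model: {platform}")


def squaring_iterations(s):
    """Number of passes a `for _ in range(s)` loop would make."""
    return len(range(s))


for platform in ("x86", "riscv64"):
    s = model_cast_to_int(math.inf, platform)
    print(platform, s, squaring_iterations(s))
```

Under these assumptions the x86 model yields a loop that does not run at
all, while the riscv64 model yields a loop of INT_MAX iterations, which
matches the apparent hang.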
The problem has been fixed upstream as part of the rewrite of the Cython
code into C [4][5]. From what I understand this will go into SciPy 1.15.
I am not sure how you want to get that fixed in the Debian package. I
see two options:
- Backport the upstream commits [4][5]
- Change the Cython code to check whether the numbers are finite before
conversion to int, and if not return 0.
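The logic of the second option can be sketched in plain Python as
follows. This is an illustration only, not the actual patch:
finite_to_int_or_zero() is a hypothetical name, and in the Cython source
the same test would have to be placed in front of each affected cast.

```python
import math


def finite_to_int_or_zero(x):
    """Convert x to an integer, returning 0 for inf and NaN.

    Hypothetical helper showing the guard only; returning 0 mirrors the
    x86 result described in this report.
    """
    if not math.isfinite(x):
        return 0
    return math.trunc(x)  # truncate toward zero, like a C cast


print(finite_to_int_or_zero(3.9))        # finite: truncated
print(finite_to_int_or_zero(math.inf))   # non-finite: 0
```

Returning 0 keeps the result identical to what the code already gets on
x86, so the guarded version should not change behaviour there.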
Please tell me which option you prefer, and I will work on the
corresponding patch.
Note that the build failures on mips64el and sparc64 are very likely to
be the same issue.
Regards
Aurelien
[1] https://buildd.debian.org/status/fetch.php?pkg=nipy&arch=riscv64&ver=0.6.1-1&stamp=1729595150&raw=0
[2] https://sources.debian.org/src/scipy/1.13.1-5/scipy/linalg/_matfuncs_expm.pyx.in/#L224
[3] https://sources.debian.org/src/scipy/1.13.1-5/scipy/linalg/_matfuncs.py/#L345
[4] https://github.com/scipy/scipy/pull/21553
[5] https://github.com/scipy/scipy/commit/424708ed018cae6b6584d7d992940fd39f2ebcc0
[6] https://github.com/scipy/scipy/commit/84d4bd91213a89cc59d67dd9045625055c2cc463