[Debian-med-packaging] Bug#1086108: scipy: endless loop in expm on riscv64 due to UB float to int conversion

Aurelien Jarno aurel32 at debian.org
Sat Oct 26 20:21:20 BST 2024


Source: scipy
Version: 1.13.1-5
Severity: important
Tags: ftbfs upstream
X-Debbugs-Cc: debian-med-packaging at lists.alioth.debian.org, debian-riscv at lists.debian.org
User: debian-riscv at lists.debian.org
Usertags: riscv64
Control: affects -1 + src:nipy
Control: found -1 scipy/1.14.0-1exp5

Dear maintainer,

I have been debugging the cause of the FTBFS of nipy on riscv64. It
fails with [1]:

| nipy/core/reference/tests/test_matrix_groups.py .
| E: Build killed with signal TERM after 600 minutes of inactivity

I have been able to extract a minimal reproducer calling the SciPy expm
function:

| #!/usr/bin/python3
| 
| import numpy as np
| from scipy.linalg import expm
| 
| Z = np.array([[-0.83555296,  1.23536117, -0.54084919],
|               [ 0.48341885, -0.55882754, -0.53693891],
|               [-0.14802191, -0.43249490,  0.53730155]])
| # Z = np.random.standard_normal((3,3))
| 
| orth = expm(Z - Z.T)
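
Because the reproducer does not return on an affected machine, it can be convenient to run it under a timeout. The following is only a sketch of one way to do that; the helper name run_with_timeout is hypothetical and not part of SciPy or nipy. It runs a piece of Python source in a child interpreter and reports whether it finished within the given time.

```python
import subprocess
import sys


def run_with_timeout(script: str, seconds: float) -> str:
    """Run Python source in a child interpreter (hypothetical helper).

    Returns "finished" if the child exits within `seconds`, otherwise
    "timeout". subprocess.run() kills the child when the timeout expires.
    """
    try:
        subprocess.run([sys.executable, "-c", script], timeout=seconds)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "finished"


if __name__ == "__main__":
    # A trivial script finishes; a busy loop is cut off after one second.
    print(run_with_timeout("pass", 30))
    print(run_with_timeout("while True: pass", 1))
```

Passing the reproducer source as the script argument, with a timeout of a few minutes, should be enough to tell a working build from an affected one.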

The original code uses np.random.standard_normal, but I changed that to
a static array so that the same code path is always taken. Debugging further,
it turns out that pick_pade_structure() in the Cython expm support
code relies on undefined behaviour (UB) when converting a non-finite
floating point value to int in a few places [2]. According to the C
standard (the Cython code is compiled to C) this is undefined behaviour:

| When a finite value of real floating type is converted to an integer
| type other than _Bool, the fractional part is discarded (i.e., the
| value is truncated toward zero). If the value of the integral part
| cannot be represented by the integer type, the behavior is undefined.

On x86, the conversion returns 0, but on RISC-V it returns either
INT_MIN or INT_MAX depending on the sign of the floating point value. As
the result of this conversion is then used in a loop [3], the loop takes
so long to execute that it appears endless.
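
To make the difference concrete, here is a small Python model of a saturating conversion. It is an illustration only, not the SciPy code: the function name saturating_int is hypothetical, a 32-bit int is assumed, and mapping NaN to INT_MAX is an assumption of the model.

```python
import math

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumed range of a 32-bit signed C int


def saturating_int(x: float) -> int:
    """Hypothetical model of a float-to-int conversion that saturates."""
    if math.isnan(x):
        return INT_MAX  # assumption of this model: NaN maps to the maximum
    if x >= INT_MAX:
        return INT_MAX  # covers +inf and large finite values
    if x <= INT_MIN:
        return INT_MIN  # covers -inf and large negative finite values
    return int(x)  # in range: truncate toward zero, as C does


if __name__ == "__main__":
    s = saturating_int(math.inf)
    print(s)  # 2147483647
```

A value of that size used as an iteration count means roughly two billion passes through the loop, whereas a result of 0 skips the loop entirely.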

The problem has been fixed upstream as part of the rewrite of the Cython
code into C [4][5]. From what I understand, this will go into SciPy 1.15.

I am not sure how you want to get that fixed in the Debian package. I
see two options:
- Backport the upstream commits [4][5]
- Change the Cython code to check whether the numbers are finite before
  converting them to int, and return 0 if they are not.
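
The logic of the second option could look roughly like the sketch below. It is written in Python for readability; the real change would be in the Cython source, and the helper name finite_to_int_or_zero is hypothetical.

```python
import math


def finite_to_int_or_zero(x: float) -> int:
    """Convert x to int, returning 0 for NaN and +/-inf (hypothetical helper)."""
    if not math.isfinite(x):
        return 0  # matches the result observed on x86
    return int(x)  # finite values are truncated toward zero


if __name__ == "__main__":
    print(finite_to_int_or_zero(2.9))       # 2
    print(finite_to_int_or_zero(math.inf))  # 0
    print(finite_to_int_or_zero(math.nan))  # 0
```

Note that in C a finite value outside the range of int is still undefined behaviour, so a complete patch might also need a range check; this sketch only covers the non-finite case described above.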

Please tell me which option you prefer, and I will work on the
corresponding patch.

Note that the build failures on mips64el and sparc64 are very likely to
be the same issue.

Regards
Aurelien

[1] https://buildd.debian.org/status/fetch.php?pkg=nipy&arch=riscv64&ver=0.6.1-1&stamp=1729595150&raw=0
[2] https://sources.debian.org/src/scipy/1.13.1-5/scipy/linalg/_matfuncs_expm.pyx.in/#L224
[3] https://sources.debian.org/src/scipy/1.13.1-5/scipy/linalg/_matfuncs.py/#L345
[4] https://github.com/scipy/scipy/pull/21553
[5] https://github.com/scipy/scipy/commit/424708ed018cae6b6584d7d992940fd39f2ebcc0
[6] https://github.com/scipy/scipy/commit/84d4bd91213a89cc59d67dd9045625055c2cc463


