Bug#959366: python3-seaborn: Allow fallback to scipy when statsmodels is present and issues runtime errors

Étienne Mollier etienne.mollier at mailoo.org
Fri May 1 14:41:46 BST 2020


Package: python3-seaborn
Version: 0.10.0-1
Severity: important
Tags: patch upstream

Dear Maintainers,

With datasets having a particular distribution, the seaborn
library crashes with a RuntimeError raised by the
python3-statsmodels library.  statsmodels seems to expect
the caller to trap the error and adapt its logic, but seaborn
as currently available in Sid does not do so.
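
To illustrate the kind of trapping meant here, below is a minimal
sketch, not the upstream patch: the helper estimate_with_fallback
and both stub estimators are hypothetical names made up for this
example, standing in for a statsmodels-based and a scipy-based
estimator.

```python
import warnings


def estimate_with_fallback(data, primary, fallback):
    """Try the primary estimator; on RuntimeError, warn and use the fallback.

    Both `primary` and `fallback` are hypothetical callables that take the
    data and return a density estimate.
    """
    try:
        return primary(data)
    except RuntimeError as err:
        warnings.warn(
            f"primary estimator failed ({err}); using fallback", UserWarning
        )
        return fallback(data)


def primary_stub(data):
    # Stand-in for the failing call; reuses the message from the traceback.
    raise RuntimeError("Selected KDE bandwidth is 0. Cannot estimate density.")


def fallback_stub(data):
    # Stand-in for a second estimator; returns a marker instead of a density.
    return ("fallback", len(data))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = estimate_with_fallback([5.0, 5.0, 5.0], primary_stub, fallback_stub)
```

Only RuntimeError is trapped, so unrelated failures such as a
TypeError on malformed input would still propagate to the caller.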

There is a patch upstream that fixes this issue, available at
the following location:

	https://github.com/mwaskom/seaborn/commit/09fef026ad89a299e13db44fa5b92885fb5b2823

This patch is part of seaborn 0.10.1, so getting this fix
into Sid should hopefully just be a matter of upgrading to
that version or later.


This issue prevents the autopkgtest suite of NanoPlot, which
is currently being packaged by the Debian Med Team, from
running properly, hence the severity set to "important":

	https://salsa.debian.org/med-team/nanoplot

The issue can be reproduced by building the nanoplot package,
building the package python3-nanoget-examples which stores the
test data, installing both on the system, and running the test.
Here is the relevant part of the test output for reference:

	Traceback (most recent call last):
	  File "/usr/lib/python3/dist-packages/statsmodels/nonparametric/kde.py", line 451, in kdensityfft
	    bw = float(bw)
	ValueError: could not convert string to float: 'scott'
	
	During handling of the above exception, another exception occurred:
	
	Traceback (most recent call last):
	  File "/usr/bin/NanoPlot", line 11, in <module>
	    load_entry_point('NanoPlot==1.29.0', 'console_scripts', 'NanoPlot')()
	  File "/usr/lib/python3/dist-packages/nanoplot/NanoPlot.py", line 96, in main
	    plots = make_plots(datadf, settings)
	  File "/usr/lib/python3/dist-packages/nanoplot/NanoPlot.py", line 227, in make_plots
	    nanoplotter.scatter(
	  File "/usr/lib/python3/dist-packages/nanoplotter/nanoplotter_main.py", line 193, in scatter
	    plot = sns.jointplot(
	  File "/usr/lib/python3/dist-packages/seaborn/axisgrid.py", line 2338, in jointplot
	    grid.plot_marginals(kdeplot, **marginal_kws)
	  File "/usr/lib/python3/dist-packages/seaborn/axisgrid.py", line 1823, in plot_marginals
	    func(self.x, **kwargs)
	  File "/usr/lib/python3/dist-packages/seaborn/distributions.py", line 703, in kdeplot
	    ax = _univariate_kdeplot(data, shade, vertical, kernel, bw,
	  File "/usr/lib/python3/dist-packages/seaborn/distributions.py", line 293, in _univariate_kdeplot
	    x, y = _statsmodels_univariate_kde(data, kernel, bw,
	  File "/usr/lib/python3/dist-packages/seaborn/distributions.py", line 367, in _statsmodels_univariate_kde
	    kde.fit(kernel, bw, fft, gridsize=gridsize, cut=cut, clip=clip)
	  File "/usr/lib/python3/dist-packages/statsmodels/nonparametric/kde.py", line 138, in fit
	    density, grid, bw = kdensityfft(endog, kernel=kernel, bw=bw,
	  File "/usr/lib/python3/dist-packages/statsmodels/nonparametric/kde.py", line 453, in kdensityfft
	    bw = bandwidths.select_bandwidth(X, bw, kern) # will cross-val fit this pattern?
	  File "/usr/lib/python3/dist-packages/statsmodels/nonparametric/bandwidths.py", line 174, in select_bandwidth
	    raise RuntimeError(err)
	RuntimeError: Selected KDE bandwidth is 0. Cannot estimate density.
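
As a hedged illustration of how a selected bandwidth can end up
at 0, the sketch below assumes a Scott-style rule of thumb of
the form 1.059 * min(std, IQR / 1.349) * n^(-1/5).  The exact
formula used by statsmodels 0.11.1 is not quoted in this report,
and the function name and sample data are made up for the
example.

```python
import statistics


def scott_bandwidth(values):
    """Rule-of-thumb bandwidth (assumed form, see the note above)."""
    n = len(values)
    std = statistics.stdev(values)
    # Quartiles with linear interpolation over the sample itself.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(std, (q3 - q1) / 1.349)
    return 1.059 * spread * n ** (-0.2)


# More than half of the values are identical, so the IQR collapses to 0
# even though the standard deviation is not 0.
mostly_constant = [5.0] * 8 + [1.0, 9.0]
spread_out = [float(v) for v in range(1, 11)]

bw_constant = scott_bandwidth(mostly_constant)
bw_spread = scott_bandwidth(spread_out)
```

Under that assumption, bw_constant is 0 while bw_spread is
positive, which would match the "Selected KDE bandwidth is 0"
message for data dominated by a single repeated value.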

Manually patching the file seaborn/distributions.py with
upstream's approach to solving the problem allowed me to run
the autopkgtest suite of NanoPlot to completion.

Kind Regards,
Étienne.


-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386, riscv64

Kernel: Linux 5.5.6 (SMP w/4 CPU cores; PREEMPT)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-seaborn depends on:
ii  python3             3.8.2-3
ii  python3-matplotlib  3.2.1-1+b1
ii  python3-numpy       1:1.18.3-1
ii  python3-pandas      0.25.3+dfsg-9
ii  python3-scipy       1.4.1-2
ii  python3-tk          3.8.2-2

Versions of packages python3-seaborn recommends:
ii  python3-bs4    4.9.0-2
ii  python3-patsy  0.5.1-1

python3-seaborn suggests no packages.

Edited to add the package affecting the behavior of
python3-seaborn, while not being referenced in its metadata:
ii  python3-statsmodels  0.11.1-2

-- no debconf information


