[debian-mysql] Bug#1006111: mariadb-server: wrong groupby result in newly filled myISAM table

Cameron Davidson bugs at davidsoncj.id.au
Sat Feb 19 09:06:19 GMT 2022


Package: mariadb-server
Version: 1:10.5.12-0+deb11u1
Severity: important

Dear Maintainer,

I have rated it as halfway between 2 and 6, because it could result in
wrong answers being reported, but only in what I presume are rare situations.

This bug is present in current release of both Buster and Bullseye.
I have replicated it on fresh minimal installs on guest VMs, created
with xen-tools.
If I replace the Debian Buster packages with  upstream packages for version 10.3.34, or 10.7.3
then the problem is fixed. I cannot find upstream packages for Bullseye.

I cannot see a bug report on jira.mariabdb.com that describes the problem.

To replicate, I have attached a collection of sql scripts - source them
using the mysql client:
1. eBird_subset.sql   - will create the test database and starting tables - it might
not make a lot of sense, partly because many columns and supporting
tables have been removed to create a sufficiently small reproducible
example.
2. replicate-failure.sql  - will create and populate the intermediate table that
causes the problem (sampling_subset), and show two sets of summary results that have been
grouped by month (column "mon") and county (column "IDCountyCode")

The expected answer is given by using an intermediate table and is:

+------+--------------+-------+------+-----------+----------+
| mon  | IDCountyCode | birds | nRep | reportPct | frac_pt  |
+------+--------------+-------+------+-----------+----------+
|    1 | AU-QLD-GYM   |    79 |   10 |   14.4928 | 669.4915 |
|    2 | AU-QLD-GYM   |    50 |    8 |   29.6296 | 324.6753 |
|    3 | AU-QLD-GYM   |   114 |    8 |   24.2424 | 389.0785 |
|    1 | AU-QLD-LOV   |   495 |   50 |   40.0000 | 272.1275 |
|    2 | AU-QLD-LOV   |   145 |    9 |   27.2727 | 394.0217 |
|    3 | AU-QLD-LOV   |   161 |   12 |   28.5714 | 242.1053 |
|    1 | AU-QLD-RED   |    79 |   32 |   20.3822 | 312.2530 |
|    2 | AU-QLD-RED   |   122 |   32 |   21.6216 | 539.8230 |
|    3 | AU-QLD-RED   |    53 |   20 |   13.3333 | 212.8514 |
|    1 | AU-QLD-SOM   |   759 |   45 |   36.2903 | 213.0227 |
|    2 | AU-QLD-SOM   |  1570 |   41 |   40.1961 | 128.2470 |
|    3 | AU-QLD-SOM   |   645 |   23 |   29.4872 | 169.8710 |
+------+--------------+-------+------+-----------+----------+
12 rows in set (0.001 sec)

The wrong answer from query B, with no intermediate table, is:
+------+--------------+-------+------+-----------+------------+
| mon  | IDCountyCode | birds | nRep | reportPct | frac_pt    |
+------+--------------+-------+------+-----------+------------+
|    1 | AU-QLD-GYM   |  1412 |  137 |  198.5507 | 11966.1017 |
|    2 | AU-QLD-GYM   |  1887 |   90 |  333.3333 | 12253.2468 |
|    3 | AU-QLD-GYM   |   973 |   63 |  190.9091 |  3320.8191 |
+------+--------------+-------+------+-----------+------------+
3 rows in set (0.024 sec)

The first query can be repeated by sourcing check-A-2way.sql, and the
second by check-B-2way.sql

To "fix" the problem, simply check the table, 
  CHECK TABLE sampling_subset;
which reports no errors, but the query B now produces the expected 12-row
table.

In the full dataset there is another symptom - reporting 12 months for
11 counties for one species takes about 50 seconds when query B gives
the wrong answer, and about 0.2 seconds after the table check fixes the
problem.

Furthermore, if I change the table to InnoDB before populating it, there is no
problem. Since the nature of the table is write-once and read many
times, I took myISAM to be the more suitable engine.



-- System Information:
Debian Release: 10.11
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-18-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8), LANGUAGE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug-deb-mariadb.tgz
Type: application/gzip
Size: 1210070 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-mysql-maint/attachments/20220219/d80e69de/attachment-0001.gz>


More information about the pkg-mysql-maint mailing list