[debian-mysql] Bug#544013: mysql-server-5.1: logrotate script cannot handle stopped mysqld

Wed Jul 15 23:08:24 BST 2020

Hi!

I'm experiencing the same issue as described in this bug with debian sid
and mysql-server-5.7 (5.7.26-1+b1). I noticed that one out of three times
my system boots, logrotate service fails with the following error:

```
Jul 09 07:12:01 baal systemd[1]: Starting Rotate log files...
Jul 09 07:12:01 baal logrotate[851]: error: error running shared postrotate
script for '/var/log/mysql.log /var/log/mysql/*log>
Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited,
code=exited, status=1/FAILURE
Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result
'exit-code'.
Jul 09 07:12:01 baal systemd[1]: Failed to start Rotate log files.
```

Looking at journalctl, I can see that every time this issue occurs, mysqld
is started and logrotate fails *at the same time*:

```
Jul 09 07:12:01 baal mysqld[847]: 2020-07-09T06:12:01.283924Z 0 [Note]
/usr/sbin/mysqld: ready for connections.
Jul 09 07:12:01 baal mysqld[847]: Version: '5.7.26-1+b1'  socket:
'/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)
...
Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited,
code=exited, status=1/FAILURE
Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result
'exit-code'.
```

This and the fact that the issue does not reproduce every time makes me
suspect that is a race-condition between logrotate and mysqld. Looking at
the code in /etc/logrotate.d/mysql-server it seems like logrotate first
checks if mysql is running before issuing the "mysqladmin flush-logs"
command. That check is done by issuing "mysqladmin ping" (this also checks
that /etc/mysql/debian.cnf is valid). If this check fails, then it will try
to verify if the mysql process exists. If it does, then it will fail with
an exit code = 1, which causes the error messages in my logs above.
I'm sure you already noticed the issue here: what if mysqld has not started
yet by the time "mysqladmin ping" runs, but it's up right by the time
killall is executed? Apparently, this happens quite often in my system.

To fix this, I suggest reverting the checks: first validate that mysqld is
running. If not, then there is no reason to issue "mysqladmin flush-logs"
and the script can finish successfully. If mysqld is running *and*
"mysqladmin ping" fails, then it should definitely throw an error.

This approach is not completely free of race-conditions either, but it
should be more rare. I'm attaching a patch with the solution I just
described. I've had this patch applied to my system for a week and the
issue has not been reproduced since. The *best* solution would be to simply
attempt to apply "flush-logs" and react differently depending on the exit
code. Unfortunately, it seems like mysqladmin does not return different
exit codes on different errors (in my tests it always returns 1), so this
approach is not possible at the moment.

Please, apply this patch and fix this issue!

Kind regards,
Francesc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/pkg-mysql-maint/attachments/20200715/c8a28b81/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logrotate-mysql-server.patch
Type: text/x-patch
Size: 703 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-mysql-maint/attachments/20200715/c8a28b81/attachment.bin>