[debian-mysql] Bug#544013: mysql-server-5.1: logrotate script cannot handle stopped mysqld

Sun Sep 6 11:57:16 BST 2020

Turns out the patch I submitted didn't fix the issue. Sorry about that.
Please ignore the patch in my previous email.

On Wed, Jul 15, 2020 at 11:08 PM Francesc Zacarias <fzacarias3k at gmail.com>
wrote:

> Hi!
>
> I'm experiencing the same issue as described in this bug with debian sid
> and mysql-server-5.7 (5.7.26-1+b1). I noticed that one out of three times
> my system boots, logrotate service fails with the following error:
>
> ```
> Jul 09 07:12:01 baal systemd[1]: Starting Rotate log files...
> Jul 09 07:12:01 baal logrotate[851]: error: error running shared
> postrotate script for '/var/log/mysql.log /var/log/mysql/*log>
> Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited,
> code=exited, status=1/FAILURE
> Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result
> 'exit-code'.
> Jul 09 07:12:01 baal systemd[1]: Failed to start Rotate log files.
> ```
>
> Looking at journalctl, I can see that every time this issue occurs, mysqld
> is started and logrotate fails *at the same time*:
>
> ```
> Jul 09 07:12:01 baal mysqld[847]: 2020-07-09T06:12:01.283924Z 0 [Note]
> /usr/sbin/mysqld: ready for connections.
> Jul 09 07:12:01 baal mysqld[847]: Version: '5.7.26-1+b1'  socket:
> '/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)
> ...
> Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited,
> code=exited, status=1/FAILURE
> Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result
> 'exit-code'.
> ```
>
> This and the fact that the issue does not reproduce every time makes me
> suspect that is a race-condition between logrotate and mysqld. Looking at
> the code in /etc/logrotate.d/mysql-server it seems like logrotate first
> checks if mysql is running before issuing the "mysqladmin flush-logs"
> command. That check is done by issuing "mysqladmin ping" (this also checks
> that /etc/mysql/debian.cnf is valid). If this check fails, then it will try
> to verify if the mysql process exists. If it does, then it will fail with
> an exit code = 1, which causes the error messages in my logs above.
> I'm sure you already noticed the issue here: what if mysqld has not
> started yet by the time "mysqladmin ping" runs, but it's up right by the
> time killall is executed? Apparently, this happens quite often in my system.
>
> To fix this, I suggest reverting the checks: first validate that mysqld is
> running. If not, then there is no reason to issue "mysqladmin flush-logs"
> and the script can finish successfully. If mysqld is running *and*
> "mysqladmin ping" fails, then it should definitely throw an error.
>
> This approach is not completely free of race-conditions either, but it
> should be more rare. I'm attaching a patch with the solution I just
> described. I've had this patch applied to my system for a week and the
> issue has not been reproduced since. The *best* solution would be to simply
> attempt to apply "flush-logs" and react differently depending on the exit
> code. Unfortunately, it seems like mysqladmin does not return different
> exit codes on different errors (in my tests it always returns 1), so this
> approach is not possible at the moment.
>
> Please, apply this patch and fix this issue!
>
> Kind regards,
> Francesc
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/pkg-mysql-maint/attachments/20200906/12465865/attachment.html>