<div dir="ltr">Turns out the patch I submitted didn't fix the issue. Sorry about that. Please ignore the patch in my previous email.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 15, 2020 at 11:08 PM Francesc Zacarias <<a href="mailto:fzacarias3k@gmail.com">fzacarias3k@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi!<br><br>I'm experiencing the same issue as described in this bug with debian sid and mysql-server-5.7 (5.7.26-1+b1). I noticed that one out of three times my system boots, logrotate service fails with the following error:<br><br>```<br>Jul 09 07:12:01 baal systemd[1]: Starting Rotate log files...<br>Jul 09 07:12:01 baal logrotate[851]: error: error running shared postrotate script for '/var/log/mysql.log /var/log/mysql/*log><br>Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE<br>Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result 'exit-code'.<br>Jul 09 07:12:01 baal systemd[1]: Failed to start Rotate log files.<br>```<br><br>Looking at journalctl, I can see that every time this issue occurs, mysqld is started and logrotate fails *at the same time*:<br><br>```<br>Jul 09 07:12:01 baal mysqld[847]: 2020-07-09T06:12:01.283924Z 0 [Note] /usr/sbin/mysqld: ready for connections.<br>Jul 09 07:12:01 baal mysqld[847]: Version: '5.7.26-1+b1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Debian)<br>...<br>Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE<br>Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result 'exit-code'.<br>```<br><br>This and the fact that the issue does not reproduce every time makes me suspect that is a race-condition between logrotate and mysqld. Looking at the code in /etc/logrotate.d/mysql-server it seems like logrotate first checks if mysql is running before issuing the "mysqladmin flush-logs" command. That check is done by issuing "mysqladmin ping" (this also checks that /etc/mysql/debian.cnf is valid). If this check fails, then it will try to verify if the mysql process exists. If it does, then it will fail with an exit code = 1, which causes the error messages in my logs above.<br><div>I'm sure you already noticed the issue here: what if mysqld has not started yet by the time "mysqladmin ping" runs, but it's up right by the time killall is executed? Apparently, this happens quite often in my system.<br><br>To fix this, I suggest reverting the checks: first validate that mysqld is running. If not, then there is no reason to issue "mysqladmin flush-logs" and the script can finish successfully. If mysqld is running *and* "mysqladmin ping" fails, then it should definitely throw an error.</div><div><br></div><div>This approach is not completely free of race-conditions either, but it should be more rare. I'm attaching a patch with the solution I just described. I've had this patch applied to my system for a week and the issue has not been reproduced since. The *best* solution would be to simply attempt to apply "flush-logs" and react differently depending on the exit code. Unfortunately, it seems like mysqladmin does not return different exit codes on different errors (in my tests it always returns 1), so this approach is not possible at the moment.<br><br>Please, apply this patch and fix this issue!<br><br>Kind regards,</div><div>Francesc<br><br><br></div></div>
</blockquote></div>