Bug#777735: systemd: Disk corruption on reboot initiated through systemd

Peter Colberg peter at colberg.org
Thu Feb 12 05:22:02 GMT 2015


On Thu, Feb 12, 2015 at 04:28:38AM +0100, Michael Biebl wrote:
> Just to be clear here: Is this only happening once after the
> dist-upgrade from wheezy (sysvinit) to jessie (systemd-sysv) on reboot
> or everytime you reboot your jessie system?

It happens everytime I reboot from within the VM.

> What makes you think this is a disk corruption? The log message you
> provided shows that the disks are clean?

Yes, this is not disk corruption. The fsck at bootup is clean.

This is more acurately described as data loss. I see it in the form of
log data produced shortly before the reboot (that is after reboot is
issued and before the machine is reset) not being written to disk.
Instead, the log files contain variably-sized holes of nil bytes.

> Which other file? What file system is this?

So far I noticed corruption in /var/log/syslog and /var/log/daemon.log
which are written by rsyslog, and in /var/log/prosody/prosody.log
which is written by prosody.

Each VM has attached a single virtio disk with a GPT partition label
that contains a swap partition, an ext4 filesystem for /, and further
ext4 filesystems for /home and /srv.

> Can you share such a (gzipped) qemu-kvm image or provide instructions,
> how we can reproduce the problem?

Sorry, I cannot share the machine images since they contain private
data. However, I have now disabled any non-essential services on one
machine. Please see the file 'ps_aux' that shows all processes running
when logged in via ssh. I can still reproduce the corruption with this
mimimal configuration.

> Since systemd-logind reacts on external ACPI power button presses with
> the same command (systemctl poweroff), I'm suprised that this is
> supposed to make a difference.

I have narrowed down the issue significantly: corruption only occurs if
qemu-kvm emulates a watchdog device. I am using libvirt, please see
the file 'himeji.xml' for the machine setup.

When I remove the watchdog device from the VM, the corruption
disappears and /var/log/syslog contains the expected last and
first messages before and after the reboot:

Feb 11 23:41:19 himeji rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="532" x-info="http://www.rsyslog.com"] exiting on signal 15.
Feb 11 23:41:57 himeji rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="518" x-info="http://www.rsyslog.com"] start

> Can you follow the steps at [1] and provide a verbose debug log from
> boot and poweroff by logging to the serial console.
> 
> 
> [1] http://freedesktop.org/wiki/Software/systemd/Debugging/

I captured a reboot of the VM with and without a watchdog device. Please note
that reboots in a VM with watchdog device were working fine with sysvinit.

Does systemd always use /dev/watchdog? /etc/systemd/system.conf was left
unmodified, in particular it does not define RuntimeWatchdogSec.

Regards,
Peter
-------------- next part --------------
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.6  0.1  28460  4548 ?        Ss   23:45   0:01 /sbin/init
root         2  0.0  0.0      0     0 ?        S    23:45   0:00 [kthreadd]
root         3  0.3  0.0      0     0 ?        S    23:45   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/0:0]
root         5  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/0:0H]
root         6  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/u8:0]
root         7  0.3  0.0      0     0 ?        S    23:45   0:00 [rcu_sched]
root         8  0.0  0.0      0     0 ?        S    23:45   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    23:45   0:00 [migration/0]
root        10  0.0  0.0      0     0 ?        S    23:45   0:00 [watchdog/0]
root        11  0.0  0.0      0     0 ?        S    23:45   0:00 [watchdog/1]
root        12  0.0  0.0      0     0 ?        S    23:45   0:00 [migration/1]
root        13  0.4  0.0      0     0 ?        S    23:45   0:00 [ksoftirqd/1]
root        14  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/1:0]
root        15  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/1:0H]
root        16  0.0  0.0      0     0 ?        S    23:45   0:00 [watchdog/2]
root        17  0.0  0.0      0     0 ?        S    23:45   0:00 [migration/2]
root        18  0.3  0.0      0     0 ?        S    23:45   0:00 [ksoftirqd/2]
root        19  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/2:0]
root        20  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/2:0H]
root        21  0.0  0.0      0     0 ?        S    23:45   0:00 [watchdog/3]
root        22  0.0  0.0      0     0 ?        S    23:45   0:00 [migration/3]
root        23  0.3  0.0      0     0 ?        S    23:45   0:00 [ksoftirqd/3]
root        24  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/3:0]
root        25  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/3:0H]
root        26  0.0  0.0      0     0 ?        S<   23:45   0:00 [khelper]
root        27  0.0  0.0      0     0 ?        S    23:45   0:00 [kdevtmpfs]
root        28  0.0  0.0      0     0 ?        S<   23:45   0:00 [netns]
root        29  0.0  0.0      0     0 ?        S    23:45   0:00 [khungtaskd]
root        30  0.0  0.0      0     0 ?        S<   23:45   0:00 [writeback]
root        31  0.0  0.0      0     0 ?        SN   23:45   0:00 [ksmd]
root        32  0.0  0.0      0     0 ?        SN   23:45   0:00 [khugepaged]
root        33  0.0  0.0      0     0 ?        S<   23:45   0:00 [crypto]
root        34  0.0  0.0      0     0 ?        S<   23:45   0:00 [kintegrityd]
root        35  0.0  0.0      0     0 ?        S<   23:45   0:00 [bioset]
root        36  0.0  0.0      0     0 ?        S<   23:45   0:00 [kblockd]
root        37  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/0:1]
root        38  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/1:1]
root        39  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/2:1]
root        40  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/3:1]
root        41  0.0  0.0      0     0 ?        S    23:45   0:00 [kswapd0]
root        42  0.0  0.0      0     0 ?        S    23:45   0:00 [fsnotify_mark]
root        48  0.0  0.0      0     0 ?        S<   23:45   0:00 [kthrotld]
root        49  0.0  0.0      0     0 ?        S<   23:45   0:00 [ipv6_addrconf]
root        50  0.0  0.0      0     0 ?        S<   23:45   0:00 [deferwq]
root        51  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/u8:1]
root        90  0.0  0.0      0     0 ?        S<   23:45   0:00 [ata_sff]
root        92  0.0  0.0      0     0 ?        S    23:45   0:00 [scsi_eh_0]
root        93  0.0  0.0      0     0 ?        S<   23:45   0:00 [scsi_tmf_0]
root        94  0.0  0.0      0     0 ?        S    23:45   0:00 [scsi_eh_1]
root        95  0.0  0.0      0     0 ?        S<   23:45   0:00 [scsi_tmf_1]
root        96  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/u8:2]
root        97  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/u8:3]
root       109  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/3:1H]
root       110  0.0  0.0      0     0 ?        S    23:45   0:00 [jbd2/vda3-8]
root       111  0.0  0.0      0     0 ?        S<   23:45   0:00 [ext4-rsv-conver]
root       139  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/0:2]
root       147  0.0  0.0  32956  2980 ?        Ss   23:45   0:00 /lib/systemd/systemd-journald
root       148  0.0  0.0      0     0 ?        S    23:45   0:00 [kauditd]
root       160  0.0  0.0  40984  3444 ?        Ss   23:45   0:00 /lib/systemd/systemd-udevd
root       196  0.0  0.0      0     0 ?        S<   23:45   0:00 [kpsmoused]
root       232  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/2:2]
root       240  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/0:1H]
root       241  0.0  0.0      0     0 ?        S<   23:45   0:00 [kworker/2:1H]
root       247  0.0  0.0      0     0 ?        S    23:45   0:00 [jbd2/vda4-8]
root       248  0.0  0.0      0     0 ?        S<   23:45   0:00 [ext4-rsv-conver]
root       250  0.0  0.0      0     0 ?        S    23:45   0:00 [jbd2/vda5-8]
root       251  0.0  0.0      0     0 ?        S<   23:45   0:00 [ext4-rsv-conver]
root       269  0.0  0.0      0     0 ?        S    23:45   0:00 [kworker/3:2]
root       530  0.0  0.1  55160  5144 ?        Ss   23:45   0:00 /usr/sbin/sshd -D
root       532  0.0  0.0  28260  2920 ?        Ss   23:45   0:00 /lib/systemd/systemd-logind
message+   533  0.0  0.0  42112  3348 ?        Ss   23:45   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       535  0.0  0.0 262876  3600 ?        Ssl  23:45   0:00 /usr/sbin/rsyslogd -n
root       538  0.0  0.0  12656  1824 ttyS0    Ss+  23:45   0:00 /sbin/agetty --keep-baud 115200 38400 9600 ttyS0 vt102
root       545  0.0  0.1  91156  6156 ?        Ss   23:47   0:00 sshd: root at pts/0    
root       547  0.0  0.0  27220  3488 ?        Ss   23:47   0:00 /lib/systemd/systemd --user
root       548  0.0  0.0  49740  1500 ?        S    23:47   0:00 (sd-pam)  
root       550  0.0  0.1  23716  5632 pts/0    Ss   23:47   0:00 -bash
root       566  0.0  0.0  17812  2320 pts/0    R+   23:48   0:00 ps aux
-------------- next part --------------
A non-text attachment was scrubbed...
Name: himeji.xml
Type: application/xml
Size: 1985 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20150212/14712ea6/attachment-0002.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: himeji_reboot_with_watchdog.gz
Type: application/gzip
Size: 20959 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20150212/14712ea6/attachment.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: himeji_reboot_without_watchdog.gz
Type: application/gzip
Size: 20924 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20150212/14712ea6/attachment-0001.gz>


More information about the Pkg-systemd-maintainers mailing list