[parted-devel] Possible Race Condition using test code, libparted, and Fedora 12
Curtis Gedak
gedakc at gmail.com
Thu Jan 7 20:25:17 UTC 2010
Petr Uzel wrote:
> Hm, that doesn't look much better :)
>
> I'm experiencing similar (or maybe the same ?) problem on SUSE with
> parted-1.8.8. Sometimes, if I create a partition and immediately
> afterwards I try to remove the partition, the kernel doesn't get
> informed. The pseudoscript(tm) I'm using:
>
> ---
> #!/bin/bash
> parted -s /dev/sdX mkpart primary 0 10M
> parted -s /dev/sdX rm 1
> grep /dev/sdX1 /proc/partitions && report error
> ---
>
> If I run this in a cycle, after several iterations it fails,
> because sdX1 is deleted from on-disk table, but it is still
> present in /proc/partitions.
>
> It seems to be really sensitive to timing, because if I e.g.
> run the 'parted rm' via strace, the probability of failure
> decreases significantly.
>
> Idea: something must be touching /dev/sdX1 while parted is
> deleting it. So I've put 'lsof /dev/sdX1' between those two
> parted calls -> some hal related crap is touching /dev/sdX1.
>
> Now the interesting part: if I run the test with haldaemon
> disabled, I'm no longer able to reproduce it.
>
> I know this is far from precise analysis of a problem (and even
> further from a solution), but perhaps it might show something
> where to look.
>
Following your hypothesis, Petr, I ran my test scripts on Fedora 12 with the
HAL daemon shut down. Unfortunately, the "failure to inform the kernel of
partition changes" problem still occurred.
My conclusion is that the HAL daemon is not the cause of this problem, at
least not by itself. Something else must be triggering it.
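For anyone who wants to repeat this, here is a minimal sketch of the kind of
loop Petr describes. It is not my actual test code. The function names
(partition_listed, stress_disk) and the default of 100 iterations are made up
for illustration, and it assumes a disk named like /dev/sdX whose first
partition shows up as sdX1. Note that /proc/partitions lists bare kernel
names, so the check looks for sdX1 rather than /dev/sdX1. Warning: this
rewrites the partition table of the disk it is given.

```shell
#!/bin/bash
# Sketch of a repeat-until-failure loop. DESTRUCTIVE on the named disk.
# Function names and defaults are illustrative, not taken from any real test suite.

# Return 0 if partition NAME (e.g. "sdb1") is listed in the partitions file.
# /proc/partitions holds bare kernel names in column 4, without "/dev/".
partition_listed() {
    local name=$1 file=${2:-/proc/partitions}
    awk -v n="$name" '$4 == n { found = 1 } END { exit !found }' "$file"
}

# Create and remove partition 1 on DISK up to COUNT times.
# Returns 1 at the first stale kernel entry, 2 if parted itself fails.
stress_disk() {
    local disk=$1 count=${2:-100} i part
    # Assumes sdX-style naming, where partition 1 of sdX is sdX1.
    part="$(basename "$disk")1"
    for ((i = 1; i <= count; i++)); do
        parted -s "$disk" mkpart primary 0 10M || return 2
        parted -s "$disk" rm 1 || return 2
        if partition_listed "$part"; then
            echo "iteration $i: $part still in /proc/partitions" >&2
            return 1
        fi
    done
    echo "no stale entry after $count iterations"
}

# Run only when a disk is given explicitly, e.g.: ./stress.sh /dev/sdX 50
if [ $# -ge 1 ]; then
    stress_disk "$@"
fi
```

Stopping at the first stale entry leaves the system in the failed state, which
makes it easier to look at what is holding the partition at that moment.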
The results for this test on Fedora 12 can be seen at the following link:
https://bugzilla.gnome.org/show_bug.cgi?id=604298#c17
Notably, when I run the same tests on the System Rescue CD v1.3.4, the
problem does not occur. SysRescCD is based on Gentoo, so perhaps we can
find some key difference here that will lead to a solution.
https://bugzilla.gnome.org/show_bug.cgi?id=604298#c16
It would help if we could narrow our search for differences down to a
small list of packages. Given the intermittent nature of this bug, I
suspect the udev package, because it is responsible for detecting
hardware and creating the corresponding nodes in /dev. From my reading,
udev works in concert with HAL, so HAL may still play a role in this
problem. These suspicions are only a guess at the real cause of the
problem.
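One way to probe the udev guess would be to check whether a stale entry
clears by itself once udev has finished processing its event queue. The sketch
below is only a diagnostic idea, not a fix and not something I have run as
part of my tests. The function name wait_until_gone and the 10 second
timeouts are arbitrary choices, and it assumes "udevadm settle" is available
on the system under test.

```shell
#!/bin/bash
# Diagnostic sketch: does a stale partition entry clear once udev is idle?
# The function name and timeouts are illustrative assumptions.

# Poll FILE (default /proc/partitions) until NAME disappears or TIMEOUT
# seconds have passed. Returns 0 if the entry went away, 1 if it was still
# listed at the deadline.
wait_until_gone() {
    local name=$1 timeout=${2:-10} file=${3:-/proc/partitions} waited=0
    while awk -v n="$name" '$4 == n { f = 1 } END { exit !f }' "$file"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Run only when a partition name is given, e.g.: ./settle-check.sh sdX1
if [ $# -ge 1 ]; then
    # Let udev finish its queued events first, if udevadm is installed.
    command -v udevadm >/dev/null && udevadm settle --timeout=10
    if wait_until_gone "$1" 10; then
        echo "$1 is gone: the entry was only transiently stale"
    else
        echo "$1 is still listed after udev settled"
    fi
fi
```

If the entry disappears after a short wait, that would point at timing between
parted and udev event handling. If it stays, the kernel was never told about
the removal, and the cause is more likely elsewhere.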
Following is a link to some information on udev:
http://www.enterprisenetworkingplanet.com/nethub/article.php/3635686
Regards,
Curtis Gedak