[parted-devel] Possible Race Condition using test code, libparted, and Fedora 12

Curtis Gedak gedakc at gmail.com
Thu Jan 7 20:25:17 UTC 2010


Petr Uzel wrote:
> Hm, that doesn't look much better :)
>
> I'm experiencing similar (or maybe the same ?) problem on SUSE with
> parted-1.8.8. Sometimes, if I create a partition and immediately
> afterwards I try to remove the partition, the kernel doesn't get
> informed. The pseudoscript(tm) I'm using:
>
> ---
> #!/bin/bash
> parted -s /dev/sdX mkpart primary 0 10M
> parted -s /dev/sdX rm 1
> grep /dev/sdX1 /proc/partitions && report error
> ---
>
> If I run this in a cycle, after several iterations it fails,
> because sdX1 is deleted from on-disk table, but it is still
> present in /proc/partitions.
>
> It seems to be really sensitive to timing, because if I e.g.
> run the 'parted rm' via strace, the probability of failure
> decreases significantly.
>
> Idea: something must be touching /dev/sdX1 while parted is
> deleting it. So I've put 'lsof /dev/sdX1' between those two
> parted calls -> some hal related crap is touching /dev/sdX1.
>
> Now the interesting part: if I run the test with haldaemon
> disabled, I'm no longer able to reproduce it.
>
> I know this is far from precise analysis of a problem (and even
> further from a solution), but perhaps it might show something
> where to look.
>   

Following your hypothesis, Petr, I ran my test scripts on Fedora 12 with
the HAL daemon shut down.  Unfortunately, the "failure to inform kernel
of partition changes" problem still occurred.

My conclusion is that the HAL daemon on its own is not the cause of this
problem; something else is triggering it.
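
As a rough illustration of the kind of check involved (a sketch only, not
the actual test code), the helper below reports whether the kernel still
lists a partition.  Note that /proc/partitions lists bare names such as
sdX1 rather than /dev paths.  The function name partition_listed and its
optional second argument are hypothetical, chosen for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a partition name (e.g. sdX1) appears
# in the name column of a /proc/partitions-style table.
# The second argument defaults to /proc/partitions; it is a parameter
# only so the check can also be run against a saved copy of the table.
partition_listed() {
    local name="$1"
    local table="${2:-/proc/partitions}"
    # Line 1 is the header and line 2 is blank; column 4 is the name.
    awk -v n="$name" 'NR > 2 && $4 == n { found = 1 } END { exit !found }' "$table"
}

# Example use after a delete (sdX is a placeholder, as in the quoted script):
#   parted -s /dev/sdX rm 1
#   partition_listed sdX1 && echo "kernel still lists sdX1" >&2
```

Matching on the name column exactly avoids false hits such as sdX1
matching sdX10.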

The results for this test on Fedora 12 can be seen at the following link:
https://bugzilla.gnome.org/show_bug.cgi?id=604298#c17


Of note, I have also run my tests on the System Rescue CD v1.3.4, and
the problem does not occur there.  SysRescCD is based on Gentoo, so
perhaps we can find some key difference that will lead to a solution.
https://bugzilla.gnome.org/show_bug.cgi?id=604298#c16

It would help if we could narrow our search for differences down to a
small list of packages.  Given the intermittent nature of this bug, I
suspect the udev package, because it is responsible for detecting
hardware and creating the corresponding nodes in /dev.  From my reading,
udev works in concert with HAL, so HAL may still play a role in this
problem.  These suspicions are only a guess at the real cause.
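
Because the failure appears timing-sensitive, one way to tell a merely
delayed update from a permanently stale entry would be to poll the
partition table instead of checking it once.  The following is a minimal
sketch under that assumption; the function name wait_until_gone, the
default timeout, and the optional table argument are all hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: poll a /proc/partitions-style table until the
# named partition disappears or the timeout (in seconds) expires.
# Returns 0 if the entry went away, 1 if it was still listed.
wait_until_gone() {
    local name="$1"
    local timeout="${2:-5}"
    local table="${3:-/proc/partitions}"
    local waited=0
    # The awk check succeeds while the name is still in column 4.
    while awk -v n="$name" 'NR > 2 && $4 == n { f = 1 } END { exit !f }' "$table"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still listed after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use (sdX is a placeholder device name):
#   parted -s /dev/sdX rm 1
#   wait_until_gone sdX1 10 || echo "sdX1 still listed after 10s" >&2
```

If the entry clears after a short wait, that would point to a delay
rather than a missed notification; if it never clears, the kernel was
most likely not informed at all.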

Following is a link to some information on udev:
http://www.enterprisenetworkingplanet.com/nethub/article.php/3635686

Regards,
Curtis Gedak


