[parted-devel] Possible Race Condition using test code, libparted, and Fedora 12

Petr Uzel petr.uzel at suse.cz
Wed Jan 20 08:28:07 UTC 2010


On Tue, Jan 19, 2010 at 01:26:32PM -0700, Curtis Gedak wrote:
> Petr Uzel wrote:
> >if this udev change is the cause of the resizing problem, then it is
> >different issue from the 'create partition-delete partition-
> >partition still in /proc/partitions' issue which I've described
> >earlier, because I can reproduce it on system with udev-128.
> 
> Hi Petr,

Hello,

> I think that it is still possible that both the resizing problem and
> the "create partition-delete partition-partition still in
> /proc/partitions" might be due to the same root problem.  The change
> to udev-138 might simply have increased the frequency of occurrence
> of the problem.

IMO the root problem in both cases is that while the kernel is being
instructed to modify its internal partition table, something else is
touching the device, and thus the ioctl fails. What might differ is
what this thing is and how it is triggered. In my case, I still
believe it's hal, as shown by lsof.

> With the resizing problem when the "failure to inform kernel of
> partition changes" problem occurs, the entry in /proc/partitions is
> not updated.  Hence it will contain the old size for the partition
> instead of the new size.  This is similar to the problem you found
> with /proc/partitions not being updated (entries not being created or
> deleted).

Exactly.

> If I understand udev properly, it is responsible for creating,
> deleting, and updating devices in the /dev directory

Yes.

> and hence
> /proc/partitions too.

AFAIK no - at least for regular disks, it is not udev but the kernel
itself that updates the entries in /proc/partitions (triggered by the
BLKPG* and BLKRRPART ioctls).
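
To make the /proc/partitions check concrete, here is a minimal sketch
(Python, not parted code) of how a test could read what the kernel
currently lists, e.g. to spot a stale entry after a delete or a
resize. The helper name parse_proc_partitions and the sample contents
are made up for this example; it assumes the usual four-column
"major minor #blocks name" layout, with sizes in 1 KiB blocks.

```python
def parse_proc_partitions(text):
    """Return {device_name: size_in_1KiB_blocks} from /proc/partitions text."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line, blank lines, and anything that is not
        # four columns with numeric major/minor/#blocks.
        if len(fields) != 4 or not all(f.isdigit() for f in fields[:3]):
            continue
        major, minor, blocks, name = fields
        entries[name] = int(blocks)
    return entries


# Hypothetical sample contents, not taken from a real system.
SAMPLE = """\
major minor  #blocks  name

   8        0  488386584 sda
   8        1     524288 sda1
   8        2  487861248 sda2
"""

if __name__ == "__main__":
    try:
        with open("/proc/partitions") as f:
            table = parse_proc_partitions(f.read())
    except OSError:
        # Fall back to the sample when /proc is not available.
        table = parse_proc_partitions(SAMPLE)
    for name, blocks in sorted(table.items()):
        print(name, blocks)
```

A test loop could compare the result before and after each partition
operation and flag an entry that should have disappeared or changed
size but did not.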

> Perhaps you could try your test with the potential patch I posted to
> see if it helps the situation?

I do my tests with parted-1.8.8 [*], which still uses the BLKPG* ioctls
to inform the kernel and thus takes a different code path from the one
your patch modifies.

Anyway, I'm thinking about a similar patch that would retry
BLKPG_DEL_PARTITION several times, possibly with a short sleep before
the last try (fixing a race condition by sleeping is not the most
robust solution, but I can't think of anything better right now).
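
The retry idea could look roughly like the sketch below. It is only an
illustration in Python, not the actual patch: retry_on_ebusy and its
parameters are hypothetical names, the operation passed in stands for
whatever issues the BLKPG_DEL_PARTITION ioctl, and it assumes the
failure shows up as EBUSY.

```python
import errno
import time


def retry_on_ebusy(operation, attempts=5, delay=0.2, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Only EBUSY is treated as transient (an assumption); any other
    OSError propagates immediately. The only sleep happens right
    before the final attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            if attempt == attempts - 1:
                sleep(delay)  # short pause before the last try
```

Passing the sleep function as a parameter keeps the helper testable
without real delays. The number of attempts and the delay are guesses
and would need tuning against the real failure rate.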

[*] I know this version is 'a bit' older, but it is my primary concern
now (distribution stuff :/ )

> My testing of the patch has been running for over 3 days and over
> 43,000 iterations.  So far no problems with "failure to inform kernel
> of partition changes" have occurred.  This is the largest number of
> iterations I have been able to run with my test prior to encountering
> a problem with the kernel re-reading the partition table.

I think your solution should work in most cases. However, the proper
fix would IMHO be to somehow gain exclusive access to the device to
prevent anything else from opening it. I don't know if it's doable,
though.
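
One partial option, sketched here in Python as an assumption rather
than a tested fix: on Linux, opening a block device with O_EXCL (and
without O_CREAT) fails with EBUSY if the device is already in use by
the system, e.g. mounted. As far as I know this does not stop a
process that opens the device without O_EXCL, so it would not by
itself keep something like hal away. The name open_exclusive and the
opener parameter are hypothetical.

```python
import errno
import os


def open_exclusive(path, opener=os.open):
    """Try to open a block device with O_EXCL.

    Returns the file descriptor, or None if the kernel reports EBUSY.
    Any other error propagates to the caller.
    """
    try:
        return opener(path, os.O_RDWR | os.O_EXCL)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return None
        raise
```

The caller would be responsible for closing the returned descriptor
with os.close() once the partition table update is done.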

> I expect it will take at least another 3 or 4 days for the test to
> reach 99,999 iterations.  I will report back to this mailing list
> with the results of this test.

Regards,
Petr

--
Petr Uzel, openSUSE Boosters Team