[parted-devel] Possible Race Condition using test code, libparted, and Fedora 12

Jim Meyering jim at meyering.net
Fri Jan 29 08:43:58 UTC 2010


Curtis Gedak wrote:

> Hi Petr,
>
> Petr Uzel wrote:
>> On Wed, Jan 27, 2010 at 01:51:11PM -0700, Curtis Gedak wrote:
>>
>>> diff --git a/libparted/arch/linux.c b/libparted/arch/linux.c
>>> index aefe788..1147b1e 100644
>>> --- a/libparted/arch/linux.c
>>> +++ b/libparted/arch/linux.c
>>> @@ -2518,12 +2518,14 @@ static int
>>> _kernel_reread_part_table (PedDevice* dev)
>>> {
>>>         LinuxSpecific*  arch_specific = LINUX_SPECIFIC (dev);
>>> -        int             retry_count = 5;
>>> +        int             retry_count = 9;
>>>
>>>         sync();
>>>         while (ioctl (arch_specific->fd, BLKRRPART)) {
>>>                 retry_count--;
>>>                 sync();
>>> +                if (retry_count == 3)
>>>
>>
>>                    ^^^^^^^^^^^^^^^^^^^^^
>> Don't you mean something like 'if (retry_count % 3 == 0)' instead?
>> That would pause on the 3rd, 6th and 9th (last) iteration.
>>
>
> My original patch did try to do something like you suggest.  After
> analyzing the patch, I realized that it only paused once in all of the
> 10 iterations.  This single pause was sufficient to catch all of the
> "failed to inform kernel of partition changes" problems that occurred
> during my testing.
>
> Hence this time I did intend that the code only pause once for 1
> second.  :-)
>
> My thoughts are that 1 second is a very long time for a computer.  If

If one second is always enough for you, then
1/10s or 1/2s may well be enough in many cases.
How about sleeping something like 200ms multiple times before giving up?
You can use gnulib's usleep function to do that portably.

We'd rather wait a little longer so as not to let a load- or
scheduling-induced delay cause parted to misbehave.

If you're interested, you'd add usleep to the list of modules
in bootstrap.conf and then rerun ./bootstrap && ./configure, etc.

> more than an single 1 second pause is used, then in my opinion this
> would unnecessarily increase the amount of time required until the
> call would finally fail.  For example this function call would still
> fail when used on a device with at least one partition mounted.  I did
> not want this to creep up to 2 or 3 seconds without significant
> benefits to the user.
>
> In many ways this patch is a balancing act between having the system
> do the right thing, and not making a user wait unnecessarily.



More information about the parted-devel mailing list