[parted-devel] [PATCH] arch: Fix race between systemd and parted command
Brian C. Lane
bcl at redhat.com
Thu Dec 19 18:01:13 GMT 2024
On Tue, Dec 17, 2024 at 08:43:46PM +0000, Gulam Mohamed wrote:
> Here are the answers to your questions:
>
> - how many and how much delay? I think there's going to be situations
> where things may still fail. But you also don't want really long
> delays if there really is a missing device.
>
> [GULAM]: Yes, keeping in mind the fact that if the device is really
> missing, I kept the delay of around 10ms with around 100 iterations.
> Can you please suggest?
1s of total delay is the most I'd suggest (which is also what we do for
the busy loop when deleting partitions). I still think it's up to the
caller to make sure the nodes are stable before re-running parted. It's
not really a good idea to call it in a fast loop, and if you look at the
parted tests in ./tests/ you can see that we've solved this by waiting
for the device nodes to reappear; see wait_for_dev_to_appear_ and
wait_for_dev_to_disappear_ there.
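
For a caller that is not a shell script, the same idea can be sketched in C.
This is only an illustration and not code from parted or its test suite: the
names wait_for_node, WAIT_RETRIES and WAIT_SLEEP_MS are hypothetical, and the
100 x 10ms budget is just one way to stay within a 1s total delay.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical budget: 100 polls x 10 ms = 1 s total worst case. */
#define WAIT_RETRIES   100
#define WAIT_SLEEP_MS  10

/* Sleep for roughly `ms` milliseconds (a signal may cut it short). */
static void sleep_ms(unsigned ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L
    };
    nanosleep(&ts, NULL);
}

/* Poll until `path` exists (want_present != 0) or is gone
 * (want_present == 0). Checks at most retries + 1 times.
 * Returns 0 once the wanted state is seen, -1 on timeout. */
static int wait_for_node(const char *path, int want_present,
                         unsigned retries, unsigned delay_ms)
{
    assert(path != NULL);
    for (unsigned i = 0; ; i++) {
        int present = (access(path, F_OK) == 0);
        if (present == (want_present != 0))
            return 0;
        if (i >= retries)
            return -1;
        sleep_ms(delay_ms);
    }
}
```

A caller would run something like wait_for_node(node, 1, WAIT_RETRIES,
WAIT_SLEEP_MS) on the partition's device node after parted exits, and only
re-run parted once it returns 0.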
> - It should be implemented for everything that udev touches. Right now
> this patch only changes one function.
>
> [GULAM]: I am not much aware about the code of parted command. Can you
> please explore more here so that I can get idea about the parts which
> udev touches?
Sorry, I actually meant all the /sys/block/ paths; it looks like we have 3
low-level functions opening those. I'd suggest a common open_sys_block
function that opens a path with a timeout. I'd probably combine the
timeout values for this and the ones for _disk_sync_part_table into
defines at the top of linux.c.
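
A rough sketch of what such a helper might look like, to make the suggestion
concrete. Nothing here is existing parted code: the signature, the
SYS_BLOCK_OPEN_* names and their values are assumptions, and the real
functions in linux.c may use different I/O calls than open().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/* Hypothetical defines; 100 x 10 ms keeps the worst case at 1 s. */
#define SYS_BLOCK_OPEN_RETRIES   100
#define SYS_BLOCK_OPEN_SLEEP_MS  10

/* Open `path` read-only, retrying only while it does not exist
 * (ENOENT), since that is the case where udev may still be
 * recreating the node. Any other error is returned at once.
 * Returns a file descriptor, or -1 with errno set by open(). */
static int open_sys_block(const char *path, unsigned retries,
                          unsigned sleep_ms)
{
    assert(path != NULL);
    struct timespec delay = {
        .tv_sec = sleep_ms / 1000,
        .tv_nsec = (long)(sleep_ms % 1000) * 1000000L
    };
    for (unsigned attempt = 0; ; attempt++) {
        int fd = open(path, O_RDONLY | O_CLOEXEC);
        if (fd >= 0 || errno != ENOENT || attempt >= retries)
            return fd;
        nanosleep(&delay, NULL);
    }
}
```

Each of the low-level functions would then call open_sys_block(path,
SYS_BLOCK_OPEN_RETRIES, SYS_BLOCK_OPEN_SLEEP_MS) instead of opening the
path directly, so a genuinely missing device still fails after about 1s.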
> - Can we reliably test for this so we can add a test? In my
> experience, I'd say maybe :)
>
> [GULAM]: I tested this with small shell script. Yes, we can add a test
> for this. Can you please let me know if there is any test framework
> where I can add the test?
Take a look at the tests in ./tests/. If you could start by adding tests
there that fail, that would be a good starting point. Then others could
run them on the various supported systems (Fedora, RHEL, Ubuntu, etc.) and
make sure they actually fail as expected before we add more code.
I'm on vacation for the next 2 weeks, so it may take me a while to
respond to any replies. I'd also like to get more feedback about this from
other parted users, since I'm still pretty reluctant to add retry loops.
Brian
--
Brian C. Lane (PST8PDT) - weldr.io - lorax - parted - pykickstart