Bug#500157: grub-common: no such disk after un- and replugging disk belonging to RAID1 array

Vladimir 'φ-coder/phcoder' Serbinenko phcoder at gmail.com
Sat Apr 2 15:29:41 UTC 2011


>> In the meantime many RAID fixes went in. Could you please retest?
>>     
> To be honest I have not dared to replug the other disk and so only one
> disk has been in use for over two years. I am quite reluctant to
> reattach the other disk, fearing again some errors with GRUB or
> problems with reassembling the RAID.
>
>   
Yes, I understand. GRUB is the least of your concerns: it never writes
to mdraid disks and so can't corrupt them. However, I can't say the same
of the running system, so I wouldn't try reassembling such a RAID either.
RAID reassembly can get arbitrarily confused when the disks are this far
out of sync. RAID implementations are usually designed to work either
with all disks in sync or with some devices missing. Operating outside
these limits can be very dangerous for data. If you don't need the data
stored on that disk, I'd reformat it and add it back to the RAID,
copying everything from the working disk.
That said, I'm not an mdraid expert, so I can't tell which safe
solutions are available.
>> Also note that if two RAID members which are supposed to be exact copies
>> happen to contain different versions of GRUB, it may lead to version
>> mismatch between core and modules.
>>     
> Could you please point me to a write-up of how this is supposed to work
> anyhow. On the running disk I now have the current GRUB version
> 1.99~rc1-8, and the unplugged disk has the GRUB version from
> two and a half years ago. When both are plugged in, how does the system
> decide which one to use?
>
>   
Your report indicates that two arrays were detected (probably the desync
was too severe):

"ls does show (md0) (md0)"
But both arrays claim to have index '0', which confused both GRUB's
naming and its caching. I've fixed the bug by adding a check that
allocates another number whenever there would otherwise be a collision.
In the case of RAID1, GRUB just uses one of the RAID members, so even if
the data isn't the latest, it is at least internally consistent.
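
To illustrate the collision check mentioned above, here is a minimal
sketch in Python. It is not the actual GRUB code (GRUB is written in C);
the function names `allocate_array_number` and `name_arrays` and the
"lowest unused number" policy are assumptions made purely for
illustration.

```python
def allocate_array_number(requested, numbers_in_use):
    """Return `requested` if it is free, else the lowest unused number.

    Hypothetical helper: the fallback policy is an assumption, not
    necessarily what GRUB does.
    """
    if requested not in numbers_in_use:
        return requested
    candidate = 0
    while candidate in numbers_in_use:
        candidate += 1
    return candidate


def name_arrays(claimed_numbers):
    """Assign a unique 'mdN' name to each detected array, in detection order."""
    in_use = set()
    names = []
    for requested in claimed_numbers:
        number = allocate_array_number(requested, in_use)
        in_use.add(number)
        names.append(f"md{number}")
    return names


# Two arrays that both claim index 0, as in the report.
print(name_arrays([0, 0]))  # prints ['md0', 'md1']
```

With such a check the second array no longer shadows the first, so a
listing would show two distinct names instead of "(md0) (md0)".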


However, the inability to recover from a desync is the plague of RAID.
I don't know whether mdraid has managed to solve it with some kind of
journaling, or how well, so I wouldn't risk my data on this.

If you decide that you don't want to retest it, we could simply close
this bug report.
> PS: You can keep the threading by importing the mbox of the BTS messages
> using `bts show --mbox 500157`. `bts` is in the Debian package devscripts.
>   
Thanks, but it doesn't matter much now since I'm already subscribed to
pkg-grub-devel.

-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko



More information about the Pkg-grub-devel mailing list