[Parted-maintainers] Bug#923561: parted: Incorrect optimal alignment for USB device

Phillip Susi phill at thesusis.net
Wed Mar 6 14:47:57 GMT 2019


On 3/5/2019 5:49 PM, Kevin Locke wrote:
>> md does it using the stripe size.  Not sure if anything other than md or
>> dm would make sense to populate the value.  Well, I guess hardware raid
>> drivers.
> 
> Sounds reasonable to me.  Feel free to propose it to the kernel
> maintainers.

I'd have to check to be sure, but I think hardware raid drivers already
do this.

> Documentation/ABI/testing/sysfs-block does not say "normal disks
> generally leave it 0", it says "If no optimal I/O size is reported
> this file contains 0."  SCSI disks report an optimal I/O size via VPD.

My copy says "This is rarely reported for disk drives." then mentions
raid.  The full description I have is:

                Storage devices may report an optimal I/O size, which is
                the device's preferred unit for sustained I/O.  This is
                rarely reported for disk drives.  For RAID arrays it is
                usually the stripe width or the internal track size.  A
                properly aligned multiple of optimal_io_size is the
                preferred request size for workloads where sustained
                throughput is desired.  If no optimal I/O size is
                reported this file contains 0.
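
For anyone who wants to see what their own device reports, here is a
minimal sketch that reads the value from sysfs.  The helper name
read_queue_value and the device name "sda" are placeholders I made up
for illustration; only the queue/optimal_io_size,
queue/physical_block_size and queue/logical_block_size files are real.

```python
import os


def read_queue_value(device, name, sysfs_root="/sys/block"):
    """Return the integer stored in <sysfs_root>/<device>/queue/<name>.

    Returns None if the file is missing or does not hold an integer.
    The helper name and the sysfs_root parameter are illustrative only.
    """
    path = os.path.join(sysfs_root, device, "queue", name)
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # "sda" is an assumed device name; substitute your own.
    for name in ("optimal_io_size", "physical_block_size", "logical_block_size"):
        print(name, read_queue_value("sda", name))
```

A result of 0 for optimal_io_size means the device reported nothing,
as the documentation above says.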

> I still think the documentation here is correct.  If you disagree,
> feel free to report it to the kernel maintainers.

What defines "correct"?  I don't think the question is whether it is
correct, but whether it is clear and unambiguous, and whether the
various components have understood it the same way.  I think the text
is ambiguous: it specifically mentions that raid uses it to represent
the stripe size, which hints that the intended use is optimal
alignment, even though the prescriptive text makes it sound as if it
has nothing to do with alignment.  After all, as a general rule, the
larger the I/O you use, the better the performance, so if it had
nothing to do with alignment, everyone should just set it to a silly
large value like your USB stick does, and then it would be more or less
useless, wouldn't it?

> Are there cases where the optimal partition alignment is not a
> multiple of the physical sector size?  If so, let's consider whether
> they can be worked into the sanity checking logic.  If not, are there
> other risks that you foresee which are not shared by util-linux and
> cryptsetup, which have been using such a sanity check for years?

No, there aren't, and I agree that this sanity check isn't likely to
hurt anything and will fix your USB stick issue.  What concerns me is
that drives which, unlike yours, are not 512e will report an absurdly
large optimal_io_size that *is* sector aligned, and this sanity check
won't help with those.
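
To make the concern concrete, here is a rough sketch of the kind of
check being discussed.  It is not the code from parted, util-linux or
cryptsetup; the function name and the sample numbers are made up for
illustration.

```python
def usable_optimal_io_size(optimal_io_size, physical_block_size):
    """Return optimal_io_size if it passes the multiple-of-sector check.

    Returns None when the value is unreported (0) or is not a multiple
    of the physical block size.  Hypothetical helper, not parted's code.
    """
    if optimal_io_size <= 0 or physical_block_size <= 0:
        return None
    if optimal_io_size % physical_block_size != 0:
        return None
    return optimal_io_size


if __name__ == "__main__":
    # Illustrative values only: 65535 * 512 bytes on a 4096-byte
    # physical sector is rejected ...
    print(usable_optimal_io_size(65535 * 512, 4096))
    # ... but an equally large value that happens to be sector aligned
    # (32 MiB here) passes the check untouched.
    print(usable_optimal_io_size(32 * 1024 * 1024, 4096))
```

The second call is the case I am worried about: the check accepts the
value even though it is no more believable than the first.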

> Also, if "your USB stick" was intended to suggest that this is not a
> common problem, I would disagree.  I suspect it occurs on most/all
> Seagate UAS drives (which share some other known problems[1]).

I simply meant that yours seems to be 512e and that is the only reason
it is even possible to have an optimal_io_size that is not a multiple of
the physical block size.

I think it may be worth starting another discussion upstream about this
to clarify and perhaps have the documentation improved, but I suspect
that in the end, this value is just going to remain a clusterfudge and
maybe parted should only rely on it if the disk in question is an md
device.  Then again, that might leave out hardware raids if they use the
stripe size too.  Maybe a size sanity check?  Perhaps anything above,
say, 8M or 16M should be ignored?  It's a shame that this value is
unreliable and we have to resort to second-guessing it.
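
Something along these lines is what I have in mind for the size check.
This is only a sketch: the function name is invented, and the 16M
default is just the upper of the two numbers floated above, not a
value anyone has agreed on.

```python
MIB = 1024 * 1024


def plausible_optimal_io_size(optimal_io_size, max_bytes=16 * MIB):
    """Return optimal_io_size unless it is unreported or implausibly large.

    Values strictly above max_bytes are ignored (None).  Both the helper
    name and the 16 MiB default are assumptions for illustration.
    """
    if optimal_io_size <= 0:
        return None
    if optimal_io_size > max_bytes:
        return None
    return optimal_io_size


if __name__ == "__main__":
    # Illustrative values: a 512 KiB stripe is kept, a 32 MiB value is not.
    print(plausible_optimal_io_size(512 * 1024))
    print(plausible_optimal_io_size(32 * MIB))
    # With the stricter 8 MiB cap, a 16 MiB value is ignored as well.
    print(plausible_optimal_io_size(16 * MIB, max_bytes=8 * MIB))
```

A caller would fall back to whatever default alignment it already uses
when this returns None.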


