[parted-devel] [rfc] SSD partition alignment

Daniel J Blueman daniel.blueman at gmail.com
Sun Feb 22 13:27:40 UTC 2009


On Sun, Feb 22, 2009 at 11:39 AM, Jim Meyering <jim at meyering.net> wrote:
> Colin Watson wrote:
>> On Sun, Feb 22, 2009 at 01:40:16AM +0100, Jim Meyering wrote:
>>> Daniel J Blueman wrote:
>>> > I've checked into this, and since libparted sees the SATA block device
>>> > as SCSI, it doesn't perform the expected ATA 'identify' command to
>>> > fill out the 512 bytes of device info, of which (short) word 217 is
>>> > device RPM, defined to be 1 on newer compliant SSDs. The kernel uses
>>> > this word to detect if a device is an SSD or not, so I suggest we use
>>> > the same.
>>> >
>>> > Anyone think of objections to calling the ATA identify ioctl to fill
> >>> > out the structure, then storing this flag for later use in constraint
>>> > checking? If the SCSI device supports it also, fine, else nothing
>>> > lost.
>>> >
> >>> > For now, a 1MB starting offset for an SSD seems safest, and is what MS
> >>> > Windows 7 and Server 2008 use, so a number of vendors will be
> >>> > testing/optimising with this case too.
>>>
>>> Does this really need to be SSD-specific?
>>>
>>> I hear that this (alignment) is high priority also for many
>>> of the big new disks, since they have 4k-byte sectors.
>>> Without better alignment, their performance will suffer, too.
>>
>> Well, one step at a time. We can detect SSD; can we detect those big new
>> disks (or, in general, the desired sector size)?
>
> Alignment-related changes that are useful for SSDs will also benefit
> other hardware, so we'd be remiss not to consider that up-front.
>
> I think we agree:
> Writing a "device-is-an-SSD" function that queries the kernel, and
> then having some caller use that to look up reasonable-for-SSD
> alignment parameters would be useful.  Just make it general
> enough to also work with e.g., a "device-has-4kb-sector" function.
>
> However, I'm beginning to wonder if that'd be more appropriate
> at a higher level than parted, like in gparted.  More below.

For the next few years, kernel detection will not work for users of
current SSDs, RAID arrays and the like. Kernel developers also stopped
marking Compact Flash cards as non-rotational, since a number of
microdrives are CFA devices. I've read that the SSDs in most netbooks
don't report a rotation rate (word 217) of 1 either, and those are of
course the most performance-sensitive (being slower), so we need to
consider this.
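
For reference, here is a minimal sketch of how a script could ask the
kernel for its verdict. It assumes a kernel recent enough to expose the
queue/rotational attribute in sysfs. The function name is_nonrotational
and its optional sysfs-root argument are made up for illustration, and
the answer is only as good as what the device itself reports:

```shell
# Hypothetical helper: succeed (exit 0) if the kernel flags the device as
# non-rotational. $1 is a block device name such as "sda"; $2 is an optional
# sysfs root (defaults to /sys), which makes it easy to try the function
# against a scratch directory.
is_nonrotational() {
    attr="${2:-/sys}/block/$1/queue/rotational"
    # A missing attribute means we cannot tell; report that with status 2.
    [ -r "$attr" ] || return 2
    # The kernel writes 0 for non-rotational devices and 1 otherwise.
    [ "$(cat "$attr")" = "0" ]
}
```

Usage would be along the lines of
`is_nonrotational sda && echo "sda is flagged as non-rotational"`.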

>> Or are you saying that we should increase the alignment to 4KB in
>> general?

4KB, 128KB or 1MB; I think 128KB is enough for erase-block alignment,
and I believe it will also help most RAID arrays. I've found that
misaligned striping on e.g. RAID-5 arrays can hurt with certain
workloads.
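
To make the candidate alignments concrete, here is a rough sketch of the
rounding involved. The helper name align_up is hypothetical and this is
not how libparted computes constraints; it only shows where a
traditional 63-sector start lands once it is rounded up to a given
boundary, assuming 512-byte sectors:

```shell
# Hypothetical helper: round byte offset $1 up to the next multiple of
# alignment $2 (both in bytes), using only POSIX shell arithmetic.
align_up() {
    echo $(( ( ($1 + $2 - 1) / $2 ) * $2 ))
}

k=1024
align_up $((63 * 512)) $((4 * k))      # 63-sector start rounded up to 4KB
align_up $((63 * 512)) $((128 * k))    # 63-sector start rounded up to 128KB
```

With these inputs the two calls print 32768 and 131072 respectively.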

> Changing the default for both interactive and -s might be easiest,
> but IMHO, that is not an option.

I can see some reasons, but which ones do you see?

>> It seems that SSDs actually really want 128KB, which starts to
>> feel like a bit much to apply to all disks:
>>
>>   http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/
>
> Yes, I read that, too.
> And so have others, which prompted some IRC discussion Friday morning.
>
> But let's back up a step.
> Do we really need to change anything?
> Or are you proposing (like I think Eric was) a way to
> make this merely more convenient?

The question is: will the average user know about this and be prepared
to do it? Probably not, and so they will suffer a performance/longevity
penalty that could have been avoided.

> I can already create partitions aligned to 128KiB boundaries.
> This creates a first partition of just less than 1GiB,
> and the second taking up the remainder of the space
> and also using a size that's a multiple of 128KiB:
>
>    dev=file; : > $dev
>    k=1024 m=$(($k*$k)) g=$(($k*$k*$k))
>    dd if=/dev/null of=$dev bs=1 seek=32GiB
>    parted -s $dev mklabel gpt
>    parted -s $dev u B mkpart primary $((128*$k)) $(($g-1))
>    parted -s $dev u B mkpart primary $g $((32*$g - $m - 1))
>    parted -s $dev u B p
>
>    Model:  (file)
>    Disk /t/file: 34359738368B
>    Sector size (logical/physical): 512B/512B
>    Partition Table: gpt
>
>    Number  Start        End           Size          File system  Name     Flags
>     1      131072B      1073741823B   1073610752B                primary
>     2      1073741824B  34358689791B  33284947968B               primary
>
>
> Now, I think that this functionality
> (snap-to-user-specified-or-system-derived-alignment) belongs in gparted,
> and not in parted.

Yes, if what we propose is an option saying "tick, I know I have an
SSD". But that adds unnecessary user complexity when the cost to
non-SSDs is so low.

What's (at worst) 128KB slack in partition layouts, when we already
skip the first 63 sectors anyway?
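
Spelled out for the first partition, as a back-of-the-envelope sketch
that assumes 512-byte logical sectors (the variable names are purely
illustrative):

```shell
sector=512                       # assumed logical sector size, in bytes
legacy_start=$((63 * sector))    # first partition traditionally starts at sector 63
aligned_start=$((128 * 1024))    # proposed 128KB-aligned start

echo "legacy start:  $legacy_start bytes"
echo "aligned start: $aligned_start bytes"
echo "extra slack:   $((aligned_start - legacy_start)) bytes"
```

That is 32256 bytes skipped today against 131072 bytes with 128KB
alignment, i.e. 98816 bytes of additional slack before the first
partition.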

Daniel
-- 
Daniel J Blueman


