The 3 main parameters are:
agcount: Number of allocation groupssunit: Stripe size (as configured on your RAID controller)swidth: Stripe width (number of data disks, excluding parity / spare disks)
sunit= 256k / 512 = 512, becausesunitis in multiple of 512 byte sectorsswidth= 6 * 512 = 3072, because in a RAID10 with 12 disks we have 6 data disks excluding parity disks (and no hot spares in this case)
$ sudo mkfs.xfs -f -d sunit=512,swidth=$((512*6)),agcount=16 /dev/sdbNow from the output above, we can see 2 problems:
Warning: AG size is a multiple of stripe width. This can cause performance
problems by aligning all AGs on the same disk. To avoid this, run mkfs with
an AG size that is one stripe unit smaller, for example 182845376.
meta-data=/dev/sdb isize=256 agcount=16, agsize=182845440 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=2925527040, imaxpct=5
= sunit=64 swidth=384 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=64 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
- There's this warning message we better pay attention to.
- The values of
sunitandswidthprinted don't correspond to what we asked for.
bsize=4096, so sure enough the numbers match up: 4096 x 64 = 512 x 512 = our stripe size of 256k.
Now let's look at this warning message. It suggests us to use
agsize=182845376 instead of agsize=182845440. When we specified the number of AGs we wanted, XFS automatically figured the size of each AG, but then it's complaining that this size is suboptimal. Yay. Now agsize is specified in blocks (so multiples of 4096), but the command line tool expects the value in bytes. At this point you're probably thinking like me: "you must be kidding me, right? Some options are in bytes, some in sectors, some in blocks?!" Yes.
So to make it all work:
$ sudo mkfs.xfs -f -d sunit=512,swidth=$((512*6)),agsize=$((182845376*4096)) /dev/sdbIt's critical that you get this right before you start using the filesystem. There's no way to change them later. You might be tempted to try using
meta-data=/dev/sdb isize=256 agcount=16, agsize=182845376 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=2925526016, imaxpct=5
= sunit=64 swidth=384 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=64 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount -o remount,sunit=X,swidth=Y, and the command will succeed but do nothing. The only XFS parameter you can change at runtime is nobarrier (see the source code of XFS's remount support in the Linux kernel), which you should use if you have a battery-backup unit (BBU) on your RAID card, although the performance boost seems pretty small on DB-type workloads, even with 512MB of RAM on the controller.
Next post: how much of a performance difference is there when you give XFS the right
sunit/swidth parameters, and does this allow XFS to beat ext4's performance.
0 comments:
Post a Comment