Before I get into this, there are some provisos. This server was using Linux kernel 2.6.32. The SSDs involved are Samsung 850 Pro SATA-style solid state disks. SSD support is not quite ready for prime time in the 2.6.32 kernel; NVMe support was first added in 3.3, TRIM wasn't available at all until 2.6.33, and a ton of other things we all take for granted like the device mapper are part of the 4.* kernel.
Consumer-level Samsung drives bring their own issues. Despite what the knuckle-heads on Reddit have to say about the topic, the Linux kernel still blacklists queued TRIM on every Samsung SSD in the 8** series. As of the latest GitHub commit for kernel 4.8 at the time of writing, queued TRIM still doesn't work for these devices.
More importantly, the R900 isn't a new server. This is an 8-year-old box. There is a SAS backplane involved which, despite a theoretical max data transfer rate of 3.0 Gbps, was designed before SSDs were widely available, and introduces a bunch of contacts, wiring and complexity that is likely all screwed up and almost certainly not optimized for fat-guy Peta Belly Flops of computing power.
Initial benchmarking with fio and ioping, along with monitoring CPU iowait times in top and checking iostat, had this server's SSDs performing *slower* than a similar server with 7500 RPM SATA disks in a ZFS pool.
I did a bunch of stuff to this box hoping to shake a few extra IOPS out of it. I installed Dell's dsu to get my hands on the latest drivers & firmware (under the mistaken belief that an update on either front had been released in the last decade). I had never physically seen this server, so there was a lot of lspci-ing and modprobe-ing.
Luckily, I stayed focused on the controller & backplane: a SAS 6/iR (FW 00.25.47.00.06.22.03.00) and an LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08) (FW 1.06), respectively. Eventually I stumbled upon this post that described, accurately, how the Dell was automatically stepping down the SATA port speed on non-Dell-certified disks from SATA II (3.0 Gbps) to SATA I (1.5 Gbps).
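To put numbers on that step-down: SATA uses 8b/10b encoding, so every byte of data costs ten bits on the wire. Here is a rough sketch of the arithmetic (the function name is made up for illustration) showing the ceiling per link before any protocol overhead.

```shell
#!/bin/sh
# Usable payload bandwidth of a SATA link, in MB/s.
# SATA uses 8b/10b encoding: 10 bits on the wire carry 8 bits of data,
# so one payload byte costs 10 line bits.
# Argument: line rate in Mbps (1500 for SATA I, 3000 for SATA II).
sata_payload_mbs() {
    echo $(( $1 / 10 ))
}

echo "SATA I  (1.5 Gbps): $(sata_payload_mbs 1500) MB/s"
echo "SATA II (3.0 Gbps): $(sata_payload_mbs 3000) MB/s"
```

That works out to 150 MB/s versus 300 MB/s, so a link forced down to SATA I tops out at half of what SATA II allows, no matter how fast the disk behind it is.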
Many moons ago, Dell's RAID cards would simply not allow users to install non-Dell disks. My experience with the R900 using BIOS version 1.2.0 would indicate that, although I am able to use non-certified disks without fatal errors, the backplane deliberately slows these disks down without reason, and in a way that is almost always transparent to the end user. I will hold off on making accusations here until I get my hands on the source code for this firmware, but the evidence up to this point is fairly damning. If anyone from Dell has an explanation for this sort of behavior, I would be happy to publish your feedback here.
- unzip the file in a directory of your choice; # unzip LSIUtil_1.62.zip -d /home/joshw/lsiutil/
- navigate to the directory referencing your OS; # cd /home/joshw/lsiutil/Linux/
- identify the version of the application matching your processor/OS bit type. For Linux, there is a 32-bit, AMD64 and x86_64 version. I selected the x86_64 and applied an executable bit: # chmod +x lsiutil.x86_64
- make sure you're root: # sudo su
- run the application: # ./lsiutil.x86_64
You should see something like this:
LSI Logic MPT Configuration Utility, Version 1.62, January 14, 2009
1 MPT Port found
     Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
 1.  /proc/mpt/ioc0    LSI Logic SAS1068E B3     105      00192f00     0
Select a device: [1-1 or 0 to quit]
Select a Phy: [0-7, 8=AllPhys, RETURN to quit] 8
Again, you will be prompted for several values. Be very careful here, as we only want to change one value: MinRate (this should be the second value you are prompted to modify). Leave every other value at its default by pressing RETURN.
Link: [0=Disabled, 1=Enabled, or RETURN to not change]
MinRate: [0=1.5 Gbps, 1=3.0 Gbps, or RETURN to not change] 1
MaxRate: [0=1.5 Gbps, 1=3.0 Gbps, or RETURN to not change]
Initiator: [0=Disabled, 1=Enabled, or RETURN to not change]
Target: [0=Disabled, 1=Enabled, or RETURN to not change]
Port configuration: [1=Auto, 2=Narrow, 3=Wide, or RETURN to not change]
Once you've finished you will be dumped back to the port menu:
PhyNum  Link     MinRate  MaxRate  Initiator  Target    Port
   0    Enabled    3.0      3.0    Enabled    Disabled  Auto
   1    Enabled    3.0      3.0    Enabled    Disabled  Auto
   2    Enabled    3.0      3.0    Enabled    Disabled  Auto
   3    Enabled    3.0      3.0    Enabled    Disabled  Auto
   4    Enabled    3.0      3.0    Enabled    Disabled  Auto
   5    Enabled    3.0      3.0    Enabled    Disabled  Auto
   6    Enabled    3.0      3.0    Enabled    Disabled  Auto
   7    Enabled    3.0      3.0    Enabled    Disabled  Auto
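The table above is what lsiutil reports as configured. To see what each link actually negotiated from the OS side, the kernel's SAS transport class exposes a negotiated_linkrate attribute per phy under /sys/class/sas_phy. What follows is a minimal sketch, assuming that directory exists on your kernel and that the attribute reads something like "1.5 Gbit" or "3.0 Gbit". The function name slow_phys is hypothetical, not part of any tool.

```shell
#!/bin/sh
# Print every SAS phy that negotiated 1.5 Gbps.
# Argument: directory holding the phy entries (defaults to /sys/class/sas_phy).
# ASSUMPTION: each phy directory contains a negotiated_linkrate file whose
# text starts with the rate, e.g. "1.5 Gbit" or "3.0 Gbit".
slow_phys() {
    base="${1:-/sys/class/sas_phy}"
    for phy in "$base"/*; do
        file="$phy/negotiated_linkrate"
        # Skip entries without a readable attribute (or a missing directory).
        [ -r "$file" ] || continue
        rate=$(cat "$file")
        case "$rate" in
            1.5*) echo "$(basename "$phy"): $rate" ;;
        esac
    done
}

slow_phys "$@"
```

No output means no phy reported 1.5. Whether a new MinRate applies immediately or needs a link reset or reboot is not covered here, so it may be worth checking again after a reboot.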
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 1 (f=1): [m] [100.0% done] [74324K/24284K/0K /s] [18.6K/6071 /0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=59026: Fri Aug 26 19:10:52 2016
read : io=3070.6MB, bw=74180KB/s, iops=18545 , runt= 42386msec
write: io=1025.5MB, bw=24775KB/s, iops=6193 , runt= 42386msec
cpu : usr=8.79%, sys=55.13%, ctx=796212, majf=0, minf=20
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=786053/w=262523/d=0, short=r=0/w=0/d=0
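A quick sanity check on the fio summary above: dividing bandwidth by IOPS gives the average request size, which tells you what block size the run was actually pushing. This is a small sketch that assumes the older fio line format shown here (bw in KB/s, followed by iops=). The function name avg_request_kb is made up for illustration.

```shell
#!/bin/sh
# Derive the average request size (KB) from one line of fio's summary
# by dividing bandwidth (KB/s) by IOPS.
# Expects the older fio format, e.g.
#   read : io=3070.6MB, bw=74180KB/s, iops=18545 , runt= 42386msec
avg_request_kb() {
    echo "$1" | awk '{
        # Isolate the number between "bw=" and "KB/s".
        bw = $0;   sub(/.*bw=/, "", bw);     sub(/KB\/s.*/, "", bw)
        # Isolate the number after "iops=", up to the next space or comma.
        iops = $0; sub(/.*iops=/, "", iops); sub(/[ ,].*/, "", iops)
        printf "%.1f\n", bw / iops
    }'
}

avg_request_kb "read : io=3070.6MB, bw=74180KB/s, iops=18545 , runt= 42386msec"
avg_request_kb "write: io=1025.5MB, bw=24775KB/s, iops=6193 , runt= 42386msec"
```

Both lines come out at 4.0, which is consistent with a 4 KB block size for reads and writes alike.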