[Scst-devel] [ofa-general] SRP/mlx4 interrupts throttling performance

Cameron Harr cameron at harr.org
Tue Jan 13 08:42:59 PST 2009


Vladislav Bolkhovitin wrote:
> Cameron Harr, on 01/13/2009 02:56 AM wrote:
>> Vladislav Bolkhovitin wrote:
>>>>> I think srptthread=0 performs worse in this case because with it 
>>>>> part of the processing is done in SIRQ, but the scheduler seems to 
>>>>> put that work on the same CPU as fct0-worker, which does the data 
>>>>> transfer to your SSD device. That thread consistently consumes about 
>>>>> 100% CPU, so it gets less CPU time, hence lower overall performance.
>>>>>
>>>>> So, try to pin fctX-worker, the SCST threads and SIRQ processing to 
>>>>> different CPUs and check again. You can set thread affinity using 
>>>>> the utility from 
>>>>> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/; 
>>>>> for how to set IRQ affinity, see Documentation/IRQ-affinity.txt in 
>>>>> your kernel tree. 
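
(For the archive: a minimal sketch, in C, of what that pinning amounts to. 
The PID and IRQ number below are placeholders, not the real ones from this 
setup; in practice the affinity utility above, or taskset, plus echoing a 
hex mask into /proc/irq/<N>/smp_affinity per Documentation/IRQ-affinity.txt 
does the same job.)

/* pin_sketch.c -- minimal sketch: pin one task to a pair of CPUs and
 * route one IRQ to CPUs 1-3.  The PID (1234) and IRQ number (90) are
 * placeholders; adjust CPU numbers to the numbering on your box. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;

    /* Restrict a task (e.g. an fct worker) to two cores. */
    CPU_ZERO(&mask);
    CPU_SET(6, &mask);
    CPU_SET(7, &mask);
    if (sched_setaffinity(1234 /* placeholder PID */, sizeof(mask), &mask) != 0)
        perror("sched_setaffinity");

    /* Route a (placeholder) IRQ to CPUs 1-3: hex bitmask 0x0e. */
    FILE *f = fopen("/proc/irq/90/smp_affinity", "w");
    if (f) {
        fprintf(f, "%x\n", 0x0e);
        fclose(f);
    } else {
        perror("fopen /proc/irq/90/smp_affinity");
    }
    return 0;
}
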
>>
>> I ran with the two fct-worker threads pinned to CPUs 7-8, the 
>> scsi_tgt threads pinned to CPUs 4, 5, or 6, and irqbalance pinned to 
>> CPUs 1-3. I wasn't sure if I should play with the 8 ksoftirqd 
>> processes, since there is one per CPU. From these results, I don't see 
>> a big difference, 
>
> Hmm, you sent me before the following results:
>
> type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=1 iops=54934.31
> type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=0 iops=50199.90
> type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=1 iops=51510.68
> type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=0 iops=49951.89
> type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=1 iops=51924.17
> type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=0 iops=49874.57
> type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=1 iops=79680.42
> type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=0 iops=74504.65
> type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=1 iops=78558.77
> type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=0 iops=75224.25
> type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=1 iops=75411.52
> type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=0 iops=73238.46
>
> I see quite a big improvement. For instance, for the drives=1 
> scst_threads=1 srptthread=1 case it is 36%. Or do you use different 
> hardware, so those results can't be compared?
Vlad, you've got a good eye. Unfortunately, those results can't really 
be compared, because I believe the previous results were intentionally 
run in a worst-case performance scenario. However, I did run no-affinity 
runs before the affinity runs, and I would say the performance increase 
is variable and somewhat inconclusive:

type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=1 iops=76724.08
type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=1 iops=91318.28
type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=1 iops=60374.94
type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=1 iops=91618.18
type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=1 iops=63076.21
type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=1 iops=92251.24
type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=0 iops=50539.96
type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=0 iops=57884.80
type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=0 iops=54502.85
type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=0 iops=93230.44
type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=0 iops=55941.89
type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=0 iops=94480.92

>
>> but they would still give srptthread=1 a slight performance advantage.
>
> At this level, CPU caches start playing an essential role. To get the 
> maximum performance, the processing of each command should use the same 
> CPU L2+ cache(s), i.e. be done on the same physical CPU, but on 
> different cores. Most likely, the affinity you assigned was worse than 
> the scheduler's decisions. What's your CPU configuration? Please send 
> me the top/vmstat output from the target during the tests, as well as 
> your dmesg from the target just after it boots.
My CPU config on the target (where I did the affinity) is 2 quad-core 
Xeon E5440s @ 2.83GHz. I didn't have my script configured to dump top and 
vmstat, so here's data from a rerun (and I have attached the requested 
info). I'm not sure what accounts for the spike at the beginning, but it 
seems consistent.
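
(On that note, a minimal sketch, assuming the standard sysfs topology 
files, that prints which physical package and core each logical CPU sits 
on; that makes it easier to pick affinities that keep a command's 
processing within one package's caches, as suggested above.)

/* topo_sketch.c -- minimal sketch: print the physical package and core id
 * of each online logical CPU via the standard sysfs topology files. */
#include <stdio.h>

static int read_id(int cpu, const char *name)
{
    char path[128];
    int id = -1;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, name);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &id) != 1)
        id = -1;
    fclose(f);
    return id;
}

int main(void)
{
    int cpu;

    for (cpu = 0; cpu < 1024; cpu++) {
        int pkg = read_id(cpu, "physical_package_id");
        int core = read_id(cpu, "core_id");

        if (pkg < 0)
            break;          /* no more CPUs */
        printf("cpu%d: package %d, core %d\n", cpu, pkg, core);
    }
    return 0;
}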

type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=1 iops=104699.43
type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=1 iops=133928.98
type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=1 iops=82736.73
type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=1 iops=82221.42
type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=1 iops=70203.53
type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=1 iops=85628.45
type=randwrite  bs=4k   drives=1 scst_threads=1 srptthread=0 iops=75646.90
type=randwrite  bs=4k   drives=2 scst_threads=1 srptthread=0 iops=87124.32
type=randwrite  bs=4k   drives=1 scst_threads=2 srptthread=0 iops=74545.84
type=randwrite  bs=4k   drives=2 scst_threads=2 srptthread=0 iops=88348.71
type=randwrite  bs=4k   drives=1 scst_threads=3 srptthread=0 iops=71837.15
type=randwrite  bs=4k   drives=2 scst_threads=3 srptthread=0 iops=84387.22

-------------- next part --------------
Attachments (non-text, scrubbed by the archive):
  dmesg.out          (application/octet-stream, 34008 bytes)
  <http://lists.openfabrics.org/pipermail/general/attachments/20090113/38bae937/attachment.obj>
  top.target.bz2     (application/octet-stream, 72060 bytes)
  <http://lists.openfabrics.org/pipermail/general/attachments/20090113/38bae937/attachment-0001.obj>
  vmstat.target.bz2  (application/octet-stream, 15983 bytes)
  <http://lists.openfabrics.org/pipermail/general/attachments/20090113/38bae937/attachment-0002.obj>

