Re: [Scst-devel] [ofa-general] SRP/mlx4 interrupts throttling performance

Vladislav Bolkhovitin vst at vlnb.net
Tue Feb 24 09:54:01 PST 2009


Cameron Harr, on 02/24/2009 08:18 PM wrote:
> Vladislav Bolkhovitin wrote:
>>> Vladislav Bolkhovitin wrote:
>>>> Try the following variants:
>>>>
>>>> 1. Affine IRQ 82, scsi_tgt0 to CPU0, fct0-worker to CPU2, IRQs 169 
>>>> and 177 to CPU4, scsi_tgt1 to CPU1, fct1-worker to CPU3, scsi_tgt2 
>>>> to CPU5, fct2-worker to CPU7
>>>>
>>>> 2. Affine IRQ 82 to CPU0, fct0-worker to CPU2, IRQs 169 and 177 to 
>>>> CPU4, fct1-worker to CPU3, fct2-worker to CPU7, no affinity for 
>>>> other processes.
>>>>
>>>> 3. Affine IRQ 82 to CPU0, IRQs 169 and 177 to CPU4, fct1-worker's to 
>>>> all CPUs, except CPU0 and CPU4, no affinity for other processes.
>>> These are tests 1, 2 and 3, respectively
>>>> Or other similar variants you'd like (even CPUs relate to physical 
>>>> CPU0, odd CPUs relate to physical CPU1). For instance, you can try 
>>>> to affine IRQs 169 and 177 to CPU1.
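
For reference, a minimal sketch of how variant 3 above could be scripted. It assumes an 8-CPU box (CPU0..CPU7), the standard Linux /proc/irq/<n>/smp_affinity interface, which takes a hex CPU bitmask, and os.sched_setaffinity() for threads. The helper names and the proc_root parameter are made up for illustration, and thread PIDs have to be looked up separately.

```python
import os


def cpu_mask(cpus):
    """Return the hex bitmask string for the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_irq(irq, cpus, proc_root="/proc"):
    """Write the CPU mask to <proc_root>/irq/<irq>/smp_affinity (needs root)."""
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(cpu_mask(cpus))


def pin_task(pid, cpus):
    """Restrict a process or thread (by PID/TID) to the given CPUs."""
    os.sched_setaffinity(pid, set(cpus))


# Variant 3, assuming 8 CPUs: IRQ 82 on CPU0, IRQs 169 and 177 on CPU4,
# worker threads on every CPU except CPU0 and CPU4.
IRQ_PLAN = {82: [0], 169: [4], 177: [4]}
WORKER_CPUS = [c for c in range(8) if c not in (0, 4)]

if __name__ == "__main__":
    for irq, cpus in IRQ_PLAN.items():
        print(f"IRQ {irq}: mask {cpu_mask(cpus)}")
    print(f"workers: mask {cpu_mask(WORKER_CPUS)}")
```

Running it only prints the masks; calling pin_irq() and pin_task() applies them and requires root.
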
>>> I did two other tests (Tests 4 and 5) that have the mlx4_core (comp) IRQ 
>>> (formerly known as IRQ 82) pinned to CPU0, the two ioDrive IRQs (169, 
>>> 177) pinned to CPU4, fct0 and scsi_tgt0 on CPUs 2&3, and fct1 and 
>>> scsi_tgt1 on CPUs 4&6 (Test 4) or on CPUs 5&6 (Test 5).
>>>> There's no point in running with srptthread=1, since it just produces 
>>>> a baseline with no affinity at all.
>>> I ran with these anyway to look at differences among the tests. 
>>> Having this thread enabled always results in better performance.
>>>> Please do each run several times and write down the average result 
>>>> across runs and the approximate variation between them in %. Otherwise 
>>>> we can't draw any reliable conclusions.
>>> I ran each test 3 times and took the averages. In order to get a 
>>> quick look at performance per run, I added a column in the summary 
>>> that sums the IOPs for each test with SRPT thread enabled and then 
>>> not enabled. Test 4 seems to give the best results. Here's a brief 
>>> summary of that summary with just SRPT thread=0:
>>>
>>> Baseline: 356226.39
>>> Test 1:   371217.6533
>>> Test 2:   370553.78
>>> Test 3:   373295.2033
>>> Test 4:   399385.2233
>>> Test 5:   393204.5833
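
To put the summary above in relative terms, here is a rough sketch that computes each test's gain over the baseline from the averages quoted in this thread. The summarize() helper, which reports the mean and the max-min spread in % for a list of per-run results, is a hypothetical addition, since the individual run numbers were not posted.

```python
from statistics import mean

# Averaged IOPS from the summary above (SRPT thread=0).
BASELINE = 356226.39
RESULTS = {
    "Test 1": 371217.6533,
    "Test 2": 370553.78,
    "Test 3": 373295.2033,
    "Test 4": 399385.2233,
    "Test 5": 393204.5833,
}


def gain_percent(value, baseline):
    """Relative improvement of value over baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def summarize(runs):
    """Return (mean, spread) for per-run results.

    The spread is (max - min) as a percentage of the mean, one simple
    way to express run-to-run variation.
    """
    avg = mean(runs)
    return avg, 100.0 * (max(runs) - min(runs)) / avg


if __name__ == "__main__":
    for name, iops in RESULTS.items():
        print(f"{name}: {gain_percent(iops, BASELINE):+.1f}% vs baseline")
```

A gain only means something if it is clearly larger than the spread that summarize() reports for the individual runs.
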
>> The Linux CPU scheduler does a really impressive job!
>>
>> It would be interesting to see whether anything changes with:
>>
>> 1. The latest SVN. It has some changes, which might make a difference.
> Sorry for the delay.
> This is with SVN rev 673. I don't hit the high I hit before, but at a 
> 1.8% difference (with test 4), it's statistical noise.
> 
> Test 1: 390631.5133
> Test 2: 386125.4133
> Test 3: 356268.0267
> Test 4: 392237.7867
> Test 5: 390012.1467
>> 2. Pass-through dev handler instead of BLOCKIO, which you are using.
>>
> The ioDrive driver doesn't provide a full SCSI emulation layer and shows 
> up as /dev/fio[abc...]. From my understanding of the pass-through 
> handler, I need to have the SCSI Host:Channel:ID:LUN and those aren't 
> available to me.

Yes, although this is strange: you use sdX devices, so they should have 
full SCSI emulation, and lsscsi should show their Host:Channel:ID:LUN 
numbers.
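
If it helps, here is a small sketch for pulling the H:C:I:L tuple for each device node out of lsscsi output. It assumes the default lsscsi line layout, "[H:C:I:L]  type  vendor  model  rev  device". The function name is made up, and the sample line in the comment is illustrative, not taken from your system.

```python
import re
import subprocess

# Matches lines such as (illustrative sample, not real output):
# [2:0:0:0]    disk    VENDOR   MODEL            1.0   /dev/sdb
LINE_RE = re.compile(r"^\[(\d+):(\d+):(\d+):(\d+)\]\s+\S+\s+.*?(\S+)\s*$")


def parse_lsscsi(text):
    """Map each device node to its (host, channel, id, lun) tuple."""
    devices = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        node = m.group(5)
        if node.startswith("/dev/"):  # skip entries with no device node ("-")
            devices[node] = tuple(int(m.group(i)) for i in range(1, 5))
    return devices


if __name__ == "__main__":
    out = subprocess.run(["lsscsi"], capture_output=True, text=True, check=True)
    for node, hcil in sorted(parse_lsscsi(out.stdout).items()):
        print(node, "%d:%d:%d:%d" % hcil)
```

If the sdX devices you export do not appear in this list, that would confirm they have no SCSI address to hand to the pass-through handler.
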

Thanks,
Vlad


