[ofa-general] SRP/mlx4 interrupts throttling performance
Vu Pham
vuhuong at mellanox.com
Tue Oct 7 15:46:14 PDT 2008
Cameron Harr wrote:
> Cameron Harr wrote:
>> I may be hitting the instability problems and am currently rebooting
>> my initiators again after the test (FIO) went into zombie-mode.
>>
>> When I first set thread=0, with scst_threads=8, my performance was
>> much lower (around 50-60K IOPs) than normal and it appeared that only
>> one target could be written to at a time. I set scst_threads=2 after
>> that and got pretty wide performance differences, between 55K and 85K
>> IOPs. I then brought in another initiator and was seeing numbers as
>> high as 135K IOPs and as low as 70K IOPs, but could also see that a
>> lot of the requests were being coalesced by the time they got to the
>> target. I let it run for a while, and when I came back, the tests
>> were still "running" but no work was being done and the processes
>> couldn't be killed.
>>
>
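For reference, here is a sketch of the workload as I understand it:
512B direct random writes against each exported LUN in parallel. Your
exact FIO options aren't shown in this thread, so the options, queue
depth, and device names below are assumptions:

#!/usr/bin/env python
# Sketch of the workload described above: 512B direct random writes
# against each SRP-attached LUN in parallel. /dev/sd[bcd] and the
# queue depth are placeholders -- adjust for your setup.
import subprocess

LUNS = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]  # assumption: 3 exported LUNs

procs = []
for dev in LUNS:
    procs.append(subprocess.Popen([
        "fio",
        "--name=randwrite-%s" % dev.split("/")[-1],
        "--filename=%s" % dev,
        "--rw=randwrite",    # random writes, as in the tests above
        "--bs=512",          # 512B requests
        "--direct=1",        # O_DIRECT on the initiator side
        "--ioengine=libaio",
        "--iodepth=32",      # assumption: depth not given in the thread
        "--runtime=60",
        "--time_based",
    ]))

for p in procs:
    p.wait()
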
> One thing that makes results hard to interpret is that they vary
> enormously. I've been doing more testing with 3 physical LUNs (instead
> of two) on the target, srpt_thread=0, and changing between
> scst_threads=[1,2,3]. With scst_threads=1, I'm fairly low (50K IOPs),
> while at 2 and 3 threads, the results are higher, though in all
> cases, the context switches are low, often less than 1:1.
>
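That context-switch ratio can be sampled straight from /proc over an
interval; a minimal sketch (the device name is a placeholder):

#!/usr/bin/env python
# Minimal sketch: system-wide context switches per completed I/O on
# one device, sampled over an interval. "sdb" is a placeholder name.
import time

def read_ctxt():
    # total context switches since boot, from the "ctxt" line
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("ctxt"):
                return int(line.split()[1])

def read_ios(dev):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == dev:
                # fields[3] = reads completed, fields[7] = writes completed
                return int(fields[3]) + int(fields[7])
    raise ValueError("device %s not found" % dev)

DEV, INTERVAL = "sdb", 10
c0, i0 = read_ctxt(), read_ios(DEV)
time.sleep(INTERVAL)
c1, i1 = read_ctxt(), read_ios(DEV)
ios = i1 - i0
if ios:
    print("context switches per I/O: %.2f" % ((c1 - c0) / float(ios)))
else:
    print("no I/O completed on %s" % DEV)
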
Can you test again with srpt_thread=0,1 and scst_threads=1,2,3 in NULLIO
mode (exporting 1, 2, and 3 NULLIO LUNs)?
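A rough sketch of driving that matrix, assuming the module parameter
names used in this thread (verify with modinfo on your build) and that
the target stack can be reloaded cleanly between runs; the benchmark
step itself is a placeholder:

#!/usr/bin/env python
# Rough sketch: reload the SCST/SRPT stack for each srpt_thread x
# scst_threads combination. Parameter names are taken from this thread
# -- check `modinfo scst` / `modinfo ib_srpt`. Assumes the modules are
# already loaded and nothing else is using them.
import subprocess

def reload_target(srpt_thread, scst_threads):
    for mod in ("ib_srpt", "scst_vdisk", "scst"):  # unload, deps first
        subprocess.check_call(["rmmod", mod])
    subprocess.check_call(["modprobe", "scst",
                           "scst_threads=%d" % scst_threads])
    subprocess.check_call(["modprobe", "scst_vdisk"])
    subprocess.check_call(["modprobe", "ib_srpt",
                           "srpt_thread=%d" % srpt_thread])

for srpt_thread in (0, 1):
    for scst_threads in (1, 2, 3):
        reload_target(srpt_thread, scst_threads)
        # placeholder: re-export the NULLIO LUNs and rerun FIO here
        print("testing srpt_thread=%d scst_threads=%d"
              % (srpt_thread, scst_threads))
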
> My best performance comes with scst_threads=3 again (and are often
> pegged at 100% CPU), but the results seem to go in phases between the
> low 80s, the 90s, 110s, 120s and will run for a while in 130s and 140s
> (in thousands of IOPs). For reference, locally on the three LUNs I get
> around 130-150K IOPs. But the numbers really vary.
>
> Also a little disconcerting is that my average request size on the
> target has gotten larger. I'm always writing 512B packets, and when I
> run on one initiator, the average reqsz is around 600-800B. When I add
> an initiator, the average reqsz basically doubles and is now around
> 1200 - 1600B. I'm specifying direct IO in the test and scst is
> configured as blockio (and thus direct IO), but it appears something
> is cached at some point and seems to be coalesced when another
> initiator is involved. Does this seem odd or normal? This holds true
> whether the initiators are writing to different partitions on the same
> LUN or to the same LUN with no partitions.
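The merging itself is visible in /proc/diskstats on the target: the
writes-merged counter and the sectors/writes ratio show how much the
block layer is coalescing. A small sketch (device name is a placeholder):

#!/usr/bin/env python
# Sketch: average write request size and merge count for one block
# device, from /proc/diskstats. "sdb" is a placeholder name.

def write_stats(dev):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == dev:
                writes = int(fields[7])   # writes completed
                merged = int(fields[8])   # writes merged by the block layer
                sectors = int(fields[9])  # 512-byte sectors written
                return writes, merged, sectors
    raise ValueError("device %s not found" % dev)

writes, merged, sectors = write_stats("sdb")
if writes:
    print("average write request: %.0f bytes" % (sectors * 512.0 / writes))
    print("writes merged: %d" % merged)
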
What I/O scheduler are you running on the local storage? Since you are
using blockio, you should experiment with the I/O scheduler's tunable
parameters (for example, for the deadline scheduler: front_merges,
writes_starved, ...). Please see Documentation/block/*.txt in the
kernel source tree.
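For example, a sketch of checking the active scheduler and adjusting
the deadline tunables through sysfs (device name and values are
placeholders, not recommendations):

#!/usr/bin/env python
# Sketch: show the active I/O scheduler and set two deadline tunables
# via sysfs. "sdb" and the values are placeholders; needs root.

DEV = "sdb"
QUEUE = "/sys/block/%s/queue" % DEV

with open(QUEUE + "/scheduler") as f:
    print("scheduler: " + f.read().strip())  # active one is in [brackets]

def set_tunable(name, value):
    with open("%s/iosched/%s" % (QUEUE, name), "w") as f:
        f.write(str(value))

set_tunable("front_merges", 1)    # allow front merging of adjacent requests
set_tunable("writes_starved", 2)  # read batches served before a write batch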