<br><font size=2 face="sans-serif">Roland,</font>

<br>

<br><font size=2 face="sans-serif">Based on all the data I have collected so
far, I think it's not a good idea to run a while loop around poll_cq() in IB
hardware interrupt context. poll_cq() is very expensive, and it increases
interrupt latency for other hardware. If we move it out of hardware interrupt
context, latency would increase anyway.</font>

<br>

<br><font size=2 face="sans-serif">I have run many tests of the splitting
CQ + work queue on recv/send + remove tx_ring patches on mthca. Both
SMP and UP unidirectional throughput improve by 20% to 75% without
tuning. Latency has increased by 4-10% on mthca. The interesting
result is that UP performance is good. I ran all these tests on a
hyperthreaded CPU; I don't know whether that is the reason.</font>

<br>

<br><font size=2 face="sans-serif">If you think there is enough time to
review these patches and a better chance of getting them merged into 2.6.17/18,
I will clean up and submit these patches ASAP, and test on ehca if a
non-multi-threaded ehca is available.</font>

<br>

<br><font size=2 face="sans-serif">Thanks<br>

Shirley Ma<br>

IBM Linux Technology Center<br>

15300 SW Koll Parkway<br>

Beaverton, OR 97006-6063<br>

Phone(Fax): (503) 578-7638</font>