<br><font size=2><tt>Michael,</tt></font>

<br>

<br><font size=2><tt>"Michael S. Tsirkin" <mst@mellanox.co.il>

wrote on 04/27/2006 01:14:06 PM:<br>

> Quoting r. Shirley Ma <xma@us.ibm.com>:<br>

> > So far work queue gives very consistent 15% throughput increase

in my<br>

> > local test with one dual core cpu over mthca.<br>

> What happens to the CPU utilization? And latency?<br>

</tt></font>

<br><font size=2><tt>The CPU utilization doubled on one dual-core

CPU.</tt></font>

<br><font size=2><tt>I eventually found the problem: tx_ring was blocking

the sender.</tt></font>

<br>

<br><font size=2><tt>After I tuned the send/recv queues, I got more than double

the throughput for </tt></font>

<br><font size=2><tt>unidirectional netperf, with 10-15% more CPU utilization.</tt></font>

<br>

<br><font size=2><tt>I believe that after I apply my other patch, which removes

tx_ring, the performance </tt></font>

<br><font size=2><tt>will be better still.</tt></font>

<br><font size=2><tt><br>

> > I am planning to add one more cpu to see the difference.<br>

> <br>

> And what happens on UP?<br>

</tt></font>

<br><font size=2><tt>I don't have a UP machine.</tt></font>

<br>

<br><font size=2><tt>The patch needs to be verified on a large cluster to

see how often packets</tt></font>

<br><font size=2><tt>arrive out of order. I have tried it on 8 CPUs, and it did well.</tt></font>

<br>

<br><font size=2><tt>Thanks</tt></font>

<br><font size=2 face="sans-serif">Shirley Ma<br>

IBM Linux Technology Center<br>

15300 SW Koll Parkway<br>

Beaverton, OR 97006-6063<br>

Phone(Fax): (503) 578-7638<br>

<br>

</font>