[openib-general] [PATCHv4] IPoIB CM Experimental support

Bernard King-Smith wombat2 at us.ibm.com
Mon Jan 8 12:49:14 PST 2007


----- Message from "Michael S. Tsirkin" <mst at mellanox.co.il> on Mon, 8 Jan 2007 18:57:14 +0200 -----
> 
> To: openib-general at openib.org, "Roland Dreier" <rolandd at cisco.com>
> Subject: [openib-general] [PATCHv4] IPoIB CM Experimental support
> 
> The following patch adds experimental support for IPoIB connected mode.
> The idea is to increase performance by increasing the MTU
> from the maximum of 2K (theoretically 4K) supported by IPoIB on top of UD.
> With this code, I'm able to get 800MByte/sec or more with netperf
> without options on a Mellanox 4x back-to-back DDR system.
> 
> Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
> 
> ---
> 
> Sorry about the churn, just fixed a bug in this code.

[SNIP] 
> e. Some notes on code
> 1. SRQ is used for scalability to large cluster sizes

I still want to support non-SRQ adapters with this code. Not all systems 
have hundreds or thousands of endpoints, and those smaller systems will 
benefit from IPoIB-CM. The larger systems tend to have more memory per 
node, so they can support the additional memory requirements. 
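
To put rough numbers on that memory trade-off, here is a back-of-the-envelope sketch. The ring depth, buffer size, and peer counts are illustrative assumptions of mine, not values taken from the patch, and the helper name is hypothetical.

```python
def rx_buffer_bytes(peers, ring_entries, buf_bytes, use_srq):
    """Estimate receive-buffer memory for one node.

    With SRQ, all connections share one receive ring; without it,
    every passive (RX) connection needs a ring of its own.
    """
    rings = 1 if use_srq else peers
    return rings * ring_entries * buf_bytes


# Assumed values, for illustration only.
RING_ENTRIES = 128        # receive buffers posted per ring
BUF_BYTES = 64 * 1024     # bytes per receive buffer

for peers in (16, 1024):
    no_srq = rx_buffer_bytes(peers, RING_ENTRIES, BUF_BYTES, use_srq=False)
    srq = rx_buffer_bytes(peers, RING_ENTRIES, BUF_BYTES, use_srq=True)
    print(f"{peers} peers: non-SRQ {no_srq // 2**20} MiB, SRQ {srq // 2**20} MiB")
```

Under these assumptions the non-SRQ cost grows linearly with the number of peers, which is modest for a small cluster and only becomes a concern at large scale.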

At the November meeting, one of the main themes from application developers 
and customers was that we must have a well-performing TCP/IP story across 
as much of the IB space as possible. If only one or two of the IB adapters 
perform well, then we haven't addressed the customer needs. Adapters that 
can't support RC are one issue, but for those that support RC without SRQ, 
smaller configurations should be able to use IPoIB-CM.

> 2. Only RC connections are used (UC does not support SRQ now)
> 3. Retry count is set to 0 since spec draft warns against retries
> 4. Each connection is used for data transfers in only 1 direction,
>    so each connection is either active (TX) or passive (RX).
>    2 sides that want to communicate create 2 connections.
> 5. Each active (TX) connection has a separate CQ for send completions -
>    this keeps the code simple without CQ resize and other tricks
> 
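
As I read notes 4 and 5, the connection layout can be sketched as below. This is only a model of the bookkeeping, not code from the patch; the function name and the dictionary keys are hypothetical.

```python
from itertools import permutations


def plan_connections(nodes):
    """Return one unidirectional connection per ordered (sender, receiver) pair.

    Each connection is active (TX) on the sender and passive (RX) on the
    receiver, and each active side gets its own send completion queue.
    """
    return [
        {"tx": sender, "rx": receiver, "send_cq": f"cq:{sender}->{receiver}"}
        for sender, receiver in permutations(nodes, 2)
    ]


# Two hosts that want to talk to each other end up with two connections.
for conn in plan_connections(["A", "B"]):
    print(conn)
```

In a full mesh of n nodes this gives n * (n - 1) connections overall, so each node holds n - 1 active connections (each with its own send CQ) and n - 1 passive ones.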

Bernie King-Smith 
IBM Corporation
Server Group
Cluster System Performance 
wombat2 at us.ibm.com    (845)433-8483
Tie. 293-8483 or wombat2 on NOTES 

"We are not responsible for the world we are born into, only for the world 
we leave when we die. So we have to accept what has gone before us and 
work to change the only thing we can, -- The Future." William Shatner