[ofa-general] [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support

Shirley Ma mashirle at us.ibm.com
Wed Jan 30 10:42:20 PST 2008


The current IPoIB-UD implementation limits the IPoIB payload size to
2048 bytes by hard-coding IPOIB_PACKET_SIZE. The implementation is
designed for a kernel PAGE_SIZE of 4K or greater. If the kernel
PAGE_SIZE is 2K, buffer allocation can fail when no large enough
contiguous buffer is available. However, most distros support
PAGE_SIZE >= 4K, so this implementation has no problem with a 2048
payload. It is simple, but it prevents HCAs that support a 4096
payload, such as the IBM eHCA2, from making use of it.

This patch allows an IPoIB-UD MTU of up to 4092 (4K - IPOIB_ENCAP_LEN)
when the HCA supports a 4K MTU. In this patch, the APIs for S/G buffer
allocation in IPoIB-CM mode have been made generic so that IPoIB-UD and
IPoIB-CM can share the S/G code. When PAGE_SIZE is equal to or greater
than IPOIB_UD_BUF_SIZE plus the padding bytes needed to align the IP
header, only one buffer is needed for a 4K MTU; otherwise, two buffers
must be allocated and chained with S/G.
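
A minimal sketch of the MTU arithmetic and the one-buffer/two-buffer
rule described above. It is not taken from the patch: the helper names
are hypothetical, and the 40-byte GRH and 4-byte alignment padding are
assumed values chosen for illustration. Only the 4-byte encapsulation
length follows directly from "4092 = 4K - IPOIB_ENCAP_LEN".

```c
/* Assumed constants for illustration; the patch has the real definitions. */
#define ENCAP_LEN     4u   /* IPoIB encapsulation header: 4096 - 4 = 4092 */
#define GRH_BYTES    40u   /* assumption: GRH space in front of UD data */
#define IP_ALIGN_PAD  4u   /* assumption: padding to align the IP header */

/* IPoIB-UD MTU available for a given IB MTU. */
unsigned int ud_mtu(unsigned int ib_mtu)
{
    return ib_mtu - ENCAP_LEN;
}

/* Hypothetical stand-in for IPOIB_UD_BUF_SIZE: receive bytes per packet. */
unsigned int ud_buf_size(unsigned int ib_mtu)
{
    return ib_mtu + GRH_BYTES;
}

/*
 * Number of receive buffers per packet: one if the whole packet plus
 * padding fits in a page, otherwise two buffers chained with S/G.
 */
unsigned int rx_bufs_needed(unsigned int page_size, unsigned int ib_mtu)
{
    return page_size >= ud_buf_size(ib_mtu) + IP_ALIGN_PAD ? 1u : 2u;
}
```

Under these assumptions, a 4K IB MTU needs two buffers on a 4K-page
kernel and a single buffer on a 64K-page kernel, while a 2K IB MTU
fits in one 4K page.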
The IPoIB link MTU of a node is the minimum of the admin-configurable
MTU (set through ifconfig) and the MTU of the IPoIB default broadcast
group. When the Subnet Manager enables the default broadcast group
during start-up, the IPoIB link MTU of the subnet is the MTU of that
default broadcast group. A node whose IB MTU is smaller than this value
cannot join the IPoIB subnet. A node whose IB MTU is greater than this
value joins the IPoIB subnet, and this value is set as its IPoIB link
MTU. If the Subnet Manager disables the default broadcast group during
start-up, the first node brought up in the subnet creates the default
IPoIB broadcast group based on negotiation with the Subnet Manager;
the default is currently 2K, according to the IPoIB RFC.
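
The join and link-MTU rules above can be sketched as follows. This is
an illustration only: the function names are hypothetical, the
broadcast group MTU is assumed to be an IB MTU in bytes that is reduced
by the 4-byte encapsulation header before the comparison, and a node
whose IB MTU exactly equals the group MTU is assumed to be able to
join.

```c
#include <stdbool.h>

#define ENCAP_LEN 4u  /* IPoIB encapsulation header length in bytes */

/* A node can join only if its IB MTU is not smaller than the group MTU. */
bool can_join_subnet(unsigned int node_ib_mtu, unsigned int bcast_ib_mtu)
{
    return node_ib_mtu >= bcast_ib_mtu;
}

/*
 * Link MTU: the smaller of the admin-configured MTU and the MTU
 * derived from the default broadcast group (assumed unit conversion).
 */
unsigned int link_mtu(unsigned int admin_mtu, unsigned int bcast_ib_mtu)
{
    unsigned int mcast_mtu = bcast_ib_mtu - ENCAP_LEN;

    return admin_mtu < mcast_mtu ? admin_mtu : mcast_mtu;
}
```

For example, a 4K-capable node joining a subnet whose broadcast group
uses a 2K MTU ends up with a link MTU of 2044 even if the admin MTU is
set to 4092, while a 2K-only node cannot join a 4K broadcast group.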

The patch is split into two patches:

1. Make IPoIB-CM RX S/G APIs generic
2. Enable IPoIB-UD RX S/G

I have tried to make these two patches as independent as possible so
that they are easy to test. ipoib_cm_alloc_rx_skb() will be renamed in
the second patch. Please review these patches as soon as possible so
that we can include them in OFED-1.3-RC4.

I appreciate your timely help.

Thanks
Shirley



