[openib-general] RFC: struct netdevice changes for IPoIB UC support

Michael S. Tsirkin mst at mellanox.co.il
Wed Sep 21 01:02:30 PDT 2005


Hi!
I am working on IP over InfiniBand net device support.
The existing code in the mainline kernel supports only the UD (unreliable
datagram) mode of operation, with a max MTU of 2Kbyte.
I'm looking into supporting the UC (unreliable connected) mode of operation,
which can support an MTU with a theoretical limit of up to 2Gbyte.

As was discussed on the openib list, one of the difficulties with
IP over IB support for UC mode is that the same device has to send
both UC (max MTU 2Gbyte) and UD (max MTU 2Kbyte) packets, depending
on the packet's link address.

I propose the following simple patch to let the netdevice override
the path MTU per dst entry. The patch was tested by modifying
existing IPoIB code to use MTU of 1K for some addresses, and 2K for
others.
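
To illustrate the idea, here is a rough userspace sketch of what a driver-side
get_mtu callback might look like, modelled on the 1K/2K test described above.
The struct definitions are simplified stand-ins, not the real kernel
structures; the limit_1k flag and the sketch_* names are hypothetical, and
capping the result at the incoming path MTU is an assumption about how a
driver would want to behave, not something the patch enforces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct neighbour {
	int limit_1k;	/* assumption: set for peers that should get the 1K MTU */
};

struct net_device {
	uint32_t (*get_mtu)(struct net_device *dev,
			    struct neighbour *neigh,
			    int path_mtu);
};

/* Driver-side callback: cap the path MTU at a per-neighbour limit. */
static uint32_t sketch_get_mtu(struct net_device *dev,
			       struct neighbour *neigh,
			       int path_mtu)
{
	uint32_t limit = (neigh && neigh->limit_1k) ? 1024 : 2048;

	(void)dev;	/* unused in this sketch */
	if (path_mtu > 0 && (uint32_t)path_mtu < limit)
		return (uint32_t)path_mtu;
	return limit;
}

/* Caller side, mirroring the dispatch the patch adds to dst_mtu(). */
static uint32_t sketch_dst_mtu(struct net_device *dev,
			       struct neighbour *neigh,
			       uint32_t route_mtu)
{
	/* Devices without the hook keep the plain route metric. */
	if (dev && dev->get_mtu)
		return dev->get_mtu(dev, neigh, (int)route_mtu);
	return route_mtu;
}
```

With a route MTU of 2048, a neighbour with limit_1k set would see 1024 and
any other neighbour would see 2048; a device that leaves get_mtu NULL is
unaffected.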

Please comment on this approach: does it make sense to you guys?
Please Cc me directly, I'm not on the list.

Thanks a bunch,
MST

---

Make it possible for a network device to support more than one MTU value at a
time (depending on packet link address, or other criteria).

Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>

Index: linux-2.6.12.5/include/linux/netdevice.h
===================================================================
--- linux-2.6.12.5.orig/include/linux/netdevice.h
+++ linux-2.6.12.5/include/linux/netdevice.h
@@ -454,6 +454,10 @@ struct net_device
 #define HAVE_CHANGE_MTU
 	int			(*change_mtu)(struct net_device *dev, int new_mtu);
 
+#define HAVE_GET_MTU
+	u32			(*get_mtu)(struct net_device *dev,
+					   struct neighbour *neigh,
+					   int path_mtu);
 #define HAVE_TX_TIMEOUT
 	void			(*tx_timeout) (struct net_device *dev);
 
Index: linux-2.6.12.5/include/net/dst.h
===================================================================
--- linux-2.6.12.5.orig/include/net/dst.h
+++ linux-2.6.12.5/include/net/dst.h
@@ -111,7 +111,12 @@ dst_metric(const struct dst_entry *dst, 
 
 static inline u32 dst_mtu(const struct dst_entry *dst)
 {
-	u32 mtu = dst_metric(dst, RTAX_MTU);
+	u32 mtu;
+	if (dst->dev && dst->dev->get_mtu)
+		mtu = dst->dev->get_mtu(dst->dev, dst->neighbour,
+					dst_metric(dst, RTAX_MTU));
+	else
+		mtu = dst_metric(dst, RTAX_MTU);
 	/*
 	 * Alexey put it here, so ask him about it :)
 	 */

-- 
MST


