[ofa-general] [PATCH] IPoIB UD 4K MTU support

Shirley Ma mashirle at us.ibm.com
Thu Jan 24 05:34:15 PST 2008


On Thu, 2008-01-24 at 13:29 -0800, Roland Dreier wrote:
> OK, this is some half-baked thinking based on reading the patch.  I
> don't know the right answer here -- I am hoping to spark discussion
> that makes the correct thing clear:
> 
>  > +static inline int ipoib_ud_mtu(unsigned int ib_mtu) 
>  > +{
>  > +	return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) :
>  > +				 (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4);
>  > +}
> 
> reading this, my first reaction is that the magic 4096 constant should
> have a name.  And in fact the most obvious name for it is PAGE_SIZE.
> However, this means that (assuming everyone can handle an IB MTU of
> 4096), systems with PAGE_SIZE > 4096 would come up with a different
> IPoIB MTU than systems with PAGE_SIZE == 4096.  And I'm not sure
> whether that would cause problems or not.  (eg TCP should be OK)

We could use ib_mtu_enum_to_int(IB_MTU_4096) here. TCP would be OK since
it negotiates the MSS, but UDP would not be if one node has a PAGE_SIZE
bigger than 4096.
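
A rough userspace sketch of that idea, not the actual patch: the 4096
threshold gets a name instead of being tied to PAGE_SIZE. It assumes
IB_GRH_BYTES is 40 and IPOIB_ENCAP_LEN is 4, as in the kernel headers.
IB_MTU_4096_BYTES and ipoib_ud_mtu_sketch are made-up names, with the
constant standing in for ib_mtu_enum_to_int(IB_MTU_4096).

```c
/* Assumed values, mirroring the kernel definitions. */
enum {
	IB_GRH_BYTES      = 40,   /* Global Routing Header size */
	IPOIB_ENCAP_LEN   = 4,    /* IPoIB encapsulation header size */
	IB_MTU_4096_BYTES = 4096  /* hypothetical stand-in for
				     ib_mtu_enum_to_int(IB_MTU_4096) */
};

/*
 * Same arithmetic as the quoted helper, but the threshold is a named
 * constant that does not depend on the page size of the host.
 */
static inline int ipoib_ud_mtu_sketch(unsigned int ib_mtu)
{
	return (ib_mtu < IB_MTU_4096_BYTES) ?
		(int)(ib_mtu - IPOIB_ENCAP_LEN) :
		(int)(ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4);
}
```

Under these assumptions an IB MTU of 2048 gives an IPoIB MTU of 2044,
and an IB MTU of 4096 gives 4048 (that is, 4096 - 48), regardless of
PAGE_SIZE.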

> But then in general, if we use the approach here (which is very
> appealing because it's so simple), Linux will potentially have an MTU
> different from other OSes that might choose a different way to handle
> an IB MTU of 4096.  So does that mean that we should use a more
> complicated approach to get the max possible MTU of 4096 - 4?

Actually, I thought about this when I came up with this simple
implementation. If we use 4096-48, a patch in IPoIB to generate an ICMP
error could help with this issue by sending the 4096-48 MTU back, so the
source knows how big its packets can be. Do you think this is a good idea?
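
To make the ICMP idea concrete, here is a minimal userspace sketch of
the decision only, assuming the usual path MTU discovery error (ICMP
Destination Unreachable, type 3, code 4, carrying the next-hop MTU).
The struct and function names are hypothetical, and nothing here is
existing IPoIB code.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICMP "fragmentation needed" values used by path MTU discovery. */
#define SKETCH_ICMP_DEST_UNREACH 3
#define SKETCH_ICMP_FRAG_NEEDED  4

/* Hypothetical summary of the error that would be sent to the source. */
struct pmtu_reply_sketch {
	uint8_t  type;
	uint8_t  code;
	uint16_t next_hop_mtu;
};

/*
 * Returns true and fills *reply when a packet of pkt_len bytes does not
 * fit into link_mtu, so that the source can learn the usable MTU.
 * Returns false (and leaves *reply untouched) when the packet fits.
 */
static inline bool pmtu_check_sketch(unsigned int pkt_len,
				     unsigned int link_mtu,
				     struct pmtu_reply_sketch *reply)
{
	if (pkt_len <= link_mtu)
		return false;

	reply->type = SKETCH_ICMP_DEST_UNREACH;
	reply->code = SKETCH_ICMP_FRAG_NEEDED;
	reply->next_hop_mtu = (uint16_t)link_mtu;
	return true;
}
```

For example, a 4092-byte packet arriving at a node whose IPoIB MTU is
4048 (4096 - 48) would produce a reply advertising 4048, while a
4048-byte packet would pass without any error.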

thanks
Shirley



