[ofa-general] [PATCH] IPoIB UD 4K MTU support
Shirley Ma
mashirle at us.ibm.com
Thu Jan 24 05:34:15 PST 2008
On Thu, 2008-01-24 at 13:29 -0800, Roland Dreier wrote:
> OK, this is some half-baked thinking based on reading the patch. I
> don't know the right answer here -- I am hoping to spark discussion
> that makes the correct thing clear:
>
> > +static inline int ipoib_ud_mtu(unsigned int ib_mtu)
> > +{
> > + return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) :
> > + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4);
> > +}
>
> reading this, my first reaction is that the magic 4096 constant should
> have a name. And in fact the most obvious name for it is PAGE_SIZE.
> However, this means that (assuming everyone can handle an IB MTU of
> 4096), systems with PAGE_SIZE > 4096 would come up with a different
> IPoIB MTU than systems with PAGE_SIZE == 4096. And I'm not sure
> whether that would cause problems or not. (eg TCP should be OK)
We could use ib_mtu_enum_to_int(IB_MTU_4096) here. TCP would be OK since
it negotiates the MSS, but UDP would not be if one node has a PAGE_SIZE
bigger than 4096.
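
For illustration, a minimal stand-alone sketch of the helper with the
magic constant replaced by ib_mtu_enum_to_int(IB_MTU_4096). This is not
the patch itself: the enum and ib_mtu_enum_to_int() are redefined locally
as stand-ins for the ones in the kernel verbs header, and the values
IPOIB_ENCAP_LEN = 4 and IB_GRH_BYTES = 40 are assumed to match the kernel
definitions.

```c
/* Stand-alone sketch: local stand-ins for the kernel definitions. */
#define IPOIB_ENCAP_LEN 4   /* IPoIB encapsulation header (assumed value) */
#define IB_GRH_BYTES    40  /* IB global route header (assumed value) */

enum ib_mtu {
	IB_MTU_256  = 1,
	IB_MTU_512  = 2,
	IB_MTU_1024 = 3,
	IB_MTU_2048 = 4,
	IB_MTU_4096 = 5
};

/* Map the IB MTU enum to a byte count; -1 for an unknown value. */
static inline int ib_mtu_enum_to_int(enum ib_mtu mtu)
{
	switch (mtu) {
	case IB_MTU_256:  return  256;
	case IB_MTU_512:  return  512;
	case IB_MTU_1024: return 1024;
	case IB_MTU_2048: return 2048;
	case IB_MTU_4096: return 4096;
	default:          return -1;
	}
}

/*
 * Same arithmetic as the quoted helper, but the 4K threshold is named
 * via the IB MTU enum instead of a bare 4096 or PAGE_SIZE, so the result
 * does not depend on the page size of the node.
 */
static inline int ipoib_ud_mtu(unsigned int ib_mtu)
{
	const unsigned int mtu_4k = ib_mtu_enum_to_int(IB_MTU_4096);

	return (ib_mtu < mtu_4k) ? (ib_mtu - IPOIB_ENCAP_LEN) :
		(ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4);
}
```

With these assumed values, an IB MTU of 2048 gives 2048 - 4 = 2044, and
an IB MTU of 4096 gives 4096 - 40 - 4 - 4 = 4048, i.e. the 4096-48
figure.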
> But then in general, if we use the approach here (which is very
> appealing because it's so simple), Linux will potentially have an MTU
> different from other OSes that might choose a different way to handle
> an IB MTU of 4096. So does that mean that we should use a more
> complicated approach to get the max possible MTU of 4096 - 4?
Actually, I thought about this when I came up with this simple implementation.
If we use 4096-48, a patch in IPoIB to generate an ICMP error could help
with this issue by sending the 4096-48 MTU back, so the source knows how big
its packets can be. Do you think this is a good idea?
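
A rough sketch of the decision I have in mind, written as plain C rather
than driver code. The names struct pmtu_verdict and ud_check_len() are
hypothetical and exist only for this illustration; the type and code
values are the standard ICMP "destination unreachable / fragmentation
needed" numbers. In the driver the reply would presumably go out through
the kernel's ICMP send path instead of being returned in a struct.

```c
#include <stdbool.h>
#include <stddef.h>

#define ICMP_DEST_UNREACH 3  /* ICMP type: destination unreachable */
#define ICMP_FRAG_NEEDED  4  /* ICMP code: fragmentation needed, DF set */

/* Hypothetical result type: what, if anything, to report to the source. */
struct pmtu_verdict {
	bool send_icmp;            /* true if an ICMP error should be sent */
	unsigned int type;         /* ICMP type, valid when send_icmp is true */
	unsigned int code;         /* ICMP code, valid when send_icmp is true */
	unsigned int next_hop_mtu; /* MTU to advertise back to the source */
};

/*
 * Hypothetical check: if an IP packet is larger than the UD MTU and may
 * not be fragmented, report the UD MTU (e.g. 4096-48) so the source can
 * lower its packet size. Otherwise no error is generated.
 */
static struct pmtu_verdict ud_check_len(size_t ip_len, bool df_set,
					unsigned int ud_mtu)
{
	struct pmtu_verdict v = { false, 0, 0, 0 };

	if (ip_len > ud_mtu && df_set) {
		v.send_icmp = true;
		v.type = ICMP_DEST_UNREACH;
		v.code = ICMP_FRAG_NEEDED;
		v.next_hop_mtu = ud_mtu;
	}
	return v;
}
```

So a peer that sends 4096-4 byte packets with DF set would be told to
drop to 4096-48, while packets that already fit are untouched.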
thanks
Shirley