[ofa-general] [PATCH] IPoIB UD 4K MTU support

Eli Cohen eli at dev.mellanox.co.il
Thu Jan 31 02:11:43 PST 2008


On Sat, 2008-01-26 at 01:44 -0800, Shirley Ma wrote:
> > However I came up with a tricky approach that might work well.  We
> > would use two-element scatter lists for the receives, and post a
> > 40-byte dummy buffer first and then a 4096 byte buffer for the actual
> > packet.  Since the only thing we do with the first 40 bytes is throw
> > them away, we wouldn't even have to make the 40 bytes part of the skb;
> > in fact we could have one buffer that every receive uses and never
> > even touch the first entry of the scatter list after initialization.
> > It would even save the skb_pull(skb, IB_GRH_BYTES); we currently do
> > after receiving messages.
> > 
> > What do you think?
> 
> I thought of the same thing before for a single buffer allocation, but
> I had some concern about whether IB_GRH could be needed later. I have
> already done a scatter-gather list patch. It decides based on PAGE_SIZE
> whether to use one buffer or two buffers, similar to the IPoIB-CM S/G
> code. It's under testing. The only thing I haven't finished is making
> the S/G code more generic and merging the IPoIB-CM S/G and IPoIB-UD S/G
> buffer allocation together. IBM eHCA does support 4K MTU and we would
> like our customers to use this feature in the OFED-1.3 release. If I
> merge the IPoIB-CM S/G code and IPoIB-UD S/G code, testing would take
> much longer. I wonder whether it's OK to push IPoIB-UD S/G first and
> then merge IPoIB-UD and IPoIB-CM later.
> 

I don't think it's a good idea to make the code more generic and use the
same rx buffer scheme for both UD and CM. In the case of UD we would
need only 2 scatter entries: one for the dummy GRH bytes, which can be
initialized once to point at the same shared buffer, and a second to
point to the real data buffer. I prefer to modify the UD code itself to
work with 4K MTU.

The reasoning is that a CM scatter list can reach an overall size of 17
entries, which would require more memory and consume more CPU cycles to
handle (e.g. each time a packet is received). This can be crucial in
cases where performance with small UDP packets matters.
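
To make the two-entry layout concrete, here is a rough userspace sketch
(not the actual ipoib code). The names rx_sge, rx_desc, rx_desc_init,
rx_desc_post and rx_desc_release are made up for illustration, and
malloc() stands in for skb allocation and DMA mapping. It only shows
that entry 0 is set up once against a shared 40-byte dummy buffer and
that the per-receive path touches entry 1 alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GRH_BYTES   40    /* size of the GRH that precedes UD payloads */
#define UD_BUF_SIZE 4096  /* data buffer size discussed in this thread */
#define UD_RX_SGE   2     /* entry 0: dummy GRH, entry 1: packet data */

/* Hypothetical scatter entry, loosely modeled on a verbs SGE. */
struct rx_sge {
    void    *addr;
    uint32_t length;
};

/* Hypothetical per-receive descriptor. */
struct rx_desc {
    struct rx_sge sge[UD_RX_SGE];
    void *data;  /* owning pointer for the payload buffer */
};

/* One dummy buffer shared by every receive; its contents are never read. */
static unsigned char grh_dummy[GRH_BYTES];

/* One-time setup: point entry 0 at the shared dummy GRH buffer. */
static void rx_desc_init(struct rx_desc *d)
{
    d->sge[0].addr   = grh_dummy;
    d->sge[0].length = GRH_BYTES;
    d->sge[1].addr   = NULL;
    d->sge[1].length = 0;
    d->data          = NULL;
}

/*
 * Per-receive work: allocate a fresh payload buffer and fill entry 1.
 * Entry 0 is left untouched. Returns 0 on success, -1 if allocation fails.
 */
static int rx_desc_post(struct rx_desc *d)
{
    void *buf;

    assert(d->sge[0].addr == grh_dummy);  /* rx_desc_init() must have run */

    buf = malloc(UD_BUF_SIZE);
    if (!buf)
        return -1;

    d->data          = buf;
    d->sge[1].addr   = buf;
    d->sge[1].length = UD_BUF_SIZE;
    return 0;
}

/* Release the payload buffer; entry 0 stays valid for the next post. */
static void rx_desc_release(struct rx_desc *d)
{
    free(d->data);
    d->data          = NULL;
    d->sge[1].addr   = NULL;
    d->sge[1].length = 0;
}
```

With this layout the payload starts at offset 0 of the data buffer, so
the skb_pull(skb, IB_GRH_BYTES) mentioned above would no longer be
needed, and the per-packet cost stays at two entries rather than the
longer list used for CM.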
