[ofa-general] Re: [RFC][1/2] IPoIB UD 4K MTU support

Shirley Ma mashirle at us.ibm.com
Tue Apr 1 03:52:04 PDT 2008


Hello Or,

Thanks for your view.

> Reading ipoib_ud_skb_put_frags below and its usage in the patch that follows, its unclear to me if IPOIB_UD_MAX_PAYLOAD is being made of (4K - IPOIB_ENCAP_LEN) + IPOIB_ENCAP_LEN or from adjustment to some IP header alignment constraint. Specifically, the design I'd like to see here is that the IPoIB header telling the type of the frame (ARP, IPv4, IPv6, etc) is provided up to the stack as part of the packet in the skb (eg its very useful with tcpdump/etc filters). 

The max payload is the max IB mtu here. It's 4K. IPoIB mtu = IB-mtu -
IPoIB header = 4K - 4.

> Reading earlier threads I see that Roland suggested to allow for upto 4K-4 mtu towards the stack and use some internal buffer for the GRH where this buffer can be allocated and dma mapped once and being forget from till the driver cleans up, etc. Was there any problem with this approach?

The implementation here is using one buffer for PAGE_SIZE greater than
4K. Using two buffers for PAGE_SIZE = 4K. One buffer is 4K-4 IPoIB MTU
which contains data. One buffer is 44 bytes header (GRH header + IPoIB
header). I uses a generic routine for IPoIB receiving path regardless of
MTU size, it significantly reduces the size of the patch.

We can't just dam map once for this combined header(GRH header + IPoIB
header) buffer, GRH header is obsoleted, but not IPoIB header, right?

What Roland suggested before was to have GRH in one buffer, IPoIB header
and data in the second buffer. If we do so, the total size of the second
buffer is 4K, plus the IP header alignment (12 bytes), it will exceed
one page size, which is the problem we are trying to solve here.

> > +static inline void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
> > +					  struct sk_buff *skb,
> > +					  unsigned int length)
> > +{
> > +	if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
> > +		skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
> > +		/*
> > +		 * There is only two buffers needed for max_payload = 4K,
> > +		 * first buf size is IPOIB_UD_HEAD_SIZE
> > +		 */
> > +		skb->tail += IPOIB_UD_HEAD_SIZE;
> > +		frag->size = length - IPOIB_UD_HEAD_SIZE;
> > +		skb->data_len += frag->size;
> > +		skb->truesize += frag->size;
> > +		skb->len += length;
> > +	} else
> > +		skb_put(skb, length);
> > +
> > +}
> >   
> I fail to follow what this code really wants to do and how it does it. 
> Is there a must to touch "by hand" all the internal skb fields?

Since there is only two S/G, this way uses less instructions for
adjusting length of skb with fragments to match received data. You can
refer other device driver like skb_put_frags() is used in ipoib-cm.

>  also 
> this function is called once by ipoib_ib_handle_rx_wc in the patch that 
> follows, any reason not to make it static over there?

It's not necessary to be there, I can move this function to ipoib_ib.c
instead.

Thanks
Shirley




More information about the general mailing list