[ewg] Re: IPoIB panics on ipoib_cm_handle_rx_wc()

Shirley Ma mashirle at us.ibm.com
Fri Apr 25 00:59:41 PDT 2008


Hello Or,

We have seen skb_under_panic() in our test enviornment as well. It's
easy reproduce this with tcpdump on and off.

thanks
Shirley

On Thu, 2008-04-24 at 14:11 +0300, Or Gerlitz wrote:
> > https://bugs.openfabrics.org/show_bug.cgi?id=989
> > ------- Comment #13 from Kris.Corwin at hp.com  2008-04-23 13:23 -------
> > I think I found the problem and have a fix.
> >
> > In ipoib_cm_handle_rx_wc(), if the byte_len is < SKB_TSHOLD, a new sk_buff
> > is allocated.  The sk_buff's mac.raw is not initialized.  It sends the packet
> > up the stack to netif_receive_skb().  In netif_receive_skb(), skb->mac_len
> > is computed by subtracting mac.raw from nh.raw.  Since mac.raw was not
> > initialized, we get a very large number.  It eventually leads to a panic in
> > skb_under_panic.
> >
> > The diff of the fix:
> > drivers/infiniband/ulp/ipoib/ipoib_cm.c.pre_kris
> > drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > *** drivers/infiniband/ulp/ipoib/ipoib_cm.c.pre_kris    2008-04-23 16:05:26.000000000 -0400
> > --- drivers/infiniband/ulp/ipoib/ipoib_cm.c     2008-04-23 15:16:23.000000000-0400
> > ***************
> > *** 622,627 ****
> > --- 622,628 ----
> >                         skb_copy_from_linear_data_offset(skb, IPOIB_ENCAP_LEN,
> >                                                          small_skb->data, dlen);
> >                         skb_put(small_skb, dlen);
> > +                       skb_reset_mac_header(small_skb);
> >                         skb = small_skb;
> >                         goto copied;
> >                 }
> Hi Kris,
> 
> Good catch. This code does not exist in the mainline kernel and was 
> added through the ofed 1.3 (non) process, see the patch below. Does this 
> bug hits you for --every-- small packet received with connected mode? if 
> not, can you explain why?
> 
> The patch for itself as provided by the ofed sources (kernel_patches/fixes/ipoib_0320_small_skb_copy.patch) is event not documented, I took the change log from the git used to store it. This not reviewed and not documented patch which has a bug who could have been found if reviewed is yet another good example why code should not be added through ofed but rather through the mainline cycle, etc, oh well.
> 
> Or.
> > commit 92557c139fd8329daf1a1bf8beeaa6ae940b055a
> > Author: Eli Cohen <eli at mtls03.(none)>
> > Date:   Mon Feb 18 21:56:03 2008 +0200
> >
> >     IB/ipoib: copy small SKBs in CM mode
> >
> >     CM mode handling of received packets involves iterating trough the fragments.
> >     This is time consuming and in case of small packets it is better to allocate
> >     a new small skb and copy the data and pass this smaller SKB up to the IP stack.
> >
> >     Signed-off-by: Eli Cohen <eli at mellanox.co.il>
> >
> > Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
> > ===================================================================
> > --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2008-02-18 19:23:23.000000000 +0200
> > +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h	2008-02-18 22:20:48.000000000 +0200
> > @@ -99,6 +99,8 @@ enum {
> >  	MAX_SEND_CQE              = 16,
> >  	UD_POST_RCV_COUNT         = 16,
> >  	CM_POST_SRQ_COUNT         = 16,
> > +
> > +	SKB_TSHOLD		  = 256,
> >  };
> >
> >  #define	IPOIB_OP_RECV   (1ul << 31)
> > Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > ===================================================================
> > --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-18 19:23:23.000000000 +0200
> > +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-18 22:21:01.000000000 +0200
> > @@ -554,6 +554,7 @@ void ipoib_cm_handle_rx_wc(struct net_de
> >  	u64 mapping[IPOIB_CM_RX_SG];
> >  	int frags;
> >  	int has_srq;
> > +	struct sk_buff *small_skb;
> >
> >  	ipoib_dbg_data(priv, "cm recv completion: id %d, status: %d\n",
> >  		       wr_id, wc->status);
> > @@ -608,6 +609,20 @@ void ipoib_cm_handle_rx_wc(struct net_de
> >  		}
> >  	}
> >
> > +	if (wc->byte_len < SKB_TSHOLD) {
> > +		int dlen = wc->byte_len - IPOIB_ENCAP_LEN;
> > +
> > +		small_skb = dev_alloc_skb(dlen);
> > +		if (small_skb) {
> > +			small_skb->protocol = ((struct ipoib_header *)skb->data)->proto;
> > +			skb_copy_from_linear_data_offset(skb, IPOIB_ENCAP_LEN,
> > +							 small_skb->data, dlen);
> > +			skb_put(small_skb, dlen);
> > +			skb = small_skb;
> > +			goto copied;
> > +		}
> > +	}
> > +
> >  	frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len,
> >  					      (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE;
> >
> > @@ -634,6 +649,7 @@ void ipoib_cm_handle_rx_wc(struct net_de
> >  	skb_reset_mac_header(skb);
> >  	skb_pull(skb, IPOIB_ENCAP_LEN);
> >
> > +copied:
> >  	dev->last_rx = jiffies;
> >  	++dev->stats.rx_packets;
> >  	dev->stats.rx_bytes += skb->len;
> >   
> 
> 
> 
> 
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg




More information about the ewg mailing list