[ofa-general] [PATCH] ipoib: null tx/rx_ring skb pointers on free

Pradeep Satyanarayana pradeeps at linux.vnet.ibm.com
Fri Nov 7 08:47:03 PST 2008


akepner at sgi.com wrote:
> On Thu, Nov 06, 2008 at 05:12:50PM +0200, Jack Morgenstein wrote:
>> On Thursday 06 November 2008 03:23, akepner at sgi.com wrote:
>>> I described an IPoIB-related panic we were seeing on large 
>>> clusters. The signature was a backtrace like this:
>>>
>>>         skb_over_panic
>>>         :ib_ipoib:ipoib_ib_handle_rx_wc
>>>         :ib_ipoib:ipoib_poll
>>>         net_rx_action
>>>         .....
>>>
>>> The bug is difficult to reproduce, but we finally got a crashdump, 
>>> and the problem appears to be that stale skb pointers on the tx_ring 
>>> were left pointing to skbs that had been since reused, so that the 
>>> skb's data region was now unexpectedly short, etc. 
>>>
>> How does ipoib_ib_handle_rx_wc() involve the tx_ring? This is 
>> receive processing.
>>
> 
> What I surmise may be happening is something like this:
> 
> - tx skb is freed, but a stale pointer remains on tx_ring
> - the same skb is reallocated, and added to the rx_ring
> - now we get an 'unexpected' tx completion, and use the stale 
>   skb pointer on the tx_ring to again free the skb (this step 
>   seems to invoke a f/w bug)
> - another driver, say an ethernet driver, reallocates the skb, 
>   reducing the extent of the data region (leading to the 
>   skb_over_panic once it's processed by ipoib_ib_handle_rx_wc)
> 
> 
> This bug leaves the tx and rx rings corrupted in many ways, 
> including:
> 
> - different rx_ring members refer to the same skb
> - different skbs on the rx_ring have identical data, head, end, tail ptrs
> - skbs on the rx_ring have sizes inconsistent with what the ipoib 
>   driver allocates (which causes the skb_over_panic, of course)
> - rx skbs have 'dev' pointers to ethernet devices 
> - dma mappings in rx_ring aren't consistent with what's in skb
> - some skbs are simultaneously on the tx and rx rings
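
To make the sequence above concrete, here is a minimal userspace
sketch of the idea in the subject line: clear the ring slot when the
buffer is freed, so that an unexpected completion finds an empty slot
instead of a stale pointer. This is not the ipoib code. fake_skb,
fake_ring, ring_post, ring_complete and RING_SIZE are all hypothetical
names made up for illustration, and free() stands in for whatever the
driver really uses to release an skb.

```c
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4   /* hypothetical size; must be a power of two */

/* Stand-in for struct sk_buff; only the ring bookkeeping matters here. */
struct fake_skb {
    int len;
};

/* Stand-in for a tx_ring or rx_ring: one buffer pointer per slot. */
struct fake_ring {
    struct fake_skb *slot[RING_SIZE];
};

/* Post a buffer in the slot selected by the work-request id. */
static void ring_post(struct fake_ring *r, unsigned int wr_id,
                      struct fake_skb *skb)
{
    r->slot[wr_id & (RING_SIZE - 1)] = skb;
}

/*
 * Handle a completion for wr_id. Returns 1 if a buffer was released,
 * or 0 if the slot was already empty (an unexpected completion), in
 * which case nothing is freed.
 */
static int ring_complete(struct fake_ring *r, unsigned int wr_id)
{
    struct fake_skb **slot = &r->slot[wr_id & (RING_SIZE - 1)];

    if (*slot == NULL)
        return 0;       /* unexpected completion: no buffer to free */

    free(*slot);
    *slot = NULL;       /* leave no stale pointer behind in the ring */
    return 1;
}
```

Without the "*slot = NULL" line, a second completion for the same
wr_id would free the buffer again, possibly after it had been handed
to another user, which is the kind of double use described in the
list above.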

If I am not mistaken, we saw a problem with similar characteristics 
more than two years ago on IBM platforms: the rx_ring ended up reusing 
tx_ring skbs, and so on, and it showed up only under stress. That was 
in UD mode (before CM came into the picture), and it turned out to be 
a driver issue. Could it be the same problem here?

Pradeep




