[ofa-general] IPoIB-UD post_send failures (OFED 1.3)

akepner at sgi.com
Fri May 16 07:39:30 PDT 2008


On Fri, May 16, 2008 at 04:06:56PM +0300, Eli Cohen wrote:
> On Thu, 2008-05-15 at 15:34 -0700, akepner at sgi.com wrote:
> 
> > ib0: tx_outstanding 0x82 (ipoib_sendq_size 0x80)
> > ib0: tx_outstanding 0x83 (ipoib_sendq_size 0x80)
> > ib0: tx_outstanding 0x83 (ipoib_sendq_size 0x80)
> > ib0: tx_outstanding 0x84 (ipoib_sendq_size 0x80)
> > ....
> 
> This should not happen. Can you send the source files for ipoib which
> you're using (with the debug patches)?

Sure. I'll send them privately rather than spam the mailing 
list with them. 

But I'll restate what I said earlier in this thread: I don't 
think the root cause here is IPoIB. I think IPoIB is a victim 
when the card stops generating completions. We've seen what 
looks to be the *same* bug (the send queue fills up and never 
drains) on both OFED 1.2 and OFED 1.3. The drivers in these 
two releases (as I know you're well aware) are very different. 
The common element is the MT25204.
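
To illustrate why stalled completions would wedge the send queue, 
here is a minimal user-space sketch. It is a simplified model, not 
the driver source: the struct and function names are hypothetical, 
and both the stop condition (outstanding sends reach the queue size) 
and the wake threshold (half the queue size) are assumptions made 
for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define SENDQ_SIZE 0x80u  /* same value as ipoib_sendq_size in the log */

/* Hypothetical model of the send-side accounting. */
struct model_tx {
    unsigned tx_outstanding;  /* sends posted but not yet completed */
    bool queue_stopped;       /* stands in for the netif queue state */
};

/* Post one send. Returns false if the queue is stopped. */
static bool model_post_send(struct model_tx *q)
{
    if (q->queue_stopped)
        return false;
    q->tx_outstanding++;
    if (q->tx_outstanding == SENDQ_SIZE)
        q->queue_stopped = true;  /* where netif_stop_queue() would go */
    return true;
}

/* Handle one send completion; only this path can restart the queue. */
static void model_send_completion(struct model_tx *q)
{
    assert(q->tx_outstanding > 0);
    q->tx_outstanding--;
    /* Assumed wake threshold: half the send queue size. */
    if (q->queue_stopped && q->tx_outstanding <= SENDQ_SIZE / 2)
        q->queue_stopped = false;
}

/* Attempt `attempts` sends with no completions arriving at all.
 * Returns how many sends were accepted. */
static unsigned model_stall(struct model_tx *q, unsigned attempts)
{
    unsigned accepted = 0;
    for (unsigned i = 0; i < attempts; i++)
        if (model_post_send(q))
            accepted++;
    return accepted;
}
```

In this model, if completions never arrive, the queue stops once 
tx_outstanding reaches SENDQ_SIZE and nothing ever restarts it. The 
counter also cannot go past SENDQ_SIZE here, which is why values 
like 0x82 in the log are unexpected.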

> 
> > 
> > (We never call netif_stop_queue().)
> > 
> You mean you don't see it get called; you did not change the code so it
> won't be called, correct?

Right, I didn't change the code to keep netif_stop_queue() 
from being called. We just never see it get called. 

-- 
Arthur



