[ofa-general] Re: [PATCH v2] IB/ipoib: Split CQs for IPOIB UD
    Eli Cohen
    eli at dev.mellanox.co.il
    Wed Apr 30 09:06:26 PDT 2008

On Tue, 2008-04-29 at 14:49 -0700, Roland Dreier wrote:
> By the way, this isn't just theoretical -- I'm not smart enough to
> realize this except that I just saw:
> 
>     ib1: TX ring full, stopping kernel net queue
>     NETDEV WATCHDOG: ib1: transmit timed out
>     ib1: transmit timeout: latency 1240 msecs
>     ib1: queue stopped 1, tx_head 5291313, tx_tail 5291255
> 
> and of course it never recovers.

I have started working on a fix for this: arm the send CQ when the QP
reaches 63 outstanding requests, and drain the CQ in the completion
handler while holding priv->tx_lock.

But I hit another strange problem that I don't understand. If I just
load and unload ib_ipoib, the system crashes with messages that look
like memory corruption. If I comment out the destruction of the send CQ
in ipoib_transport_dev_cleanup(), the crashes disappear. Do you see
this as well?
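
To make the idea concrete, here is a rough user-space model of the
arm-and-drain scheme described above. It is only a sketch, not driver
code: the ring size of 64, the wake-at-half-full rule, and every name
other than tx_head, tx_tail and tx_lock are assumptions made for
illustration, and the lock is a plain flag rather than a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the TX path. Only tx_head, tx_tail and tx_lock
 * are names taken from the thread; everything else is illustrative. */
#define TX_RING_SIZE   64                   /* assumed send queue depth */
#define ARM_THRESHOLD  (TX_RING_SIZE - 1)   /* 63 outstanding requests */

struct tx_model {
	unsigned int tx_head;     /* count of sends posted */
	unsigned int tx_tail;     /* count of sends reaped */
	unsigned int cq_pending;  /* completions in the send CQ, not yet polled */
	bool cq_armed;            /* next completion raises an event */
	bool queue_stopped;       /* stand-in for the stopped net queue */
	bool tx_lock;             /* stand-in for priv->tx_lock */
};

unsigned int tx_outstanding(const struct tx_model *m)
{
	return m->tx_head - m->tx_tail;   /* unsigned wrap-around is fine */
}

/* Reap every pending completion; the caller must hold tx_lock. */
void tx_drain_cq(struct tx_model *m)
{
	assert(m->tx_lock);
	m->tx_tail += m->cq_pending;
	m->cq_pending = 0;
	/* Assumed wake rule: restart once the ring is at most half full. */
	if (m->queue_stopped && tx_outstanding(m) <= TX_RING_SIZE / 2)
		m->queue_stopped = false;
}

/* Completion handler: runs only when an armed CQ sees a completion. */
void tx_comp_handler(struct tx_model *m)
{
	assert(!m->tx_lock);
	m->tx_lock = true;
	tx_drain_cq(m);
	m->tx_lock = false;
}

/* Post one send; returns false if the net queue is stopped. */
bool tx_post_send(struct tx_model *m)
{
	bool ok = false;

	assert(!m->tx_lock);
	m->tx_lock = true;
	if (!m->queue_stopped) {
		m->tx_head++;
		ok = true;
		if (tx_outstanding(m) >= ARM_THRESHOLD && !m->cq_armed) {
			m->cq_armed = true;
			tx_drain_cq(m);   /* catch completions that beat the arm */
		}
		if (tx_outstanding(m) == TX_RING_SIZE)
			m->queue_stopped = true;
	}
	m->tx_lock = false;
	return ok;
}

/* Model the HCA finishing up to n of the sends still in flight. */
void tx_hw_complete(struct tx_model *m, unsigned int n)
{
	unsigned int in_flight = tx_outstanding(m) - m->cq_pending;

	m->cq_pending += n < in_flight ? n : in_flight;
	if (m->cq_armed && m->cq_pending > 0) {
		m->cq_armed = false;      /* arming is one-shot */
		tx_comp_handler(m);
	}
}
```

In this model, filling the ring with tx_post_send() stops the queue, and
a later tx_hw_complete() fires the handler because the CQ was armed at
63 outstanding, so the queue restarts. If the CQ is never armed, the
completions just sit in cq_pending and tx_tail does not move, which is
the stuck state shown in the log above.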
    
    