[ofa-general] Re: [PATCH v2] IB/ipoib: Split CQs for IPOIB UD

Eli Cohen eli at dev.mellanox.co.il
Wed Apr 30 09:06:26 PDT 2008


On Tue, 2008-04-29 at 14:49 -0700, Roland Dreier wrote:
> By the way, this isn't just theoretical -- I'm not smart enough to
> realize this except that I just saw:
> 
>     ib1: TX ring full, stopping kernel net queue
>     NETDEV WATCHDOG: ib1: transmit timed out
>     ib1: transmit timeout: latency 1240 msecs
>     ib1: queue stopped 1, tx_head 5291313, tx_tail 5291255
> 
> and of course it never recovers.

I have started working on a fix for this: arm the send CQ when the QP
reaches 63 outstanding requests, and drain the CQ in the completion
handler while holding priv->tx_lock.
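
A rough user-space sketch of that scheme, to make the intended flow
concrete. It is only a model under stated assumptions: the ring depth of
64 is assumed, and struct tx_sim and every tx_* function below are
hypothetical stand-ins, not the actual ipoib code or the verbs API. The
boolean "lock" only checks that draining happens with the lock held.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 64u  /* assumed send queue depth */
#define TX_ARM_AT    63u  /* outstanding sends at which the send CQ is armed */

/* Hypothetical model of the TX state; not the real ipoib private struct. */
struct tx_sim {
    bool tx_lock_held;        /* stand-in for priv->tx_lock */
    unsigned int tx_head;     /* number of sends posted so far */
    unsigned int tx_tail;     /* number of sends reaped so far */
    unsigned int cqe_pending; /* completions waiting in the send CQ */
    bool cq_armed;            /* one-shot completion notification requested */
    bool queue_stopped;       /* net queue stopped because the ring is full */
};

static void tx_lock(struct tx_sim *s)
{
    assert(!s->tx_lock_held);
    s->tx_lock_held = true;
}

static void tx_unlock(struct tx_sim *s)
{
    assert(s->tx_lock_held);
    s->tx_lock_held = false;
}

/* Reap every pending completion; the caller must hold tx_lock. */
static void tx_drain_cq(struct tx_sim *s)
{
    assert(s->tx_lock_held);
    s->tx_tail += s->cqe_pending;
    s->cqe_pending = 0;
    if (s->queue_stopped && s->tx_head - s->tx_tail < TX_RING_SIZE)
        s->queue_stopped = false; /* analogous to waking the net queue */
}

/* Completion handler: drains the CQ under tx_lock. */
static void tx_comp_handler(struct tx_sim *s)
{
    tx_lock(s);
    tx_drain_cq(s);
    tx_unlock(s);
}

/* Post one send; returns false if the queue is currently stopped. */
static bool tx_post_send(struct tx_sim *s)
{
    bool ok = false;

    tx_lock(s);
    if (!s->queue_stopped) {
        unsigned int outstanding;

        s->tx_head++;
        ok = true;
        outstanding = s->tx_head - s->tx_tail;
        if (outstanding >= TX_ARM_AT)
            s->cq_armed = true;       /* request a completion event */
        if (outstanding == TX_RING_SIZE)
            s->queue_stopped = true;  /* ring full, stop the queue */
    }
    tx_unlock(s);
    return ok;
}

/* Simulated hardware: complete n sends and fire the handler if armed. */
static void tx_hw_complete(struct tx_sim *s, unsigned int n)
{
    assert(s->cqe_pending + n <= s->tx_head - s->tx_tail);
    s->cqe_pending += n;
    if (s->cq_armed) {
        s->cq_armed = false;          /* arming is one-shot */
        tx_comp_handler(s);
    }
}
```

In this model, completions that arrive before the CQ is armed raise no
event, so a real implementation would presumably also need to poll once
after arming; the sketch does not cover that case.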

But I have hit another strange problem that I don't understand. If I just
load and unload ib_ipoib, the system crashes with messages that look
like memory corruption. If I comment out the destruction of the send CQ
in ipoib_transport_dev_cleanup(), the crashes disappear. Do you see
this as well?



