[openib-general] Re: ib0: ipoib_ib_post_receive failed for buf 111 ib0: failed to allocate receive buffer
Helen Chen
hycsw at ca.sandia.gov
Thu Oct 13 16:38:12 PDT 2005
Roland,
>From rolandd at cisco.com Thu Oct 13 16:19:30 2005
>
> Helen> BTW, the state of the IPoIB network seemed fine after the
> Helen> failed test, and the mthca counters are moving up nicely.
>
>Even on the server on3-ib?
Yes, even on the server on3-ib.
>
> Helen> Do you still think this is a crash of the HCA firmware?
> Helen> Should I call Mellanox?
>
>Not if IPoIB is working on the systems printing the TX time out
>messages. However, if everything stops working on one of your
>systems, then yes, an HCA crash is likely.
>
>I'm still a bit unclear on what is happening. Do you see TX time
>out messages on a particular server, but IPoIB and mthca counters
>still work fine on that same server? Or is it just the rest of the
>fabric that continues working?
>
Not in real time. My observations were made after the fact. I suppose
I can launch another test and watch the counters in real time, if you
believe that is necessary.
>Thanks,
> Roland
Thank you so much for the speedy fix. I will apply the patch and
stress test it as soon as possible.
Helen :-)