[ofa-general] Re: [RFC][PATCH] last wqe event handler patch
Shirley Ma
xma at us.ibm.com
Wed Jun 25 15:09:12 PDT 2008
Roland Dreier <rdreier at cisco.com> wrote on 06/25/2008 02:43:19 PM:
> Interesting... I wonder if it really is taking that long for everything
> to finish draining, or if the system is too busy so it sees a spurious
> timeout? The intention of all of this is that it should "never happen"
> unless the hardware really is stuck.
I guess the reason might be that we have a large cluster where each node has
4 ports, so there are too many RC QPs in this setup. We saw QPs go dead, and
the 5-second drain didn't work.
> What exactly is causing the crash here?
You can ignore this for now; it's related to another patch, not the current
code level. I will explain it in the drain WR post_send failure patch.
Please review the stale connection resource cleanup patch to see whether it
makes sense.
Thanks
Shirley