[ofa-general] Re: [RFC][PATCH] last wqe event handler patch
Roland Dreier
rdreier at cisco.com
Wed Jun 25 13:07:04 PDT 2008
> > Can you explain this change a little more? It seems quite likely that
> > we would get last WQE reached events for other states, such as
> > IPOIB_CM_RX_ERROR coming from ipoib_cm_dev_stop(), and I don't see how
> > things work if we make this change.
> >
> > - R.
>
> Hello Roland,
>
> If it's already in ERROR status, it will be processed through
> rx_error_list. In the case of ipoib_cm_dev_stop(), it will wait for 5 * HZ
> to be drained and then put into reap_list. In the case of IPoIB running
> status, I put a 60 * HZ timer for drain in the stale connection release
> patch.
But the 5 second timeout in ipoib_cm_dev_stop() is supposed to be an
exception when something gets wedged, just to avoid waiting forever. We
want to handle the last WQE reached events normally in most cases.
Would a better fix be to add locking around the "assume HW is wedged"
code in ipoib_cm_dev_stop(), to avoid problems if the 5 second timeout
turns out to be too short?
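
To make the locking idea concrete, here is a rough userspace sketch. It
is not the ipoib code: struct rx_state, on_last_wqe() and stop_rx() are
hypothetical names, plain counters stand in for the RX lists, a pthread
mutex stands in for whatever lock the driver would use, and a poll count
stands in for the 5 * HZ timeout. The only point is that the give-up path
moves everything to the reap set while holding the same lock the event
path takes, so a late last WQE reached event cannot race with it.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-in for the CM RX bookkeeping (counters, not lists). */
struct rx_state {
    pthread_mutex_t lock;
    int draining;   /* connections still waiting for a last WQE reached event */
    int reap;       /* connections that are safe to free */
};

/* Event path: one last WQE reached event moves one connection to reap. */
static int on_last_wqe(struct rx_state *s)
{
    int moved = 0;

    pthread_mutex_lock(&s->lock);
    if (s->draining > 0) {
        s->draining--;
        s->reap++;
        moved = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return moved;
}

/*
 * Stop path: poll up to max_polls times for the drain to finish.
 * Returns 1 if it gave up and assumed the HW is wedged, 0 otherwise.
 */
static int stop_rx(struct rx_state *s, int max_polls)
{
    int timed_out = 0;

    assert(max_polls >= 0);

    pthread_mutex_lock(&s->lock);
    for (int polls = 0; s->draining > 0; polls++) {
        if (polls >= max_polls) {
            /* Lock is still held here, so the event path cannot
             * touch the counters while we move everything over. */
            s->reap += s->draining;
            s->draining = 0;
            timed_out = 1;
            break;
        }
        /* Drop the lock while waiting so events can make progress. */
        pthread_mutex_unlock(&s->lock);
        sched_yield();  /* stand-in for sleeping and draining the CQ */
        pthread_mutex_lock(&s->lock);
    }
    pthread_mutex_unlock(&s->lock);
    return timed_out;
}
```

In the normal case the events arrive, draining reaches zero, and
stop_rx() returns 0 without ever taking the timeout branch; the timeout
stays the exception rather than the usual way connections get reaped.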
- R.