[ofa-general] RE: OFED HA related question
Tang, Changqing
changquing.tang at hp.com
Wed May 16 14:43:30 PDT 2007
>
> Changqing> Suppose I get IBV_EVENT_DEVICE_FATAL async event from
> Changqing> the first HCA on my node, can I continue to call
> Changqing> ibv_poll_cq() to get back all the work-requests I
> Changqing> posted before? Or do I need to keep track of these
> Changqing> work-requests? I am afraid ibv_poll_cq() will return
> Changqing> error by itself. Also can I call ibv_dereg_mr() to free
> Changqing> the memory I registered to this HCA ?
>
> Once you get a catastrophic error, all bets are off. Work
> request processing is in an undetermined state, since
> basically the HCA crashed in an unknown way. Polling CQs is
> probably not a good idea.
> I guess you do need to deregister memory regions to unpin the
> memory as part of your cleanup....
Thanks. However, when a catastrophic error occurs and there are still some
entries in the CQ, can I continue to peek at them using ibv_poll_cq()?
Also, does ibv_dereg_mr() still work after a fatal error occurs?
--CQ
>
> Changqing> If I continue to use the second HCA, does the failure
> Changqing> of first HCA affect the operation of second HCA (from
> Changqing> driver point of view) ?
>
> No.
>
> - R.
>
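On the work-request tracking question earlier in this message: since polling
CQs after a catastrophic error is probably not a good idea, one option is for
the application to keep its own record of posted work requests. Below is a
minimal sketch of that idea in plain C. It does not call libibverbs at all;
the names wr_table, wr_table_add, wr_table_complete, wr_table_drain and the
MAX_OUTSTANDING limit are hypothetical and only illustrate the bookkeeping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cap on outstanding work requests per HCA. */
#define MAX_OUTSTANDING 64

/* One posted-but-not-yet-completed work request. */
struct wr_record {
    uint64_t wr_id;  /* same value the application put in the WR's wr_id */
    void *buf;       /* application buffer tied to this request */
    bool in_use;
};

struct wr_table {
    struct wr_record slots[MAX_OUTSTANDING];
    size_t count;    /* number of slots currently in use */
};

/* Callback invoked for every request still outstanding at drain time. */
typedef void (*wr_reclaim_fn)(uint64_t wr_id, void *buf, void *ctx);

void wr_table_init(struct wr_table *t)
{
    memset(t, 0, sizeof(*t));
}

/* Record a request just before posting it. Returns false if the table is full. */
bool wr_table_add(struct wr_table *t, uint64_t wr_id, void *buf)
{
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].wr_id = wr_id;
            t->slots[i].buf = buf;
            t->slots[i].in_use = true;
            t->count++;
            return true;
        }
    }
    return false;
}

/* Normal path: a completion arrived for wr_id. Returns its buffer, or NULL
 * if the id is not in the table. */
void *wr_table_complete(struct wr_table *t, uint64_t wr_id)
{
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->slots[i].in_use && t->slots[i].wr_id == wr_id) {
            t->slots[i].in_use = false;
            t->count--;
            return t->slots[i].buf;
        }
    }
    return NULL;
}

/* Fatal-error path: hand every outstanding request back to the application
 * without touching the CQ. Returns how many requests were reclaimed. */
size_t wr_table_drain(struct wr_table *t, wr_reclaim_fn reclaim, void *ctx)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->slots[i].in_use) {
            if (reclaim)
                reclaim(t->slots[i].wr_id, t->slots[i].buf, ctx);
            t->slots[i].in_use = false;
            n++;
        }
    }
    t->count = 0;
    return n;
}
```

With one such table per HCA, the application would call wr_table_add() before
each post, wr_table_complete() for each completion it polls in normal
operation, and wr_table_drain() once it sees IBV_EVENT_DEVICE_FATAL for that
HCA, for example to repost the reclaimed requests on the second HCA.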