[ofa-general] Re: OFED HA related question
Roland Dreier
rdreier at cisco.com
Wed May 16 13:26:00 PDT 2007
Changqing> Suppose I get IBV_EVENT_DEVICE_FATAL async event from
Changqing> the first HCA on my node, can I continue to call
Changqing> ibv_poll_cq() to get back all the work-requests I
Changqing> posted before? Or do I need to keep track of these
Changqing> work-requests? I am afraid ibv_poll_cq() will return
Changqing> error by itself. Also can I call ibv_dereg_mr() to free
Changqing> the memory I registered to this HCA ?
Once you get a catastrophic error, all bets are off. Work request
processing is in an undetermined state, since basically the HCA
crashed in an unknown way. Polling CQs is probably not a good idea.
I guess you do need to deregister memory regions to unpin the memory
as part of your cleanup....
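
One way to act on this, sketched here as an assumption rather than anything the verbs library provides: if the CQ cannot be trusted after IBV_EVENT_DEVICE_FATAL, the application can keep its own table of outstanding work requests per HCA and fail them itself. All names below (wr_tracker, tracker_add, tracker_complete, tracker_fail_all, count_failed, MAX_OUTSTANDING) are hypothetical helpers, not libibverbs API. The only verbs calls involved are mentioned in comments.

```c
#include <stdint.h>
#include <string.h>

#define MAX_OUTSTANDING 64   /* hypothetical per-HCA limit, pick your own */

/* Application-side record of work requests posted to one HCA. */
struct wr_tracker {
    uint64_t wr_id[MAX_OUTSTANDING];
    int      in_use[MAX_OUTSTANDING];
    int      count;
};

/* Callback invoked once per work request that has to be given up on. */
typedef void (*wr_fail_fn)(uint64_t wr_id, void *arg);

static void tracker_init(struct wr_tracker *t)
{
    memset(t, 0, sizeof(*t));
}

/* Call right after a successful ibv_post_send() or ibv_post_recv(). */
static int tracker_add(struct wr_tracker *t, uint64_t wr_id)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (!t->in_use[i]) {
            t->in_use[i] = 1;
            t->wr_id[i] = wr_id;
            t->count++;
            return 0;
        }
    }
    return -1;               /* table full */
}

/* Call for each completion seen via ibv_poll_cq() in normal operation. */
static int tracker_complete(struct wr_tracker *t, uint64_t wr_id)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->in_use[i] && t->wr_id[i] == wr_id) {
            t->in_use[i] = 0;
            t->count--;
            return 0;
        }
    }
    return -1;               /* unknown wr_id */
}

/*
 * Call after IBV_EVENT_DEVICE_FATAL on this HCA. Reports every work
 * request still outstanding without polling the CQ, then empties the
 * table. Returns the number of work requests reported.
 */
static int tracker_fail_all(struct wr_tracker *t, wr_fail_fn fn, void *arg)
{
    int failed = 0;
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->in_use[i]) {
            fn(t->wr_id[i], arg);
            t->in_use[i] = 0;
            failed++;
        }
    }
    t->count = 0;
    return failed;
}

/* Example callback: just counts the failed work requests. */
static void count_failed(uint64_t wr_id, void *arg)
{
    (void)wr_id;
    (*(int *)arg)++;
}
```

In such a scheme the callback would typically requeue the request on the second HCA or report an error upward. After tracker_fail_all() the remaining cleanup is the usual teardown of that device's resources, including ibv_dereg_mr() to unpin memory.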
Changqing> If I continue to use the second HCA, does the failure
Changqing> of first HCA affect the operation of second HCA (from
Changqing> driver point of view) ?
No.
- R.