[ofa-general] Re: potential device removal deadlock

Roland Dreier rdreier at cisco.com
Wed Jan 28 15:20:08 PST 2009


 > How could we fix this in the kernel? Perhaps ib_uverbs should post an
 > async error analogous to RDMA_CM_EVENT_DEVICE_REMOVAL?
 > 
 > Maybe IB_EVENT_DEVICE_FATAL?
 > 
 > In the case of EEH support of iw_cxgb3, I guess the driver could post
 > this event. That would at least kick all the user apps...

Having the low-level driver generate the fatal event is in fact what
mthca and mlx4 do right now.  There's a certain asymmetry between IB
drivers (where RDMA CM is optional) and iWARP drivers (where RDMA CM is
mandatory), but the IB async event is the only notification mechanism
that IB low-level drivers (LLDs) have.  I guess it would make sense for
cxgb3 to do the same thing.
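
To make the flow concrete, here is a rough userspace model of the idea:
the driver's fatal-error path dispatches a "device fatal" event, and
every registered consumer gets kicked.  This is only a sketch.  All of
the mock_* names are hypothetical stand-ins made up for illustration;
in the kernel the corresponding pieces would be ib_dispatch_event(),
handlers registered with ib_register_event_handler(), and
IB_EVENT_DEVICE_FATAL, and real driver code would look different.

```c
#include <assert.h>
#include <stddef.h>

/* Mock model only: every mock_* name is hypothetical, not a kernel API. */
enum mock_event_type {
    MOCK_EVENT_PORT_ERR,
    MOCK_EVENT_DEVICE_FATAL   /* stands in for IB_EVENT_DEVICE_FATAL */
};

struct mock_event {
    enum mock_event_type type;
};

typedef void (*mock_handler_fn)(const struct mock_event *ev, void *ctx);

#define MOCK_MAX_HANDLERS 8

/* A device keeps a fixed-size table of async event handlers. */
struct mock_device {
    mock_handler_fn handlers[MOCK_MAX_HANDLERS];
    void *ctx[MOCK_MAX_HANDLERS];
    size_t nr_handlers;
};

/* Register a consumer callback; returns 0 on success, -1 if the table is full. */
int mock_register_handler(struct mock_device *dev, mock_handler_fn fn, void *ctx)
{
    assert(dev != NULL && fn != NULL);
    if (dev->nr_handlers >= MOCK_MAX_HANDLERS)
        return -1;
    dev->handlers[dev->nr_handlers] = fn;
    dev->ctx[dev->nr_handlers] = ctx;
    dev->nr_handlers++;
    return 0;
}

/* Deliver one event to every registered handler, in registration order. */
void mock_dispatch_event(struct mock_device *dev, enum mock_event_type type)
{
    struct mock_event ev = { .type = type };
    size_t i;

    for (i = 0; i < dev->nr_handlers; i++)
        dev->handlers[i](&ev, dev->ctx[i]);
}

/* A consumer (think: a uverbs context) that remembers the device died. */
struct mock_consumer {
    int device_dead;
};

void mock_consumer_handler(const struct mock_event *ev, void *ctx)
{
    struct mock_consumer *c = ctx;

    if (ev->type == MOCK_EVENT_DEVICE_FATAL)
        c->device_dead = 1;   /* a real app would stop using the device here */
}

/* The driver's fatal-error path (e.g. on an EEH error) posts the fatal event. */
void mock_driver_fatal_error(struct mock_device *dev)
{
    mock_dispatch_event(dev, MOCK_EVENT_DEVICE_FATAL);
}
```

The point of the model is just that the driver does not need to know
who is listening: it posts one event and each consumer decides how to
react.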

 - R.


