[openib-general] problems on device/ports initialization

Sean Hefty mshefty at ichips.intel.com
Thu Sep 22 10:33:22 PDT 2005


Shirley Ma wrote:
> > Although, I don't think that it's necessarily worth doing, since
> > errors should be very rare.
> Agree. Since Galaxy hits this problem on PPC, we've tried different 
> approaches to fix this problem. None of them work well. And it's hard to 
> change the existing architecture. So I would like to work on a patch to 
> address the error.

A patch for this would be accepted.  I should note that other modules, such as 
the CM, follow this same error-recovery policy: if an error occurs while 
initializing any of the ports, that module does not use the device at all.

IPoIB, however, appears to handle each port separately, so that a failure on 
one port does not mark the others as invalid.  Roland should know for certain, 
but at least that's the way the code looks to me.
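
To make the difference between the two policies concrete, here is a rough
sketch in plain C.  It is not taken from the OpenIB tree: init_port(),
add_device_all_or_nothing(), add_device_per_port() and NUM_PORTS are all
hypothetical names made up for illustration, and the "failure" is simulated.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_PORTS 2  /* assumed port count, for illustration only */

/* Hypothetical per-port setup: fails (returns -1) only for failing_port.
 * Ports are numbered from 1; pass 0 to make every port succeed. */
int init_port(int port, int failing_port)
{
	return port == failing_port ? -1 : 0;
}

/* All-or-nothing policy: one failed port makes the module skip the
 * whole device.  Returns the number of usable ports. */
int add_device_all_or_nothing(int failing_port, bool usable[NUM_PORTS])
{
	memset(usable, 0, NUM_PORTS * sizeof(usable[0]));
	for (int i = 0; i < NUM_PORTS; i++) {
		if (init_port(i + 1, failing_port) != 0) {
			/* Undo the ports already marked usable. */
			memset(usable, 0, NUM_PORTS * sizeof(usable[0]));
			return 0;
		}
		usable[i] = true;
	}
	return NUM_PORTS;
}

/* Per-port policy: a failed port is skipped, the others stay usable. */
int add_device_per_port(int failing_port, bool usable[NUM_PORTS])
{
	int count = 0;

	for (int i = 0; i < NUM_PORTS; i++) {
		usable[i] = (init_port(i + 1, failing_port) == 0);
		if (usable[i])
			count++;
	}
	return count;
}
```

With port 2 failing, the all-or-nothing version reports no usable ports,
while the per-port version still leaves port 1 usable.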

- Sean


