[ofw] RE: HCA Soft Reset mechanism

Jan Bottorff jbottorff at xsigo.com
Mon Aug 4 17:44:08 PDT 2008


>For mlx4, it seems like the bus driver could
>remove the PDO, wait for the device removal to complete, reset the
>adapter, then add back the PDO.  Maybe there's more to it than this, but
>at least on the surface, this seems easy and works without changes to
>the rest of the stack.  I'm not sure how you'd handle mthca though, but
>maybe getting this to work for mthca isn't as important...?

You guys don't seem to be thinking in terms of fabric-booted systems.
You will not be able to remove the system disk driver instance, or any
parent of the system disk, just because the HCA is having a problem. You
will need to fail outstanding I/Os, reset/restart the HCA, and get
everybody back up and running, so the system disk driver can retry the
I/O. This needs to happen with zero potential for page faults, which
means zero PnP events that might cause page faults.
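
As a rough sketch of that sequence (fail outstanding I/Os, reset the HCA
in place, let the disk driver retry), here is a small user-space model in
C. It is not driver code and nothing in it is the mlx4 or mthca API: every
type and function name below is hypothetical. The fixed-size pending
array stands in for the idea that the recovery path should touch only
preallocated storage, which in a real driver would also have to be
non-paged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of in-place HCA recovery; not real driver code. */
enum io_status { IO_PENDING, IO_DONE, IO_FAILED_RETRY };
enum hca_state { HCA_OK, HCA_ERROR, HCA_RESETTING };

#define MAX_IO 8

struct io_req {
    enum io_status status;
    int retries;
};

struct hca {
    enum hca_state state;
    struct io_req *pending[MAX_IO]; /* preallocated: no allocation during recovery */
    size_t npending;
    int resets;
};

/* Queue a request. Returns 0 on success, -1 if the HCA is not usable
 * or the queue is full. */
int hca_submit(struct hca *h, struct io_req *r)
{
    if (h->state != HCA_OK || h->npending == MAX_IO)
        return -1;
    r->status = IO_PENDING;
    h->pending[h->npending++] = r;
    return 0;
}

/* Complete every outstanding request with a retryable error. */
static void hca_fail_outstanding(struct hca *h)
{
    for (size_t i = 0; i < h->npending; i++) {
        h->pending[i]->status = IO_FAILED_RETRY;
        h->pending[i] = NULL;
    }
    h->npending = 0;
}

/* Recover in place: the hca object (the stand-in for the device
 * instance) is never removed or re-created, only reset. */
void hca_recover(struct hca *h)
{
    h->state = HCA_RESETTING;
    hca_fail_outstanding(h);
    assert(h->npending == 0);
    h->resets++;        /* stand-in for the actual hardware reset */
    h->state = HCA_OK;
}

/* Consumer side (e.g. the system disk driver): resubmit a request that
 * was failed by the reset. Returns 0 on success, -1 otherwise. */
int disk_retry(struct hca *h, struct io_req *r)
{
    if (r->status != IO_FAILED_RETRY)
        return -1;
    r->retries++;
    return hca_submit(h, r);
}
```

The point of the model is only the ordering: requests are failed back to
their owners before the reset, and the owners resubmit afterwards, with no
remove/add of the device in between.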

I've always assumed the goal of the OFW IB stack is to create servers
with ONLY an HCA for I/O, although perhaps that's an incorrect
assumption.

Jan
 




