[ofw] RE: HCA Soft Reset mechanism

Sean Hefty sean.hefty at intel.com
Mon Aug 4 11:07:42 PDT 2008


>Soft Reset here is HCA re-initialization without bus driver reloading.
>A reset can be initiated by clients (mlx4_eth, mlx4_hca) and/or by the
>driver (mlx4_bus).
>The driver issues a reset upon a fatal card error that prevents further
>work with the card.
>Clients may request a reset at any time.

From the perspective of a client, is the behavior any different than a
remove device followed by an add device?

>Clients have to register an event callback after getting the bus interface.
>
>When a reset event comes, the bus driver will:
>   - bar further work with the card, returning -EFAULT to all commands
>except destroy_xx;
>   - reset the card to stop incoming traffic (only in case of a client-
>initiated reset);
>   - notify all registered clients about the pending reset.
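
To make sure the gating step is read the same way by everyone, here is a
minimal sketch of it in C. The names (bus_state, cmd_kind, bus_check_cmd)
are hypothetical and are not taken from the mlx4_bus sources; the only
thing taken from the proposal is that everything except destroy-type
commands gets -EFAULT once a reset is pending.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command classes; the real mlx4_bus command set is not
 * shown in this thread. */
enum cmd_kind { CMD_CREATE, CMD_MODIFY, CMD_QUERY, CMD_DESTROY };

/* Hypothetical per-device bus state. */
struct bus_state {
    bool reset_pending;   /* set when a reset event arrives */
};

/* Returns 0 if the command may go to the card, -EFAULT if it is barred.
 * Destroy-type commands stay allowed so clients can release resources. */
int bus_check_cmd(const struct bus_state *bus, enum cmd_kind kind)
{
    assert(bus);
    if (bus->reset_pending && kind != CMD_DESTROY)
        return -EFAULT;
    return 0;
}
```

A real driver would also have to make the flag check safe against a reset
that starts concurrently with command submission; the sketch ignores that.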


>On receiving this notification, clients have to:
>   - wait for all issued commands to end;
>   - reset their own clients, if any, and bar their work;
>   - release all the device resources they have been using;
>   - send an "I'm reset-ready" notification to the bus driver;

How does this synchronize with a user unloading the driver or the entire
stack?

Also, this is a fairly difficult operation to pull off in an efficient
manner.

>The driver starts to perform device reset only after receiving the "I'm
>reset-ready" notifications from all the registered clients. It re-
>initializes the device and notifies all the clients.
>
>Having received this notification, clients have to:
>   - dereference the old bus interface;
>   - get the new interface from the bus driver;
>   - register a new event handler;
>   - resume/restart themselves;
>   - wake up their own clients, if any;
Why do clients need to get new interfaces or re-register event handlers?

We should tie this sort of operation in with standard Windows PnP
operations.  It really sounds like an out-of-band PnP add/remove device
to me.

- Sean



