[openib-general] APM: QP migration state change when failover triggered by hw

Sean Hefty mshefty at ichips.intel.com
Wed Aug 2 10:00:09 PDT 2006


Jack Morgenstein wrote:
> This could be a bit complicated.  For example, say there are two possible 
> paths.  After migration has occurred the first time, there is no guarantee 
> that the original path has become available again.

That's okay.  I would expect the code to be able to handle this.

> There is also a race condition here in your proposal -- the new Alt Path data 
> must be specified between the MIGRATED event and the 
> communication-established event on the migrated path (so that the LAP message 
> may be correctly sent to the remote node).

I'm not following you here.  The CM currently sends all messages along the 
primary path specified in the REQ, but saves alternate path information.  The CM 
needs to know when to begin using the alternate path.  The implementation to do 
this is missing.
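
As a rough illustration of the missing piece, here is a minimal sketch in plain C of a CM connection that starts sending on the alternate path once it is told the QP has migrated. Every name in it (struct cm_path, struct cm_conn, cm_conn_mad_path, cm_conn_notify_migrated) is hypothetical and is not taken from the existing CM code; it only models the bookkeeping and does not touch real MADs or verbs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified path record: only the LIDs are modeled. */
struct cm_path {
    uint16_t slid;   /* source LID */
    uint16_t dlid;   /* destination LID */
    bool     valid;  /* false when no path is loaded in this slot */
};

/* Hypothetical per-connection CM state. */
struct cm_conn {
    struct cm_path primary;    /* path the CM currently sends MADs on */
    struct cm_path alternate;  /* alternate path saved from the REQ or a LAP */
};

/* Path the CM would use for its next MAD on this connection. */
static const struct cm_path *cm_conn_mad_path(const struct cm_conn *c)
{
    return &c->primary;
}

/*
 * Assumed hook, called when the QP reports that it has migrated.
 * The saved alternate path becomes the path used for MADs, and the
 * alternate slot is emptied until new alternate path data is supplied.
 * Returns 0 on success, -1 if no alternate path was saved.
 */
static int cm_conn_notify_migrated(struct cm_conn *c)
{
    if (!c->alternate.valid)
        return -1;
    c->primary = c->alternate;
    c->alternate.valid = false;
    return 0;
}
```

After cm_conn_notify_migrated() succeeds, the alternate slot is empty, so a second migration on the same connection fails until a new alternate path has been loaded.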

> However, after the first migration occurs, you need to do the following:
> 1.  send a LAP packet to the remote node, containing the new alt path info.
> 2.  load NEW alt path information (ib_modify_qp, rts->rts), including remote 
> LID received in LAP packet.
> 3.  Rearm path migration (ib_modify_qp, rts->rts)
> 
> Are you certain that the above 3 steps have taken place?

I'm not sure that step 1 can occur if the primary path has failed.  The CM 
doesn't know to send future MADs out a different path.
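
The ordering problem can be sketched as a small model in plain C, following the three steps in the order they are quoted. The struct and function names (struct apm_qp, apm_send_lap, apm_load_alt_path, apm_rearm) are hypothetical, and the flags stand in for the real LAP exchange and ib_modify_qp calls, which are not performed here.

```c
#include <stdbool.h>

/* Local names for the three path migration states used in this sketch. */
enum mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

/* Hypothetical view of one QP after a hardware-triggered failover. */
struct apm_qp {
    enum mig_state state;
    bool cm_on_working_path; /* CM sends MADs on the path that still works */
    bool lap_done;           /* step 1: LAP sent to the remote node */
    bool alt_loaded;         /* step 2: new alternate path loaded */
};

/* Step 1: only possible if the CM is not still using the failed path. */
static int apm_send_lap(struct apm_qp *qp)
{
    if (qp->state != MIG_MIGRATED || !qp->cm_on_working_path)
        return -1;
    qp->lap_done = true;
    return 0;
}

/* Step 2: stands in for ib_modify_qp (rts->rts) with new alt path data. */
static int apm_load_alt_path(struct apm_qp *qp)
{
    if (!qp->lap_done)
        return -1;
    qp->alt_loaded = true;
    return 0;
}

/* Step 3: stands in for ib_modify_qp (rts->rts) requesting a rearm. */
static int apm_rearm(struct apm_qp *qp)
{
    if (!qp->alt_loaded)
        return -1;
    qp->state = MIG_REARM; /* reaching MIG_ARMED is not modeled here */
    return 0;
}
```

In this model, steps 2 and 3 both fail until apm_send_lap() succeeds, and that call fails while cm_on_working_path is false.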

- Sean



