[openib-general] APM: QP migration state change when failover triggered by hw
Jack Morgenstein
jackm at mellanox.co.il
Wed Aug 2 00:43:16 PDT 2006
On Tuesday 01 August 2006 21:55, Sean Hefty wrote:
> > I am testing APM with kernel module which directly interfaces with
> >ib_verbs.ko and ib_cm.ko.
> >Yes, I do receive IB_MIG_MIGRATED event, but the QP's mig_state is not
> >actually changed to MIGRATED. So I had to do this from my module.
>
>
> There is a pending patch that was recently posted (dispatch communication
> establish event) that can be extended to pass path migration events to the
> ib_cm. The purpose of passing path migration events to the ib_cm would be
> limited to changing the path that future CM messages use, and would not be
> related to QP transitions.
>
> - Sean
This could be a bit complicated. For example, say there are two possible
paths. After migration has occurred the first time, there is no guarantee
that the original path has become available again.
There is also a race condition here in your proposal -- the new Alt Path data
must be specified between the MIGRATED event and the
communication-established event on the migrated path (so that the LAP message
may be correctly sent to the remote node).
Babu, regarding the migration event that you are seeing, are you sure that it
comes from the transition that fails to take effect? Could the problematic
transition be the second one, which occurs after specifying a new alternate
path and rearming APM?
It seems more likely to me that the first transition does occur, since you
receive a MIG event on both sides, and since the alt path data is loaded by
you during the initial bringup of the RC QP pair (either at init->rtr, or at
rtr->rts). If you are receiving the MIGRATED event, the QP is already in the
migrated state.
However, after the first migration occurs, you need to do the following:
1. Send a LAP packet to the remote node, containing the new alt path info.
2. Load the NEW alt path information (ib_modify_qp, rts->rts), including the
remote LID received in the LAP packet.
3. Rearm path migration (ib_modify_qp, rts->rts).
Are you certain that the above 3 steps have taken place?
Note that steps 1 and 2 above form a separate phase from step 3, since the IB
Spec allows changing the alternate path while the QP is still armed, not just
when it has migrated.
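
To make the sequence concrete, here is a rough userspace model of the steps
above. It is only a sketch and does not call the verbs API: every name in it
(model_qp, model_modify_qp, the MIG_* and MASK_* values, and so on) is a
hypothetical stand-in, and the rules it enforces (for example, refusing to
rearm until fresh alt path data is loaded) are simplifying assumptions of the
model rather than spec behaviour. In a real kernel module the two modify calls
would presumably be ib_modify_qp() with IB_QP_ALT_PATH, and then with
IB_QP_PATH_MIG_STATE and path_mig_state = IB_MIG_REARM.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins modeled on the kernel's enum ib_mig_state.
 * This is a userspace toy, not the ib_verbs API. */
enum mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

/* Hypothetical attribute-mask bits, playing the role of
 * IB_QP_ALT_PATH and IB_QP_PATH_MIG_STATE. */
enum { MASK_ALT_PATH = 1 << 0, MASK_PATH_MIG_STATE = 1 << 1 };

struct path {
    unsigned dlid;  /* remote LID for this path */
    unsigned port;  /* local port number */
    int valid;      /* model-only flag: has usable path data been loaded? */
};

struct model_qp {
    struct path primary;
    struct path alt;
    enum mig_state mig_state;
};

struct model_attr {
    struct path alt;
    enum mig_state mig_state;
};

/* Models an rts->rts modify. The alt path and the migration state are
 * separate mask bits, so they can be changed in separate calls. */
static int model_modify_qp(struct model_qp *qp,
                           const struct model_attr *attr, int mask)
{
    assert(qp != NULL && attr != NULL);
    if (mask & MASK_ALT_PATH) {
        qp->alt = attr->alt;
        qp->alt.valid = 1;
    }
    if (mask & MASK_PATH_MIG_STATE) {
        /* Model assumption: only a REARM request is supported, and it is
         * rejected when no alt path data has been loaded. */
        if (attr->mig_state != MIG_REARM || !qp->alt.valid)
            return -1;
        qp->mig_state = MIG_REARM;
    }
    return 0;
}

/* Models the hardware completing the arming (REARM -> ARMED). */
static void model_hw_arm(struct model_qp *qp)
{
    if (qp->mig_state == MIG_REARM)
        qp->mig_state = MIG_ARMED;
}

/* Models a hardware-triggered failover. Returns 1 if the QP migrated
 * (the point at which a MIGRATED event would be delivered), else 0. */
static int model_hw_failover(struct model_qp *qp)
{
    if (qp->mig_state != MIG_ARMED)
        return 0;
    qp->primary = qp->alt;
    qp->alt.valid = 0;  /* model assumption: alt data must be reloaded */
    qp->mig_state = MIG_MIGRATED;
    return 1;
}

/* Steps 2 and 3 from the list above. Step 1 (the LAP exchange) is
 * assumed to have produced new_alt already. */
static int recover_after_migration(struct model_qp *qp, struct path new_alt)
{
    struct model_attr attr = { new_alt, MIG_REARM };

    if (model_modify_qp(qp, &attr, MASK_ALT_PATH))           /* step 2 */
        return -1;
    return model_modify_qp(qp, &attr, MASK_PATH_MIG_STATE);  /* step 3 */
}
```

In this model a second failover can only happen after
recover_after_migration() has succeeded and the QP has gone back to ARMED,
which is the situation I suspect you are hitting. Calling model_modify_qp()
with only MASK_ALT_PATH while the QP is ARMED replaces the alt path without
touching the migration state, which mirrors the two-phase point above.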
- Jack