[openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

Sean Hefty sean.hefty at intel.com
Wed Nov 1 19:49:37 PST 2006


>Are these changes to replace ib_cm_init_rearm_attr() interface ?

Yes - you use ib_cm_init_qp_attr() to get the qp_attr after loading a new
alternate path.  The new path is loaded using ib_send_cm_lap().  So, after a
path fails:

One side calls ib_send_cm_lap() to propose a new alternate path.
Second side responds by calling ib_send_cm_apr().
Both sides call ib_cm_init_qp_attr(), then ib_modify_qp() to load the new path.

This is intended to work if failover has occurred, or if the user detects that
the alternate path is down and wants to replace it.

There is an additional call, ib_cm_notify(), which is used to let the CM know
that the primary path has failed and that the alternate path should be used when
sending future CM messages.  In case of failover, this needs to be called before
calling ib_send_cm_lap() to ensure that the LAP message reaches the remote user.
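
To make the ordering concrete, here is a rough userspace model of the sequence
described above.  It is only a sketch: the sim_* names and the struct are made
up for illustration and are not the ib_cm or verbs API, both sides of the
connection are folded into one struct, and a LAP sent over a failed primary
path is modelled as an error return (an assumption; a real LAP would more
likely just go unanswered).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the call ordering only.  The sim_* names
 * are invented for this sketch and are not kernel symbols.
 */
struct sim_conn {
	bool failed_over;     /* QP already migrated to the alternate path */
	bool cm_uses_alt;     /* CM was told to send on the alternate path */
	bool lap_delivered;   /* LAP reached the remote side */
	bool apr_received;    /* remote side answered with an APR */
	bool attr_ready;      /* QP attributes for the new path were built */
	bool alt_path_loaded; /* QP was modified with the new alternate path */
};

/* Stands in for ib_cm_notify(): later CM messages use the alternate path. */
static int sim_notify(struct sim_conn *c)
{
	c->cm_uses_alt = true;
	return 0;
}

/* Stands in for ib_send_cm_lap(): fails if it would use the dead path. */
static int sim_send_lap(struct sim_conn *c)
{
	if (c->failed_over && !c->cm_uses_alt)
		return -1; /* modelled as an error; see the assumption above */
	c->lap_delivered = true;
	return 0;
}

/* Stands in for the remote side calling ib_send_cm_apr(). */
static int sim_send_apr(struct sim_conn *c)
{
	if (!c->lap_delivered)
		return -1;
	c->apr_received = true;
	return 0;
}

/* Stands in for ib_cm_init_qp_attr() after the LAP/APR exchange. */
static int sim_init_qp_attr(struct sim_conn *c)
{
	if (!c->apr_received)
		return -1;
	c->attr_ready = true;
	return 0;
}

/* Stands in for ib_modify_qp() loading the new alternate path. */
static int sim_modify_qp(struct sim_conn *c)
{
	if (!c->attr_ready)
		return -1;
	c->alt_path_loaded = true;
	return 0;
}

/* Full sequence: notify first if failover happened, then LAP, APR, modify. */
int sim_reload_alt_path(struct sim_conn *c)
{
	assert(c != NULL);
	if (c->failed_over)
		sim_notify(c);
	if (sim_send_lap(c) || sim_send_apr(c) || sim_init_qp_attr(c))
		return -1;
	return sim_modify_qp(c);
}
```

In this model, calling sim_send_lap() on a connection that has failed over
fails until sim_notify() has run, which mirrors the ordering requirement
above.  A connection that has not failed over skips the notify step.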

>The path migration from Primary to Alternate succeeded, then reloaded
>the alternate path.

How did you reload the alternate path?

>failed with the IB_WC_RETRY_EXC_ERR. But I got the event IB_EVENT_PATH_MIG.
>
>With the ib_cm_init_rearm_attr() being called, failover/failback worked
>fine.

Were you calling ib_send_cm_lap() to load a new alternate path, or just assuming
that the old path would work after failover occurred?

- Sean