[openib-general] APM: SM port failover
Venkatesh Babu
venkatesh.babu at 3leafnetworks.com
Tue Jan 2 18:49:25 PST 2007
Consider two nodes, A and B. NodeA (passive side) listens for RC QP
connection establishment requests, and NodeB (active side) initiates the
RC QP connection with ib_send_cm_req(). When a port failure occurs on
NodeA (passive side), NodeA gets the IB_EVENT_PORT_ERR event locally, so
it can call ib_modify_qp() on the RC QP to change path_mig_state to
IB_MIG_MIGRATED and switch to the alternate path. No problem here.
NodeB, however, has to register with OpenSM for the port failure event
on NodeA so that it can call ib_modify_qp() on the active side.
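
For reference, the passive-side step above can be sketched roughly as
follows. This is only a userspace illustration, not the actual driver
code: the ib_* types and ib_modify_qp() below are simplified stand-ins
that borrow the kernel names (the real definitions live in
<rdma/ib_verbs.h>, where the event's port number sits in
event->element.port_num), the numeric value of IB_QP_PATH_MIG_STATE is a
placeholder, and struct apm_conn and apm_handle_event() are hypothetical
names made up for this example.

```c
/* Simplified stand-ins for the kernel verbs definitions (assumption:
 * only the names match <rdma/ib_verbs.h>; layouts and values do not). */
enum ib_event_type { IB_EVENT_PORT_ACTIVE, IB_EVENT_PORT_ERR };
enum ib_mig_state { IB_MIG_MIGRATED, IB_MIG_REARM, IB_MIG_ARMED };
enum { IB_QP_PATH_MIG_STATE = 1 << 18 };	/* placeholder mask bit */

struct ib_qp_attr { enum ib_mig_state path_mig_state; };
struct ib_qp { enum ib_mig_state mig_state; };	/* stub-only field */
struct ib_event { enum ib_event_type event; unsigned char port_num; };

/* Stub: records the requested migration state instead of talking to
 * hardware. The real call has the same three parameters. */
static int ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
			int attr_mask)
{
	if (attr_mask & IB_QP_PATH_MIG_STATE)
		qp->mig_state = attr->path_mig_state;
	return 0;
}

/* Hypothetical per-connection bookkeeping: which local port carries
 * the primary path of this RC QP. */
struct apm_conn {
	struct ib_qp *qp;
	unsigned char primary_port;
};

/* Hypothetical handler body: on IB_EVENT_PORT_ERR for the primary
 * port, force the QP over to its alternate path.
 * Returns 1 if migration was requested, 0 if the event was ignored,
 * -1 if ib_modify_qp() failed. */
static int apm_handle_event(struct apm_conn *conn,
			    const struct ib_event *ev)
{
	struct ib_qp_attr attr = { .path_mig_state = IB_MIG_MIGRATED };

	if (ev->event != IB_EVENT_PORT_ERR ||
	    ev->port_num != conn->primary_port)
		return 0;

	if (ib_modify_qp(conn->qp, &attr, IB_QP_PATH_MIG_STATE))
		return -1;
	return 1;
}
```

In the kernel, the same logic would run from an event handler registered
with ib_register_event_handler(); NodeB would make the equivalent
ib_modify_qp() call once it learns of the remote port failure.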
This works fine using the ib_sa_serv_notice_hdlr() interface described
in bug #159
(https://staging.openfabrics.org/bugzilla/show_bug.cgi?id=159).
Now the question: what if NodeA is running OpenSM on port 1
(sm_port=1) and that port fails (say, a cable disconnect)? NodeB then
cannot receive any notification of the port failure, even though it has
registered a notice handler with ib_sa_serv_notice_hdlr(), because
sm_port is down.
How can we handle the port failover in this scenario?
VBabu