[openib-general] APM: SM port failover
Hal Rosenstock
halr at voltaire.com
Tue Jan 2 20:22:40 PST 2007
On Tue, 2007-01-02 at 21:49, Venkatesh Babu wrote:
> Let us say there are two nodes A and B. NodeA (passive side) passively
> listens for RC QP connection establishment requests and NodeB (active
> side) initiates the RC QP connection request with ib_send_cm_req(). When
> a port failure occurs on NodeA (passive side), it gets the event
> IB_EVENT_PORT_ERR locally. So it can call ib_modify_qp() for the RC QP
> to change the path_mig_state to IB_MIG_MIGRATED to use the alternate
> path. No problem here. But NodeB has to register with the OpenSM for the
> port failure event on NodeA, so that it can call ib_modify_qp() on the
> active side.
>
> This works fine using the interface ib_sa_serv_notice_hdlr()
> described in bug #159
> (https://staging.openfabrics.org/bugzilla/show_bug.cgi?id=159).
>
> Now the question is: what if NodeA is running OpenSM on port 1
> (sm_port=1), and that port fails (say, a cable disconnect)? Then NodeB
> cannot receive any notification of the port failure, even though it has
> registered a notice handler with ib_sa_serv_notice_hdlr(), because
> sm_port is down.
>
> How can we handle the port failover in this scenario?
The subnet is essentially running without an SM once that port is
disconnected. How about running a backup (standby) SM on another node in
the subnet, so that it can take over when the master SM becomes
unreachable?
-- Hal
> VBabu