[openib-general] APM: SM port failover

Hal Rosenstock halr at voltaire.com
Tue Jan 2 20:22:40 PST 2007


On Tue, 2007-01-02 at 21:49, Venkatesh Babu wrote:
>  Let us say there are two nodes A and B. NodeA (passive side) passively 
> listens for RC QP connection establishment requests and NodeB (active 
> side) initiates the RC QP connection request with ib_send_cm_req(). When 
> a port failure occurs on NodeA (passive side), it gets the event 
> IB_EVENT_PORT_ERR locally. So it can call ib_modify_qp() for the RC QP 
> to change the path_mig_state to IB_MIG_MIGRATED to use the alternate 
> path. No problem here. But NodeB has to register with the OpenSM for the 
> port failure event on NodeA, so that it can call ib_modify_qp() on the 
> active side.
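
For reference, a minimal sketch of the local step described above: on
IB_EVENT_PORT_ERR, set path_mig_state to IB_MIG_MIGRATED with
ib_modify_qp(). It is written to compile in user space, so the verbs types
and ib_modify_qp() itself are cut-down stand-ins that only mirror the kernel
names. struct conn, force_path_migration() and handle_port_event() are
hypothetical names used for illustration and are not part of any API.

```c
#include <stddef.h>

/* Stand-ins for definitions from the kernel's <rdma/ib_verbs.h>, reduced
 * to what this sketch needs. Only the names mirror the real API. */
enum ib_event_type { IB_EVENT_PORT_ACTIVE, IB_EVENT_PORT_ERR };
enum ib_mig_state  { IB_MIG_MIGRATED, IB_MIG_REARM, IB_MIG_ARMED };
enum { IB_QP_PATH_MIG_STATE = 1 << 18 };

struct ib_qp_attr { enum ib_mig_state path_mig_state; };

/* Stub-only field: the real struct ib_qp does not expose this state. */
struct ib_qp { enum ib_mig_state path_mig_state; };

/* Stub: the real ib_modify_qp() passes the request to the HCA driver.
 * Here it just records the requested migration state. */
static int ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
                        int attr_mask)
{
    if (qp == NULL || attr == NULL)
        return -1;
    if (attr_mask & IB_QP_PATH_MIG_STATE)
        qp->path_mig_state = attr->path_mig_state;
    return 0;
}

/* Hypothetical per-connection bookkeeping. */
struct conn {
    struct ib_qp  qp;
    unsigned char primary_port;  /* local port carrying the primary path */
};

/* Hypothetical helper: force the QP onto its alternate path. */
static int force_path_migration(struct ib_qp *qp)
{
    struct ib_qp_attr attr = { .path_mig_state = IB_MIG_MIGRATED };

    return ib_modify_qp(qp, &attr, IB_QP_PATH_MIG_STATE);
}

/* Hypothetical event hook. Migrates only when the failed port is the one
 * carrying this connection's primary path.
 * Returns 1 if migrated, 0 if the event was ignored, negative on error. */
static int handle_port_event(struct conn *c, enum ib_event_type ev,
                             unsigned char port)
{
    int rc;

    if (ev != IB_EVENT_PORT_ERR || port != c->primary_port)
        return 0;
    rc = force_path_migration(&c->qp);
    return rc ? rc : 1;
}
```

In a real kernel consumer the hook would be driven by the device's
asynchronous event handler rather than called directly, and the alternate
path would already have to be loaded and armed on the QP.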
> 
> This is working fine by using the interface ib_sa_serv_notice_hdlr() 
> described in bug#159 
> (https://staging.openfabrics.org/bugzilla/show_bug.cgi?id=159).
> 
>   Now the question is - what if NodeA is running OpenSM on port 1 
> (sm_port=1), and that port fails (say, a cable disconnect). Then NodeB 
> cannot receive any notification of the port failure, even though it has 
> registered a notice handler with ib_sa_serv_notice_hdlr(), because 
> sm_port is down.
> 
>   How can we handle the port failover in this scenario?

The subnet is essentially running without an SM when that port is
disconnected. How about a backup SM for the subnet?

-- Hal

>  VBabu




