[openib-general] APM: QP migration state change when failover triggered by hw

Venkatesh Babu venkatesh.babu at 3leafnetworks.com
Mon Jul 31 19:19:42 PDT 2006


 I was testing the APM (Automatic Path Migration) functionality. I found 
that OFED 1.0 doesn't support it yet, and it is not planned for OFED 1.1 
either. It would be interesting to know when this feature is going to be 
added.

 I found that OFED 1.0 has some bugs and is missing components needed 
to support this feature. So I opened #160, #172 and #159 to track these.

 To answer your question -
Configuration1: Node1 and Node2 connected directly with two IB cables, 
without a switch
Configuration2: Node1 and Node2 connected through two switches, one for 
each port.
 Node1, port1 -> switch1 -> Node2, port1
 Node1, port2 -> switch2 -> Node2, port2

Node 1:
1. Call ib_cm_listen() to wait for connection requests
2. When a REQ message arrives create a RC QP and establish a connection
3. Setup callback handlers to receive packets.
4. Receive packets, verify them and drop them.
5. Event IB_MIG_MIGRATED received
6. Stopped receiving packets.

Node 2:
1. Create RC QP
2. Send REQ message to Node 1 to establish the connection (Load both 
primary and alternate paths)
3. Continuously send some packets
4. Simulate the port failure by unplugging the IB cable
5. Event IB_MIG_MIGRATED received

Actually I fixed this problem in Configuration1 by calling 
ib_modify_qp() to change the mig_state from IB_MIG_ARMED to 
IB_MIG_MIGRATED when the IB_EVENT_PORT_ERR event occurs. But with 
Configuration2, the IB_EVENT_PORT_ERR event occurs on only one node, and 
failover to the alternate path doesn't work. The traffic stops, because 
Node1 doesn't know when the IB_EVENT_PORT_ERR event occurred on Node2. 
This requires an interface similar to the Gen1 interface 
tsIbSetOutofServiceNoticeHandler()
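
To make the workaround concrete, here is a minimal, self-contained sketch 
of the logic only. It is a toy model, not my actual patch and not the 
kernel verbs API: the struct toy_qp, the MIG_* enum and 
toy_handle_port_err() are hypothetical stand-ins for the real QP, the 
IB_MIG_* states and the ib_modify_qp() call made from the port error 
handler.

```c
#include <stddef.h>

/* Toy stand-ins for the real migration states (hypothetical names). */
enum mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

/* Toy QP: only the fields needed to illustrate the failover decision. */
struct toy_qp {
    enum mig_state mig_state;
    int primary_port;  /* local port used by the primary path */
    int alt_port;      /* local port used by the alternate path */
    int active_port;   /* local port currently carrying traffic */
};

/*
 * Models what the handler does when the LOCAL HCA reports a port error:
 * every armed QP whose traffic runs over the failed port is forced to
 * the alternate path and marked as migrated.
 * Returns the number of QPs that were migrated.
 */
int toy_handle_port_err(struct toy_qp *qps, size_t n, int failed_port)
{
    int migrated = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_qp *qp = &qps[i];

        if (qp->mig_state == MIG_ARMED && qp->active_port == failed_port) {
            qp->active_port = qp->alt_port;  /* switch to alternate path */
            qp->mig_state = MIG_MIGRATED;    /* ARMED -> MIGRATED */
            migrated++;
        }
    }
    return migrated;
}
```

In this model the handler only runs on the node whose own port went 
down. In Configuration2 the peer's port stays up behind its switch, so 
the peer never calls the handler and its QP stays armed on the old 
path, which matches what I see.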

 VBabu

Jack Morgenstein wrote:

>On Wednesday 12 July 2006 23:20, Venkatesh Babu wrote:
>
>> With OFED 1.0, when cable is removed from the port corresponding to the
>>primary path, CI sends an event IB_EVENT_PATH_MIG, but is not changing
>>the state to "Migrated" and not migrating to the alternate path.  So the
>>traffic doesn't resume on the alternate path.
>
>Could you please describe your flow in more detail (including setup phase), 
>and, if possible, send us a small test program which illustrates your 
>problem?
>
>Thanks!
>- Jack