I was testing IPoIB failover/failback using the bonding mechanism, with OpenSM running in the IB subnet. I observed that failover does not reliably occur when an IB port is brought down using the "ibportstate" command. <br>
<br>The test steps I followed and the test configuration are as follows:<br><br>Pings to an IPoIB destination were started over the bond0 interface (which is configured as shown below). The pings proceed normally. Failover to ib1 does not occur when I disable port 1 (corresponding to ib0) using the<br>
<br> $ ibportstate disable<br><br>command. <br><br>In the log, I can see the messages<br><br> kernel: bonding: bond0: link status definitely down for interface ib0, disabling it<br> kernel: bonding: bond0: making interface ib1 the new active one.<br>
<br>But the pings stop. I also noticed that the process status shows:<br><br> PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND<br> 61 2503 1 1 ? -1 D< 0 0:00 [ib_inform]<br>
61 2504 1 1 ? -1 D< 0 0:00 [local_sa]<br><br>Is this expected?<br><br>/etc/infiniband/openib.conf:<br><br>ONBOOT=yes<br>UCM_LOAD=no<br>RDMA_CM_LOAD=yes<br>RDMA_UCM_LOAD=yes<br>RENICE_IB_MAD=no<br>
MTHCA_LOAD=yes<br>IPOIB_LOAD=yes<br>SET_IPOIB_CM=yes<br>SDP_LOAD=yes<br>SRP_LOAD=no<br>SRPT_LOAD=no<br>RDS_LOAD=no<br>SRPHA_ENABLE=no<br><br>IPOIBBOND_ENABLE=yes<br>IPOIB_BONDS=bond0<br>bond0_IP=100.1.1.13<br>
bond0_SLAVES=ib0,ib1<br><br>Source IPoIB machine (bonding enabled): OFED-1.3-rc4, RHEL5, MT25208<br>Destination IPoIB machine: OFED-1.3-rc4, SLES10, MT25208<br><br>I am pinging the IPoIB interface of a machine that is running OpenSM.<br>
<br>Has anybody tested this kind of scenario, or am I missing something?
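<br><br>For anyone trying to reproduce this: one way to cross-check the kernel log messages above against the bonding driver's own state is to look at /proc/net/bonding/bond0 after disabling the port. Below is a minimal Python sketch that pulls out the active slave and the per-slave MII status. The helper name parse_bonding_status is hypothetical, and the SAMPLE text is only illustrative of the file's usual layout; it was not captured from the setup described above.

```python
# Minimal sketch: extract the active slave and per-slave link status
# from the text of /proc/net/bonding/<bond>.
# SAMPLE is illustrative only, not output from the system in this post.

SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: ib1
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: ib0
MII Status: down
Link Failure Count: 1

Slave Interface: ib1
MII Status: up
Link Failure Count: 0
"""


def parse_bonding_status(text):
    """Return {'active_slave': str or None, 'slaves': {name: mii_status}}."""
    status = {"active_slave": None, "slaves": {}}
    current = None  # slave section we are currently inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Currently Active Slave:"):
            status["active_slave"] = line.split(":", 1)[1].strip()
        elif line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
            status["slaves"][current] = None
        elif line.startswith("MII Status:") and current is not None:
            # The bond-level "MII Status" line appears before any slave
            # section, so it is skipped while current is still None.
            status["slaves"][current] = line.split(":", 1)[1].strip()
    return status


status = parse_bonding_status(SAMPLE)
print("active slave:", status["active_slave"])
for name, mii in status["slaves"].items():
    print(name, "->", mii)
```

<br><br>On a live system the same function could be fed the contents of /proc/net/bonding/bond0 instead of SAMPLE. If it reports ib1 as active while the pings still stop, that would suggest the bonding switch itself happened and the problem lies elsewhere, which seems consistent with the log lines quoted above.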