[ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4

Or Gerlitz ogerlitz at voltaire.com
Wed Dec 5 23:49:20 PST 2007


Vu Pham wrote:
> My systems are RHEL 5.1 x86-64, 2 Sinai HCAs, fw 1.2.0
> I set up bonding as follows:
> IPOIBBOND_ENABLE=yes
> IPOIB_BONDS=bond0
> bond0_IP=11.1.1.1
> bond0_SLAVEs=ib0,ib1
> in /etc/infiniband/openib.conf in order to start ib-bond automatically

Hi Vu,

Please note that in RH5 there is native support for bonding 
configuration through the initscripts tools (network scripts, etc.); see 
section 3.1.2 of the ib-bonding.txt document provided with the bonding 
package.
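
For reference, the section 3.1.2 scheme boils down to plain ifcfg files. 
Here is a rough sketch of what it could look like for your bond0/ib0/ib1 
setup (the exact file contents and bonding module options below are my 
assumptions; please take the authoritative syntax from ib-bonding.txt 
itself):

/etc/sysconfig/network-scripts/ifcfg-bond0:
    DEVICE=bond0
    IPADDR=11.1.1.1
    NETMASK=255.255.255.0    # assuming the /24 you mention below
    ONBOOT=yes
    BOOTPROTO=none

/etc/sysconfig/network-scripts/ifcfg-ib0 (and the same for ifcfg-ib1, 
with DEVICE=ib1):
    DEVICE=ib0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

/etc/modprobe.conf (ipoib bonding must run in active-backup mode):
    alias bond0 bonding
    options bond0 mode=active-backup miimon=100 max_bonds=1

Note that with this scheme only bond0 itself carries the IP address; ib0 
and ib1 are pure slaves with no IP configuration of their own.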

The persistency mechanism you have used (e.g. through 
/etc/init.d/openibd and /etc/openib.conf) is there only for older 
distributions which have no native (*) support for bonding 
configuration. Actually, I was thinking we wanted to remove it 
altogether, Moni?

(*) under RH4 the native support is broken for ipoib/bonding, and hence 
we patched some of the initscripts.

> I moved our systems back to ofed-1.2.5.4 and tested ib-bond again. We 
> tested it with ib0 and ib1 (connected to different switch/fabric) being 
> on the same subnet (10.2.1.x, 255.255.255.0) and on different subnets 
> (10.2.1.x and 10.3.1.x, 255.255.255.0). In both cases there is the issue 
> of losing communication between the servers if the nodes are not on 
> the same primary ib interface.

Generally speaking, I don't see the point in using bonding for 
--high-availability-- where each slave is connected to a different 
fabric. This is because when there's a fail-over on one system, you also 
need the second system to fail over; you would also not be able to count 
on local link detection mechanisms, since the remote node now must fail 
over even though its local link is perfectly fine. This is true 
regardless of the interconnect type.

Am I missing something here regarding your setup?

The question of the use case for bonding over separate fabrics has been 
brought to me several times and I gave this answer; no one has ever 
tried to educate me as to why it's interesting, maybe you will do so...

Also, what do you mean by "ib0 and ib1 being on the same/different 
subnets"? It's only the master device (e.g. bond0, bond1, etc.) that has 
an association/configuration with an IP subnet, correct?

> 1. original state: ib0's are the primary on both servers - pinging bond0 
> between the servers is fine

> 2. fail ib0 on one of the servers (ib1 becomes primary on this server) - 
> pinging bond0 between the servers fails
Sure, because there's no reason for the remote bonding to issue a fail-over.

> 3. fail ib0 on the second server (ib1 becomes primary) - pinging bond0 
> between the servers is fine again
Indeed.
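
As a side note, when you rerun this test it is worth watching which 
slave each bond currently considers active. A small sketch, assuming the 
standard bonding driver proc/sysfs interface in active-backup mode:

    # show the mode, the currently active slave and the per-slave
    # link state as seen by the bonding driver
    cat /proc/net/bonding/bond0

    # the active slave is also exposed via sysfs...
    cat /sys/class/net/bond0/bonding/active_slave

    # ...and can be changed by hand, to force a fail-over without
    # pulling a cable
    echo ib1 > /sys/class/net/bond0/bonding/active_slave

Comparing the active slave on both servers at steps 1-3 should show 
exactly when the two ends disagree, which is when the ping stops working.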

Or.



