[ofa-general] Re: ipoib / bonding and OFED

Bob Kossey bob.kossey at hp.com
Wed Jun 6 11:53:28 PDT 2007


Just to follow up on this: using RHEL5 and OFED 1.2 rc4, I was able
to do enough rudimentary testing to convince myself that IB
bonding was working.  I was able to use ib-bond, and I was also able
to use the openib.conf file to enable bonding on startup,
with IPOIBBOND_ENABLE and IPOIBHA_ENABLE each tested separately.

One thing I was not able to do, however, was to start IB bonding
using the standard bonding modifications to /etc/modprobe.conf
and the /etc/sysconfig/network-scripts/ifcfg* files.  Should this be
possible, and are there perhaps some required settings I am missing?
I'll include my file modifications and some output below.

modprobe.conf:
alias bond0 bonding
options bond0 mode=active-backup miimon=100

ifcfg-bond0:
DEVICE=bond0
IPADDR="172.22.0.23"
NETMASK="255.255.0.0"
NETWORK="172.22.0.0"
BROADCAST="172.22.255.255"
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_SLAVE0=ib0
BONDING_SLAVE0=ib1

ifcfg-ib0:
DEVICE=ib0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

ifcfg-ib1:
DEVICE=ib1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
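
One detail worth checking in the listings above: ifcfg-bond0 assigns
BONDING_SLAVE0 twice. If the file is read as plain KEY=value shell
assignments, the later value (ib1) replaces the earlier one (ib0). The
sketch below is only an illustration of how such repeats could be flagged;
the function name find_duplicate_keys is hypothetical and is not part of
OFED or the RHEL network scripts, and the sketch makes no claim about
which keys those scripts actually read.

```python
def find_duplicate_keys(text):
    """Return {key: [values...]} for every key assigned more than once.

    Assumption: the input uses simple KEY=value lines, as in the
    ifcfg listings above; comments and blank lines are skipped.
    """
    seen = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        seen.setdefault(key.strip(), []).append(value.strip().strip('"'))
    return {key: values for key, values in seen.items() if len(values) > 1}


# Contents copied from the ifcfg-bond0 listing above.
IFCFG_BOND0 = """\
DEVICE=bond0
IPADDR="172.22.0.23"
NETMASK="255.255.0.0"
NETWORK="172.22.0.0"
BROADCAST="172.22.255.255"
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_SLAVE0=ib0
BONDING_SLAVE0=ib1
"""

print(find_duplicate_keys(IFCFG_BOND0))
# prints {'BONDING_SLAVE0': ['ib0', 'ib1']}
```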

[root at njxc6-rhel5 ~]# ifconfig
bond0     Link encap:InfiniBand  HWaddr 80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
         inet addr:172.22.0.23  Bcast:172.22.255.255  Mask:255.255.0.0
         UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
         RX packets:0 errors:0 dropped:0 overruns:0 frame:0
         TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:0 (0.0 b)  TX bytes:1352 (1.3 KiB)
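
For reference, the 20-byte HWaddr reported for bond0 above can be split
into its fields. This is a minimal sketch, assuming the IPoIB link-layer
address layout from RFC 4391 (one flags byte, a 3-byte queue pair number,
then a 16-byte GID); the function name parse_ipoib_hwaddr is hypothetical.

```python
def parse_ipoib_hwaddr(hwaddr):
    """Split a colon-separated 20-byte IPoIB address into (flags, qpn, gid).

    Assumed layout (RFC 4391): 1 byte flags, 3 bytes QPN, 16 bytes GID.
    """
    octets = bytes(int(part, 16) for part in hwaddr.split(":"))
    if len(octets) != 20:
        raise ValueError("expected 20 bytes, got %d" % len(octets))
    flags = octets[0]
    qpn = int.from_bytes(octets[1:4], "big")
    gid = octets[4:]
    return flags, qpn, gid


# Address taken from the ifconfig output above.
flags, qpn, gid = parse_ipoib_hwaddr(
    "80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00"
)
print(hex(flags), hex(qpn), gid.hex())
```

With this address the GID starts with the link-local prefix fe80 and the
remaining bytes are zero, exactly as shown in the output above.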

dmesg:
...
Ethernet Channel Bonding Driver: v3.1.1 (September 26, 2006)
bonding: MII link monitoring set to 100 ms
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave ib0.
bonding: bond0: Warning: enslaved VLAN challenged slave ib0. Adding VLANs will be blocked as long as ib0 is part of bond bond0
bonding: bond0: Warning: The first slave device you specified does not support setting the MAC address. This bond MAC address would be that of the active slave.
ADDRCONF(NETDEV_UP): ib0: link is not ready
bonding: bond0: Warning: failed to get speed and duplex from ib0, assumed to be 100Mb/sec and Full.
bonding: bond0: making interface ib0 the new active one.
bondingbond_send_grat_arp: bond bond0 slave ib0
bonding: bond0: first active interface up!
bonding: bond0: enslaving ib0 as an active interface with an up link.
bonding: bond0: Adding slave ib1.
bonding: bond0: Warning: enslaved VLAN challenged slave ib1. Adding VLANs will be blocked as long as ib1 is part of bond bond0
ADDRCONF(NETDEV_UP): ib1: link is not ready
bonding: bond0: Warning: failed to get speed and duplex from ib1, assumed to be 100Mb/sec and Full.
bonding: bond0: enslaving ib1 as a backup interface with an up link.
ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
...
bonding: bond0: Interface ib0 is already enslaved!
ib0: enabling connected mode will cause multicast packet drops
ib0: mtu > 2044 will cause multicast packet drops.
bonding: bond0: link status definitely down for interface ib0, disabling it
bonding: bond0: making interface ib1 the new active one.
bondingbond_send_grat_arp: bond bond0 slave ib1
bonding: bond0: Interface ib1 is already enslaved!
ib1: enabling connected mode will cause multicast packet drops
ib1: mtu > 2044 will cause multicast packet drops.
bonding: bond0: link status definitely down for interface ib1, disabling it
bondingbond_send_grat_arp: bond bond0 slave NULL
bonding: bond0: now running without any active interface !
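
The sequence in the log above (ib0 active, ib0 down, ib1 active, ib1 down,
no active interface) can be pulled out mechanically. The following is a
rough sketch, assuming the message wording shown in this dmesg output; the
patterns and the function name trace_active_slave are illustrative only
and are not taken from the bonding driver.

```python
import re

# Patterns written against the message text in the dmesg output above.
ACTIVE = re.compile(r"bonding: (\w+): making interface (\w+) the new active one")
DOWN = re.compile(r"bonding: (\w+): link status definitely down for interface (\w+)")
NO_ACTIVE = re.compile(r"bonding: (\w+): now running without any active interface")


def trace_active_slave(lines):
    """Return a list of (event, slave) tuples in log order."""
    history = []
    for line in lines:
        match = ACTIVE.search(line)
        if match:
            history.append(("active", match.group(2)))
            continue
        match = DOWN.search(line)
        if match:
            history.append(("down", match.group(2)))
            continue
        if NO_ACTIVE.search(line):
            history.append(("no-active", None))
    return history


# Lines copied from the dmesg output above.
SAMPLE = [
    "bonding: bond0: making interface ib0 the new active one.",
    "bonding: bond0: link status definitely down for interface ib0, disabling it",
    "bonding: bond0: making interface ib1 the new active one.",
    "bonding: bond0: link status definitely down for interface ib1, disabling it",
    "bonding: bond0: now running without any active interface !",
]

for event in trace_active_slave(SAMPLE):
    print(event)
```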

Thanks,
Bob

Scott Weitzenkamp (sweitzen) wrote:
> Bob, it is now possible to configure IPoIB bonding in
> /etc/infiniband/openib.conf; this configuration file includes the
> following boilerplate:
>
> # Enable the bonding driver on startup
> IPOIBBOND_ENABLE=no
> # Set bond interface names
> #IPOIB_BONDS=bond0,bond1
> # Set specific bond params; address and slaves
> #bond0_IP=10.10.10.1
> #bond0_SLAVES=ib0,ib1
> #bond1_IP=20.10.10.1
> #bond1_SLAVES=ib2,ib3,ib4
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>  
>
>   
>> -----Original Message-----
>> From: general-bounces at lists.openfabrics.org 
>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Or Gerlitz
>> Sent: Tuesday, May 29, 2007 12:56 AM
>> To: Bob Kossey
>> Cc: OpenFabrics General
>> Subject: [ofa-general] Re: ipoib / bonding and OFED
>>
>> Bob Kossey wrote:
>>     
>>> I copied Or since I think this is related to his OFED HA work, and
>>> he might have some insights.  A few more questions for Or:
>>> I was trying to use ipoib bonding with OFED 1.2 rc2 and a 2.6.9 kernel,
>>> but was not able to get it to work so far.  I saw your Sonoma bonding
>>> slides, and you mention kernel bonding driver changes were needed.
>>> 2. Is there a minimum kernel version, with the kernel bonding driver
>>> changes, that is required to use bonding with OFED ipoib?
>>>       
>> Just to have a baseline here: to get bonding to work with IPoIB, you
>> should use the bonding driver provided with OFED 1.2. This driver is the
>> upstream one (of 2.6.20), patched to support IPoIB and backported
>> to RH5, SLES10 and RH4 U3/4/5; other kernels are not supported.
>>
>> If you were using the OFED bonding on a system that matches the support
>> matrix, it should work. If you do have problems under this config, please
>> either open a bug at the OFED bugzilla
>> @ bugs.openfabrics.org assigned to monis at voltaire.com (Moni Shoua) or
>> send a first report/question to Moni and CC ewg at lists.openfabrics.org
>>
>> Please note that between RC2 and RC4 (to be released today etc) some
>> bugs were fixed; you can search in the bugzilla to see what.
>>
>>     
>>> 3. The bonding driver uses the HWADDR from the underlying ipoib
>>> devices, how does it obtain the HWADDR?  Does it use the full 20 bytes,
>>> or some subset?
>>>       
>> When enslaving IPoIB devices, the bonding driver uses the full hw
>> address of the active slave; it simply looks at the dev_addr field of
>> the slave's struct net_device (see include/linux/netdevice.h)
>>
>>     
>>> 4. What use_carrier options for link status detection does OFED ipoib
>>> support,
>>> MII, ETHTOOL or netif_carrier_ok?
>>>       
>> The mii/ethtool etc. local link detection methods of the bonding driver
>> are somewhat deprecated, since nowadays almost every network device
>> supports the netif_carrier_ok call. The --default-- of the upstream
>> bonding driver (e.g. the one we use in OFED and the 2.6.21 one
>> listed below) is to set the use_carrier module param to 1; that is,
>> MII is not used anymore.
>>
>>     
>>> author:         Thomas Davis, tadavis at lbl.gov and many others
>>> description:    Ethernet Channel Bonding Driver, v3.1.2
>>> version:        3.1.2
>>> parm:           use_carrier:Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default) (int)
>>> parm:           miimon:Link check interval in milliseconds (int)
>>>
>>> If you have any good examples of bonding configuration settings that work
>>> with OFED, I'd appreciate that also.
>>>       
>> The bonding RPM provided with OFED is made of a driver, a script and some
>> help text containing usage examples; please take a look there and let me
>> know if you have further questions.
>>
>>     
>>> $ rpm -ql ib-bonding-0.9.0-2.6.9_42.ELsmp
>>> /lib/modules/2.6.9-42.ELsmp/updates/kernel/drivers/net/bonding/bonding.ko
>>> /usr/bin/ib-bond
>>> /usr/share/doc/ib-bonding-0.9.0/ib-bonding.txt
>>>       
>> The OFED service (/etc/init.d/openibd) was enhanced to allow for
>> --persistent-- bonding configuration; please see the bonding section in
>> docs/ipoib_release_notes.txt to see how to do it.
>>
>> Or.
>>
>> _______________________________________________
>> general mailing list
>> general at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>
>> To unsubscribe, please visit 
>> http://openib.org/mailman/listinfo/openib-general
>>
>>     




