[ofa-general] Re: ipoib / bonding and OFED

Scott Weitzenkamp (sweitzen) sweitzen at cisco.com
Wed Jun 6 11:55:28 PDT 2007


You should use openib.conf, not ifcfg-*, for configuring bonding at
boot time.
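
A minimal sketch of what that might look like, assuming the bond0
address and the ib0/ib1 slaves from the ifcfg files quoted below. The
variable names are taken from the /etc/infiniband/openib.conf
boilerplate quoted further down; the values are illustrative, and
whether a netmask or other settings must be supplied separately is not
covered here:

```shell
# /etc/infiniband/openib.conf (fragment)
# Enable the bonding driver on startup
IPOIBBOND_ENABLE=yes
# Set bond interface names
IPOIB_BONDS=bond0
# Set specific bond params; address and slaves
# (values assumed from the quoted ifcfg-bond0, not verified on a live system)
bond0_IP=172.22.0.23
bond0_SLAVES=ib0,ib1
```

With this in place, the ifcfg-bond0 file and the MASTER/SLAVE lines in
ifcfg-ib0 and ifcfg-ib1 would presumably no longer be used for bonding.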

Scott 

> -----Original Message-----
> From: Bob Kossey [mailto:bob.kossey at hp.com] 
> Sent: Wednesday, June 06, 2007 11:53 AM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Or Gerlitz; OpenFabrics General
> Subject: Re: [ofa-general] Re: ipoib / bonding and OFED
> 
> Just to follow up on this, using RHEL5 and OFED 1.2 rc4, I was able
> to do enough rudimentary testing to convince myself that IB
> bonding was working.  I was able to use ib-bond, as well
> as the openib.conf file to enable bonding on startup,
> including both (separately) IPOIBBOND_ENABLE and IPOIBHA_ENABLE.
> 
> One thing I was not able to do however, was to start IB bonding
> using the standard bonding modifications to /etc/modprobe.conf
> and /etc/sysconfig/network-scripts/ifcfg* files.  Should this be possible,
> and are there perhaps some required settings I am missing?  I'll
> include my file modifications and some output below.
> 
> modprobe.conf:
> alias bond0 bonding
> options bond0 mode=active-backup miimon=100
> 
> ifcfg-bond0:
> DEVICE=bond0
> IPADDR="172.22.0.23"
> NETMASK="255.255.0.0"
> NETWORK="172.22.0.0"
> BROADCAST="172.22.255.255"
> ONBOOT=yes
> BOOTPROTO=none
> USERCTL=no
> BONDING_SLAVE0=ib0
> BONDING_SLAVE0=ib1
> 
> ifcfg-ib0:
> DEVICE=ib0
> USERCTL=no
> ONBOOT=yes
> MASTER=bond0
> SLAVE=yes
> BOOTPROTO=none
> 
> ifcfg-ib1:
> DEVICE=ib1
> USERCTL=no
> ONBOOT=yes
> MASTER=bond0
> SLAVE=yes
> BOOTPROTO=none
> 
> [root@njxc6-rhel5 ~]# ifconfig
> bond0     Link encap:InfiniBand  HWaddr 80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>          inet addr:172.22.0.23  Bcast:172.22.255.255  Mask:255.255.0.0
>          UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:0 (0.0 b)  TX bytes:1352 (1.3 KiB)
> 
> dmesg:
> ...
> Ethernet Channel Bonding Driver: v3.1.1 (September 26, 2006)
> bonding: MII link monitoring set to 100 ms
> ADDRCONF(NETDEV_UP): bond0: link is not ready
> bonding: bond0: Adding slave ib0.
> bonding: bond0: Warning: enslaved VLAN challenged slave ib0. Adding VLANs will be blocked as long as ib0 is part of bond bond0
> bonding: bond0: Warning: The first slave device you specified does not support setting the MAC address. This bond MAC address would be that of the active slave.
> ADDRCONF(NETDEV_UP): ib0: link is not ready
> bonding: bond0: Warning: failed to get speed and duplex from ib0, assumed to be 100Mb/sec and Full.
> bonding: bond0: making interface ib0 the new active one.
> bondingbond_send_grat_arp: bond bond0 slave ib0
> bonding: bond0: first active interface up!
> bonding: bond0: enslaving ib0 as an active interface with an up link.
> bonding: bond0: Adding slave ib1.
> bonding: bond0: Warning: enslaved VLAN challenged slave ib1. Adding VLANs will be blocked as long as ib1 is part of bond bond0
> ADDRCONF(NETDEV_UP): ib1: link is not ready
> bonding: bond0: Warning: failed to get speed and duplex from ib1, assumed to be 100Mb/sec and Full.
> bonding: bond0: enslaving ib1 as a backup interface with an up link.
> ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
> ...
> bonding: bond0: Interface ib0 is already enslaved!
> ib0: enabling connected mode will cause multicast packet drops
> ib0: mtu > 2044 will cause multicast packet drops.
> bonding: bond0: link status definitely down for interface ib0, disabling it
> bonding: bond0: making interface ib1 the new active one.
> bondingbond_send_grat_arp: bond bond0 slave ib1
> bonding: bond0: Interface ib1 is already enslaved!
> ib1: enabling connected mode will cause multicast packet drops
> ib1: mtu > 2044 will cause multicast packet drops.
> bonding: bond0: link status definitely down for interface ib1, disabling it
> bondingbond_send_grat_arp: bond bond0 slave NULL
> bonding: bond0: now running without any active interface !
> 
> Thanks,
> Bob
> 
> Scott Weitzenkamp (sweitzen) wrote:
> > Bob, it is now possible to configure IPoIB bonding in
> > /etc/infiniband/openib.conf. This configuration file includes the
> > following boilerplate:
> >
> > # Enable the bonding driver on startup
> > IPOIBBOND_ENABLE=no
> > # Set bond interface names
> > #IPOIB_BONDS=bond0,bond1
> > # Set specific bond params; address and slaves
> > #bond0_IP=10.10.10.1
> > #bond0_SLAVES=ib0,ib1
> > #bond1_IP=20.10.10.1
> > #bond1_SLAVES=ib2,ib3,ib4
> >
> > Scott Weitzenkamp
> > SQA and Release Manager
> > Server Virtualization Business Unit
> > Cisco Systems
> >  
> >
> >   
> >> -----Original Message-----
> >> From: general-bounces at lists.openfabrics.org
> >> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Or Gerlitz
> >> Sent: Tuesday, May 29, 2007 12:56 AM
> >> To: Bob Kossey
> >> Cc: OpenFabrics General
> >> Subject: [ofa-general] Re: ipoib / bonding and OFED
> >>
> >> Bob Kossey wrote:
> >>> I copied Or since I think this is related to his OFED HA work, and
> >>> he might have some insights.  A few more questions for Or:
> >>> I was trying to use ipoib bonding with OFED 1.2 rc2 and a 2.6.9 kernel,
> >>> but was not able to get it to work so far.  I saw your Sonoma bonding
> >>> slides, and you mention kernel bonding driver changes were needed.
> >>> 2. Is there a minimum kernel version, with the kernel bonding driver
> >>> changes, that is required to use bonding with OFED ipoib?
> >>
> >> Just to have a baseline here: to get bonding to work with IPoIB, you
> >> should use the bonding driver provided with OFED 1.2. This driver is
> >> the upstream one (of 2.6.20), patched to support IPoIB and backported
> >> to RH5, SLES10 and RH4 U3/4/5; other kernels are not supported.
> >>
> >> If you were using the OFED bonding driver on a system that matches the
> >> support matrix, it should work. If you do have problems under this
> >> config, please either open a bug at the OFED bugzilla
> >> @ bugs.openfabrics.org assigned to monis at voltaire.com (Moni Shoua) or
> >> first send a report/question to Moni and CC ewg at lists.openfabrics.org
> >>
> >> Please note that between RC2 and RC4 (to be released today etc) some
> >> bugs were fixed; you can search the bugzilla to see which.
> >>
> >>> 3. The bonding driver uses the HWADDR from the underlying ipoib
> >>> devices, how does it obtain the HWADDR?  Does it use the full 20 bytes,
> >>> or some subset?
> >>
> >> When enslaving IPoIB devices, the bonding driver uses the full hw
> >> address of the active slave; it simply looks at the dev_addr field of
> >> the slave's struct net_device (see include/linux/netdevice.h).
> >>
> >>> 4. What use_carrier options for link status detection does OFED ipoib
> >>> support: MII, ETHTOOL or netif_carrier_ok?
> >>
> >> The MII/ethtool etc. local link detection methods of the bonding driver
> >> are somewhat deprecated, since nowadays almost any network device
> >> supports the netif_carrier_ok call. The default of the upstream
> >> bonding driver (e.g. the one we use in OFED and the 2.6.21 one
> >> listed below) is to set the use_carrier module parameter to 1, that is,
> >> MII is not used anymore.
> >>
> >>> author:         Thomas Davis, tadavis at lbl.gov and many others
> >>> description:    Ethernet Channel Bonding Driver, v3.1.2
> >>> version:        3.1.2
> >>> parm:           use_carrier:Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default) (int)
> >>> parm:           miimon:Link check interval in milliseconds (int)
> >>>
> >>> If you have any good examples of bonding configuration settings that work
> >>> with OFED, I'd appreciate that also.
> >>
> >> The bonding RPM provided with OFED is made of a driver, a script and some
> >> help text containing usage examples; please take a look there and let me
> >> know if you have further questions.
> >>
> >>> $ rpm -ql ib-bonding-0.9.0-2.6.9_42.ELsmp
> >>> /lib/modules/2.6.9-42.ELsmp/updates/kernel/drivers/net/bonding/bonding.ko
> >>> /usr/bin/ib-bond
> >>> /usr/share/doc/ib-bonding-0.9.0/ib-bonding.txt
> >>
> >> The OFED service (/etc/init.d/openibd) was enhanced to allow for
> >> persistent bonding configuration; please see the bonding section of
> >> docs/ipoib_release_notes.txt to see how to do it.
> >>
> >> Or.
> >>
> 


