[ewg] IPoIB_HA not working properly with OFED1.2-alpha

Karun Sharma karun.sharma at qlogic.com
Mon Feb 19 01:50:58 PST 2007


Hi Vlad:
 
I configured IPoIB HA with the OFED1.2-alpha release on a RHEL4up4 machine with both ports up, and it is not working for me. Before I configured IPoIB HA, both IB interfaces were able to ping the other machine.
 
Then I ran the ipoib_ha.pl script and configured ib0 as the primary and ib1 as the secondary interface. The IP address of the ib1 interface was removed, and up to this point things seemed to be working fine.
 
The problem started when I pulled the IB cable connected to port 1. I can see the ib1 interface taking over the IP address of the ib0 interface, but ping does not work after that. Even after I reinsert the cable in port 1, ping still does not work. I have included some logs below, and I have filed bug #371 for this issue.
 
Thanks
Karun
 
################################################################
[root at ss27 ~]# ibv_devinfo
hca_id: mthca0
        fw_ver:                         5.1.400
        node_guid:                      0006:6a00:9800:6b90
        sys_image_guid:                 0006:6a00:9800:6b90
        vendor_id:                      0x066a
        vendor_part_id:                 25218
        hw_ver:                         0xA0
        board_id:                       SS_0000000002
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 6
                        port_lid:               2
                        port_lmc:               0x00
                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 6
                        port_lid:               3
                        port_lmc:               0x00
[root at ss27 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:A0:D1:E4:53:DA  
          inet addr:172.20.50.227  Bcast:172.20.50.255  Mask:255.255.255.0
          inet6 addr: fe80::2a0:d1ff:fee4:53da/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:125 errors:0 dropped:0 overruns:0 frame:0
          TX packets:115 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:17236 (16.8 KiB)  TX bytes:15347 (14.9 KiB)
          Interrupt:201 
ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:172.20.51.227  Bcast:172.20.51.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
ib1       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:172.20.52.227  Bcast:172.20.52.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1543 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1543 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1648528 (1.5 MiB)  TX bytes:1648528 (1.5 MiB)
[root at ss27 ~]# ping 172.20.51.226 -c 1
PING 172.20.51.226 (172.20.51.226) 56(84) bytes of data.
64 bytes from 172.20.51.226: icmp_seq=0 ttl=64 time=1.44 ms
--- 172.20.51.226 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.442/1.442/1.442/0.000 ms, pipe 2
[root at ss27 ~]# ping 172.20.52.226 -c 1
PING 172.20.52.226 (172.20.52.226) 56(84) bytes of data.
64 bytes from 172.20.52.226: icmp_seq=0 ttl=64 time=1.67 ms
--- 172.20.52.226 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.671/1.671/1.671/0.000 ms, pipe 2
[root at ss27 ~]# 
 
[root at ss27 ~]# ipoib_ha.pl -p ib0 -s ib1 --with-arping -vv
get_cfg: Got /etc/sysconfig/network-scripts/ifcfg-ib0

Date:Mon Feb 19 02:32:22 2007
ib0:
======================================
BOOTPROTO = static
status = 
HA = 0
DEVICE = ib0
NETMASK = 255.255.255.0
BROADCAST = 172.20.51.255
IPADDR = 172.20.51.227
NETWORK = 172.20.51.0
ONBOOT = yes
pkey = ffff

Date:Mon Feb 19 02:32:22 2007
Bond:
======================================
BOOTPROTO = static
status = 
HA = 0
DEVICE = ib0
NETMASK = 255.255.255.0
BROADCAST = 172.20.51.255
IPADDR = 172.20.51.227
NETWORK = 172.20.51.0
ONBOOT = yes
pkey = ffff

Date:Mon Feb 19 02:32:23 2007
Got NO-CARRIER event on ib0.
Interface ib0 is down.
Currently Active : ib0
Other device: ib1 is UP
migrate_conf: Migrating from ib0 to ib1
Date:Mon Feb 19 02:33:37 2007
Date:Mon Feb 19 02:33:37 2007
set_up_bond: Going to set up ib1 with 172.20.51.227
set_up_bond: Arping ib1 172.20.51.227.
Got CARRIER-ON event on ib1.
Got CARRIER-ON event on ib1.
Got CARRIER-ON event on ib1.
Got NO-CARRIER event on ib0.
Interface ib0 is down.
Currently Active : ib1
Got CARRIER-ON event on ib1.
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib0.
Got NO-CARRIER event on ib1.
Interface ib1 is down.
Currently Active : ib1
Other device: ib0 is UP
migrate_conf: Migrating from ib1 to ib0
Date:Mon Feb 19 02:35:48 2007
Date:Mon Feb 19 02:35:48 2007
set_up_bond: Going to set up ib0 with 172.20.51.227
set_up_bond: Arping ib0 172.20.51.227.
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib1.

[root at ss27 ~]# 
#######################################################
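
For anyone who wants to follow the failover sequence in a longer run, here is a minimal sketch of a log summarizer. It assumes the -vv output of ipoib_ha.pl has been saved as text and that the message formats match the ones shown above. The function name summarize and the sample excerpt are hypothetical and are not part of OFED.

```python
import re

# Message formats copied from the ipoib_ha.pl -vv output shown above.
EVENT_RE = re.compile(r"Got (NO-CARRIER|CARRIER-ON) event on (\w+)\.")
MIGRATE_RE = re.compile(r"migrate_conf: Migrating from (\w+) to (\w+)")


def summarize(log_text):
    """Summarize carrier events and address migrations in the log text.

    Hypothetical helper: returns (link_state, migrations), where
    link_state maps each interface to the last carrier event seen for it
    and migrations is the ordered list of (source, target) address moves.
    """
    link_state = {}
    migrations = []
    for line in log_text.splitlines():
        event = EVENT_RE.search(line)
        if event:
            # group(1) is the event type, group(2) the interface name
            link_state[event.group(2)] = event.group(1)
            continue
        move = MIGRATE_RE.search(line)
        if move:
            migrations.append((move.group(1), move.group(2)))
    return link_state, migrations


if __name__ == "__main__":
    # Condensed excerpt of the log above, used only as sample input.
    sample = (
        "Got NO-CARRIER event on ib0.\n"
        "migrate_conf: Migrating from ib0 to ib1\n"
        "Got CARRIER-ON event on ib1.\n"
        "Got NO-CARRIER event on ib1.\n"
        "migrate_conf: Migrating from ib1 to ib0\n"
        "Got CARRIER-ON event on ib0.\n"
    )
    state, moves = summarize(sample)
    print("last carrier event per interface:", state)
    print("address migrations in order:", moves)
```

This only reports what the script logged. It does not check whether the peer actually learned the new hardware address after the arping step, which is where the ping failure most likely has to be investigated.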