[openfabrics-ewg] [openib-general] OFED 1.1 RC7

Sharma, Karun ksharma at silverstorm.com
Mon Oct 9 22:28:09 PDT 2006


Hi Vlad:
 
I tried to bring up IPoIB HA on SLES 10 servers that have both HCA ports up. I also modified the ifcfg-ib1 file to use the same IP address as the ib0 interface, so basically ifcfg-ib0 and ifcfg-ib1 are identical.
Then I started a continuous ping. Even after I brought the ib0 interface down, the ping traffic did not switch to the ib1 interface. Please have a look at the logs below. If you need any additional information, please let me know.
 
Also, let me know if I have missed anything.
 
Thanks,
Karun
 
#######################################################################
ss22:~ # ipoib_ha.pl -p ib0 -s ib1 --with-arping -v
Date:Tue Oct 10 10:38:43 2006
ib0:
======================================
BOOTPROTO = static
WIRELESS = no
REMOTE_IPADDR = 
status = 
HA = 0
DEVICE = ib0
NETMASK = 255.255.255.0
BROADCAST = 172.20.51.255
STARTMODE = onboot
IPADDR = 172.20.51.222
NETWORK = 172.20.51.0
Date:Tue Oct 10 10:38:43 2006
Bond:
======================================
BOOTPROTO = static
WIRELESS = no
REMOTE_IPADDR = 
status = 
HA = 0
DEVICE = ib0
NETMASK = 255.255.255.0
BROADCAST = 172.20.51.255
STARTMODE = onboot
IPADDR = 172.20.51.222
NETWORK = 172.20.51.0
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib0.
---->>> Here I brought the ib0 interface down and back up.
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib0.
Got CARRIER-ON event on ib0.
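
In the verbose output above, only CARRIER-ON events appear, even around the point where ib0 was taken down and brought back up. A minimal sketch for tallying event types from a saved copy of that output follows. The function name count_events and the file name ha.log are hypothetical, and the match string assumes the "Got <EVENT> event on <iface>." line format seen in the logs in this thread.

```shell
# Count how many times ipoib_ha.pl reported a given event.
# Reads the saved verbose output on stdin.
# $1: event name as printed in the log, e.g. CARRIER-ON or NO-CARRIER
count_events() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c -- "Got $1 event" || true
}

# Usage (ha.log is a hypothetical file holding the script output):
#   count_events CARRIER-ON < ha.log
#   count_events NO-CARRIER < ha.log
```

A NO-CARRIER count of zero across a down/up cycle would suggest the script never saw the link drop, which is worth checking before looking at the migration logic itself.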

#############################################
ss22:~ # cat /etc/sysconfig/network/ifcfg-ib0
BOOTPROTO='static'
IPADDR='172.20.51.222'
NETMASK='255.255.255.0'
NETWORK='172.20.51.0'
BROADCAST='172.20.51.255'
REMOTE_IPADDR=''
STARTMODE='onboot'
WIRELESS='no'
 
ss22:~ # cat /etc/sysconfig/network/ifcfg-ib1
BOOTPROTO='static'
IPADDR='172.20.51.222'
NETMASK='255.255.255.0'
NETWORK='172.20.51.0'
BROADCAST='172.20.51.255'
REMOTE_IPADDR=''
STARTMODE='onboot'
WIRELESS='no'
ss22:~ # 
#############################################
ss22:~ # ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B1:43:38  
          inet addr:172.20.50.222  Bcast:172.20.50.255  Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:feb1:4338/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:396 errors:0 dropped:0 overruns:0 frame:0
          TX packets:388 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:48056 (46.9 Kb)  TX bytes:49190 (48.0 Kb)
          Base address:0xdc00 Memory:fcfa0000-fcfc0000 
ib0       Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet addr:172.20.51.222  Bcast:172.20.51.255  Mask:255.255.255.0
          inet6 addr: fe80::206:6a00:a000:399/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:48 errors:0 dropped:0 overruns:0 frame:0
          TX packets:62 errors:0 dropped:1 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:3752 (3.6 Kb)  TX bytes:5052 (4.9 Kb)
ib1       Link encap:UNSPEC  HWaddr 00-00-04-05-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet6 addr: fe80::206:6a01:a000:399/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:5 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:296 (296.0 b)  TX bytes:456 (456.0 b)
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:652 (652.0 b)  TX bytes:652 (652.0 b)
ss22:~ # 
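
The ifconfig output above shows 172.20.51.222 on ib0 only, with no IPv4 address on ib1. To see whether the address moves during a failover test, it can help to check which interface holds it while the ping is running. The following is a sketch under assumptions: it expects the iproute2 `ip` tool to be installed, and the helper name iface_for_ip is hypothetical.

```shell
# Print the name of every interface that holds the given IPv4 address.
# Reads "ip -o -4 addr show" style lines on stdin, so it also works on
# saved output.
# $1: IPv4 address without the prefix length
iface_for_ip() {
    awk -v ip="$1" '
        {
            # In one-line mode the interface name is field 2 and the
            # address follows the "inet" keyword as ADDR/PREFIXLEN.
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), a, "/")
                    if (a[1] == ip) print $2
                }
            }
        }'
}

# Usage on a live system:
#   ip -o -4 addr show | iface_for_ip 172.20.51.222
```

Running this before and after taking ib0 down shows whether the address was ever migrated to ib1.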

##############################################################################


________________________________

From: Vladimir Sokolovsky [mailto:vlad at mellanox.co.il]
Sent: Mon 10/9/2006 9:21 AM
To: Sharma, Karun
Cc: EWG
Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 RC7



Hi Karun,
Both HCA ports should be connected to the same IB subnet.


Regards,
Vladimir
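
One rough way to check the same-subnet requirement is to compare the subnet prefix (the upper 64 bits) of each port's GID, which the kernel exposes under /sys/class/infiniband/<hca>/ports/<port>/gids/0. This is a sketch, not part of OFED: the helper names are hypothetical, and <hca> stands for whatever device name the system reports. A matching prefix is necessary but not sufficient, since separate fabrics that both use the default prefix fe80:0000:0000:0000 will also match.

```shell
# Extract the 64-bit subnet prefix (first four 16-bit groups) from a GID
# written as eight colon-separated groups, as sysfs prints it.
gid_prefix() {
    printf '%s\n' "$1" | cut -d: -f1-4
}

# Succeed (exit 0) when two GIDs carry the same subnet prefix.
same_subnet_prefix() {
    [ "$(gid_prefix "$1")" = "$(gid_prefix "$2")" ]
}

# Usage on a live system (replace <hca> with the local device name):
#   g1=$(cat /sys/class/infiniband/<hca>/ports/1/gids/0)
#   g2=$(cat /sys/class/infiniband/<hca>/ports/2/gids/0)
#   same_subnet_prefix "$g1" "$g2" && echo "prefixes match"
```

If the prefixes differ, the two ports are on different subnets and IPoIB HA cannot move the address between them.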

On Mon, 2006-10-09 at 07:58 -0400, Sharma, Karun wrote:
> Hi
> 
> I think that I am seeing bug # 247 with RC7.
> 
> I configured ipoib-ha as specified in the release notes on RHEL4 U3
> servers (x86_64 machines).
> I started a ping from one server. At the same time I executed the
> ipoib_ha.pl script (see below). Then I brought the ib0 interface down
> and expected the ping to recover after some time, but it never
> recovered. Then I brought the ib0 interface up again, and the ping
> recovered and was successful.
> 
> Please note that I have only 1 HCA port active. Do both HCA ports
> need to be up? Even with 1 HCA port, I am able to add
> and configure both the ib0 and ib1 interfaces. Is this a valid
> configuration? Is there any mapping between HCA ports and ib
> interfaces?
> 
> Thanks
> Karun
> 
> ############################################################ 
> [root at st70 ~]# ipoib_ha.pl -p ib0 -s ib1 --with-arping -v
> Date:Mon Oct  9 07:23:54 2006
> ib0:
> ======================================
> BOOTPROTO = static
> status =
> HA = 0
> DEVICE = ib0
> NETMASK = 255.255.240.0
> BROADCAST = 172.26.16.255
> IPADDR = 172.26.16.70
> NETWORK = 172.26.0.0
> ONBOOT = yes
> Date:Mon Oct  9 07:23:54 2006
> Bond:
> ======================================
> BOOTPROTO = static
> status =
> HA = 0
> DEVICE = ib0
> NETMASK = 255.255.240.0
> BROADCAST = 172.26.16.255
> IPADDR = 172.26.16.70
> NETWORK = 172.26.0.0
> ONBOOT = yes
> Got CARRIER-ON event on ib0.
> Got CARRIER-ON event on ib0.
> Got NO-CARRIER event on ib0.
> Got NO-CARRIER but ib0 is UP
> Interface ib0 is down.
> Currently Active : ib0
> Both interfaces are down
> Got CARRIER-ON event on ib0.
> migrate_conf: Migrating from ib1 to ib0
> Got CARRIER-ON event on ib0.
>
> ################################################################
>
> ______________________________________________________________________
> From: openib-general-bounces at openib.org on behalf of Aviram Gutman
> Sent: Thu 10/5/2006 11:39 AM
> To: EWG
> Cc: Openib-General at Openib.Org
> Subject: [openib-general] OFED 1.1 RC7
>
>
> OFED-1.1-rc7 is available on
> https://openib.org/svn/gen2/branches/1.1/ofed/releases/
> File: OFED-1.1-rc7.tgz
> Please report any issues in bugzilla http://openib.org/bugzilla/
>
>
> Release details:
> ================
> BUILD_ID:
> OFED-1.1-rc7
>
> openib-1.1 (REV=9725)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git: git://www.mellanox.co.il/~git/infiniband
> ref: refs/heads/ofed_1_1
> commit fde99a7a22e56d6aa90dae9db3d600755efcedb5
>
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
>
> Bug fixes from OFED-1.1-rc6:
> ===========================
> IPoIB HA:
>     BUG 247: OFED IPoIB HA not working on RHEL4 U3
>     BUG 259: problems with OFED IPoIB HA on SLES10
>
> IPATH:
>     BUG 252: Failed to load ib_ipath module (IPATH device is not
> present)
>
> EHCA:
>     BUG 250: libehca is not selectable although ib_ehca was selected
>
> SRP HA:
>     Use port_guid instead of node_guid.
>     Allows the user to set the identifier_extension when providing the
>     target attributes.
>
> ibutils:
>     BUG 243: ibutils/ibis build fails on SLES 10 / PPC64
>
> openib diags:
>     BUG 241: Diags build fails on SLES 10 PPC64
>
> Open MPI:
>     Fixed compilation issue on SLES10 PPC64
>
> mstflint :
>     SLES10 ppc workaround
>
>  Known issues:
> =============
>
> 1. IPoIB HA does not migrate IPoIB pkey interfaces (BUG 260)
> 2. kernel-ib conflicts with kernel-smp (Used --force flag in kernel-ib
> RPM installation as a workaround) (BUG 255)
>
> Let's try to get a final release on Wed or Thu next week.
>
> Aviram
>
>

