[openib-general] Have I got something very wrong here?

Greg Rudd G.Rudd at isu.usyd.edu.au
Fri Sep 1 22:19:49 PDT 2006


Hi all,

Sorry for sounding like a total tool on this list. After upgrading one
of my boxes to RHEL4 release 4 and installing the
2.6.9-42.0.2.ELhugemem kernel, my previously working IB interfaces, ib0
and ib1, can no longer talk IP. The interfaces can still be brought up
OK, but I am starting to get some interesting messages in dmesg:

ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib1: stopping interface
ib1: downing ib_dev
ib1: Freeing ah e88b1b20
ib1: All sends and receives done.
ip_tables: (C) 2000-2002 Netfilter core team
ib1: bringing up interface
ib1: Created ah e88cf960
ib0: Send unicast ARP to 002b

On bringing up the interfaces, these messages appear in dmesg:

ib0: Start path record lookup for fe80:0000:0000:0000:0013:21ff:ff75:3939
ib0: PathRec LID 0x002a for GID fe80:0000:0000:0000:0013:21ff:ff75:3939
ib0: Created ah e7f26600
ib0: created address handle e84f51c0 for LID 0x002a, SL 0
ib0: Send unicast ARP to 002a
ib0: Start path record lookup for fe80:0000:0000:0000:0013:21ff:ff75:399d
ib0: PathRec LID 0x002b for GID fe80:0000:0000:0000:0013:21ff:ff75:399d
ib0: Created ah e88806c0
ib0: created address handle e88806a0 for LID 0x002b, SL 0
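
To see at a glance which LID each GID resolved to, the PathRec lines can be pulled out of the log. This is only a minimal sketch: the helper name `pathrec_summary` is made up for illustration, and it assumes the exact message layout shown above, with no timestamp prefix in front of the interface name.

```shell
# pathrec_summary: read dmesg-style text on stdin and print
# "interface LID GID" for every IPoIB PathRec line.
# Hypothetical helper; assumes lines of the form
#   ib0: PathRec LID 0x002a for GID fe80:...
pathrec_summary() {
    awk '$2 == "PathRec" && $3 == "LID" {
        sub(/:$/, "", $1)      # strip the trailing colon from "ib0:"
        print $1, $4, $7       # interface, LID, GID
    }'
}

# Usage against the live system:
#   dmesg | pathrec_summary
```

Lines that do not match the PathRec layout are skipped, so the full dmesg output can be piped in unfiltered.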


If I am correct, Red Hat has totally changed the way the InfiniBand
drivers work in RHEL4 release 4.

What is interesting is that when I run /etc/init.d/openibd status, I get
the following:

./openibd status

  HCA driver loaded

Configured devices:
ib0 ib1

Currently active devices:
ib0
ib1

The following modules are also loaded:

        ib_cm
        ib_sdp

I note that ib_ipoib does not appear in this list, but lsmod shows that
it is loaded into the kernel:

[root@hippo init.d]# lsmod | grep -i ib
ib_sdp                 35153  0 
rdma_cm                26181  2 ib_sdp,rdma_ucm
ib_addr                11717  1 rdma_cm
ib_local_sa            15565  2 rdma_ucm,rdma_cm
findex                  8001  1 ib_local_sa
ib_mthca              132969  0 
ib_ipoib               50129  0 
ib_uverbs              40169  1 rdma_ucm
ib_umad                18929  0 
ib_ucm                 20549  0 
ib_sa                  17109  3 rdma_cm,ib_local_sa,ib_ipoib
ib_cm                  38444  2 rdma_cm,ib_ucm
ib_mad                 39385  5 ib_local_sa,ib_mthca,ib_umad,ib_sa,ib_cm
ib_core                49985  11 ib_sdp,rdma_cm,ib_local_sa,ib_mthca,ib_ipoib,ib_uverbs,ib_umad,ib_ucm,ib_sa,ib_cm,ib_mad
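
One way to cross-check the two listings is to test for a module by name in the lsmod output rather than relying on the init script's summary. This is a minimal sketch, assuming only a POSIX shell and awk; the helper name `module_loaded` is hypothetical and is not part of openibd.

```shell
# module_loaded NAME: read lsmod-style text on stdin and succeed
# (exit status 0) if NAME appears in the first column.
# Hypothetical helper for illustration only.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage against the live system:
#   lsmod | module_loaded ib_ipoib && echo "ib_ipoib is loaded"
```

Matching on the first column only avoids false positives from the "Used by" column, where ib_ipoib also appears as a dependent of ib_sa and ib_core.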


As for the InfiniBand RPMs installed, this is what I have at the moment:

kernel-ib-1.0-1
libmthca-1.0.2-1.i386
libsdp-0.9.0-1.i386
libibverbs-1.0.3-1.i386
libibverbs-utils-1.0.3-1.i386
libibcommon-1.0-1.i386
libibumad-1.0-1.i386
opensm-libs-1.2.0-1.i386
opensm-1.2.0-1.i386
libibcm-0.9.0-1.i386
libibmad-1.0-1.i386
openib-diags-1.0-1.i386
perftest-1.0-1.i386
tvflash-0.9.0-1.i386
srptools-0.0.4-1.i386
librdmacm-0.9.0-1.i386
mstflint-1.0-1.i386

To get the InfiniBand interfaces working as they did under 2.6.9-34,
with both ib0 and ib1 up, am I missing something very simple, such as an
RPM or a kernel module that has not been loaded? Or is there something
else happening here?
 


Extra details

ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.0.0.1  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:850 errors:0 dropped:0 overruns:0 frame:0
          TX packets:920 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:47600 (46.4 KiB)  TX bytes:55256 (53.9 KiB)

ib1       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.0.0.2  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)


Copy of /etc/modprobe.conf

alias eth0 tg3
alias eth1 tg3
alias bond0 bonding
options bonding mode=active-backup  miimon=100
alias scsi_hostadapter cciss
alias eth2 e1000
alias eth3 e1000
alias usb-controller ohci-hcd
alias ib0 ib_ipoib
alias ib1 ib_ipoib
alias net-pf-27 ib_sdp
options ib_ipoib debug_level=2
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }

alias scsi_hostadapter1 qla2xxx_conf
alias scsi_hostadapter2 qla2xxx
alias scsi_hostadapter3 qla2300
alias scsi_hostadapter4 qla2400
alias scsi_hostadapter5 qla6312
options qla2xxx  ql2xmaxqdepth=16 qlport_down_retry=30 ql2xloginretrycount=16 ql2xfailover=1 ql2xlbType=1 ql2xautorestore=0x80


ifcfg files in /etc/sysconfig/network-scripts

[root@hippo network-scripts]# more ifcfg-ib0 
DEVICE=ib0
BOOTPROTO=static
BROADCAST=10.255.255.255
IPADDR=10.0.0.1
NETMASK=255.0.0.0
ONBOOT=yes

[root@hippo network-scripts]# more ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
BROADCAST=10.255.255.255
IPADDR=10.0.0.2
NETMASK=255.0.0.0
ONBOOT=yes


Thanks in advance

-greg





