[openib-general] Have I got something very wrong here?
Greg Rudd
G.Rudd at isu.usyd.edu.au
Fri Sep 1 22:19:49 PDT 2006
Hi all, sorry for sounding like a total tool on this list, but after
upgrading one of my boxes to RHEL4 rel 4 and installing the
2.6.9-42.0.2.ELhugemem kernel, my previously working IB interfaces, ib0
and ib1, which used to be able to talk IP, can no longer do so. The
interfaces can still be brought up OK, and I am starting to get some
interesting messages in dmesg:
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib0: Send unicast ARP to 002b
ib1: stopping interface
ib1: downing ib_dev
ib1: Freeing ah e88b1b20
ib1: All sends and receives done.
ip_tables: (C) 2000-2002 Netfilter core team
ib1: bringing up interface
ib1: Created ah e88cf960
ib0: Send unicast ARP to 002b
On bringing up the interfaces, these messages appear in dmesg:
ib0: Start path record lookup for fe80:0000:0000:0000:0013:21ff:ff75:3939
ib0: PathRec LID 0x002a for GID fe80:0000:0000:0000:0013:21ff:ff75:3939
ib0: Created ah e7f26600
ib0: created address handle e84f51c0 for LID 0x002a, SL 0
ib0: Send unicast ARP to 002a
ib0: Start path record lookup for fe80:0000:0000:0000:0013:21ff:ff75:399d
ib0: PathRec LID 0x002b for GID fe80:0000:0000:0000:0013:21ff:ff75:399d
ib0: Created ah e88806c0
ib0: created address handle e88806a0 for LID 0x002b, SL 0
If I am correct, Red Hat has totally changed the way the InfiniBand
drivers work in RHEL4 rel 4.
What is interesting is that when I run /etc/init.d/openibd status, I get
the following:
./openibd status
HCA driver loaded
Configured devices:
ib0 ib1
Currently active devices:
ib0
ib1
The following modules are also loaded:
ib_cm
ib_sdp
I note that ib_ipoib does not appear in this list, but lsmod shows that
it is loaded into the kernel, as shown below:
[root@hippo init.d]# lsmod |grep -i ib
ib_sdp                 35153  0
rdma_cm                26181  2 ib_sdp,rdma_ucm
ib_addr                11717  1 rdma_cm
ib_local_sa            15565  2 rdma_ucm,rdma_cm
findex                  8001  1 ib_local_sa
ib_mthca              132969  0
ib_ipoib               50129  0
ib_uverbs              40169  1 rdma_ucm
ib_umad                18929  0
ib_ucm                 20549  0
ib_sa                  17109  3 rdma_cm,ib_local_sa,ib_ipoib
ib_cm                  38444  2 rdma_cm,ib_ucm
ib_mad                 39385  5 ib_local_sa,ib_mthca,ib_umad,ib_sa,ib_cm
ib_core                49985  11 ib_sdp,rdma_cm,ib_local_sa,ib_mthca,ib_ipoib,ib_uverbs,ib_umad,ib_ucm,ib_sa,ib_cm,ib_mad
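To cross-check a status script against what lsmod reports, the lsmod
lines can be parsed mechanically. The following is only a minimal sketch
in Python and is not part of the openibd tooling: the function name
parse_lsmod and the shortened SAMPLE text are illustrative assumptions,
with the sample lines copied from the listing above.

```python
def parse_lsmod(text):
    """Parse lsmod-style lines into {module: (size, use_count, [users])}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header, if present.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        name, size, use_count = fields[0], int(fields[1]), int(fields[2])
        # The fourth column, when present, is a comma-separated user list.
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[name] = (size, use_count, [u for u in users if u])
    return modules


# Hypothetical sample: three lines taken from the lsmod output above.
SAMPLE = """\
ib_ipoib 50129 0
ib_sa 17109 3 rdma_cm,ib_local_sa,ib_ipoib
ib_cm 38444 2 rdma_cm,ib_ucm
"""

mods = parse_lsmod(SAMPLE)
print("ib_ipoib loaded:", "ib_ipoib" in mods)
print("ib_sa users:", mods["ib_sa"][2])
```

On a live system the same function could be fed the text of lsmod
instead of SAMPLE; that step is left out here so the sketch stays
self-contained.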
As for the InfiniBand RPMs installed, this is what I have at the moment:
kernel-ib-1.0-1
libmthca-1.0.2-1.i386
libsdp-0.9.0-1.i386
libibverbs-1.0.3-1.i386
libibverbs-utils-1.0.3-1.i386
libibcommon-1.0-1.i386
libibumad-1.0-1.i386
opensm-libs-1.2.0-1.i386
opensm-1.2.0-1.i386
libibcm-0.9.0-1.i386
libibmad-1.0-1.i386
openib-diags-1.0-1.i386
perftest-1.0-1.i386
tvflash-0.9.0-1.i386
srptools-0.0.4-1.i386
librdmacm-0.9.0-1.i386
mstflint-1.0-1.i386
To get both ib0 and ib1 working as they did before under 2.6.9-34, am I
missing something very simple, such as an RPM or a kernel module that
has not been loaded? Or is there something else happening here?
Extra details
ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:10.0.0.1  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:850 errors:0 dropped:0 overruns:0 frame:0
          TX packets:920 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:47600 (46.4 KiB)  TX bytes:55256 (53.9 KiB)

ib1       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:10.0.0.2  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Copy of /etc/modprobe.conf
alias eth0 tg3
alias eth1 tg3
alias bond0 bonding
options bonding mode=active-backup miimon=100
alias scsi_hostadapter cciss
alias eth2 e1000
alias eth3 e1000
alias usb-controller ohci-hcd
alias ib0 ib_ipoib
alias ib1 ib_ipoib
alias net-pf-27 ib_sdp
options ib_ipoib debug_level=2
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
alias scsi_hostadapter1 qla2xxx_conf
alias scsi_hostadapter2 qla2xxx
alias scsi_hostadapter3 qla2300
alias scsi_hostadapter4 qla2400
alias scsi_hostadapter5 qla6312
options qla2xxx ql2xmaxqdepth=16 qlport_down_retry=30 ql2xloginretrycount=16 ql2xfailover=1 ql2xlbType=1 ql2xautorestore=0x80
ifcfg files in /etc/sysconfig/network-scripts
[root@hippo network-scripts]# more ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
BROADCAST=10.255.255.255
IPADDR=10.0.0.1
NETMASK=255.0.0.0
ONBOOT=yes
[root@hippo network-scripts]# more ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
BROADCAST=10.255.255.255
IPADDR=10.0.0.2
NETMASK=255.0.0.0
ONBOOT=yes
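
Both ifcfg files above use the same netmask, so ib0 and ib1 end up in
the same IP network. Whether that has anything to do with the problem is
not established here. A minimal sketch in Python for confirming the
overlap from the IPADDR and NETMASK values follows; the helper names
parse_ifcfg and network_of are illustrative assumptions, and the two
strings are copied from the files shown above.

```python
import ipaddress


def parse_ifcfg(text):
    """Parse KEY=VALUE lines from an ifcfg-style file into a dict."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip().strip('"')
    return cfg


def network_of(cfg):
    """Return the IP network implied by the IPADDR and NETMASK entries."""
    iface = ipaddress.ip_interface(f"{cfg['IPADDR']}/{cfg['NETMASK']}")
    return iface.network


IFCFG_IB0 = "DEVICE=ib0\nIPADDR=10.0.0.1\nNETMASK=255.0.0.0\n"
IFCFG_IB1 = "DEVICE=ib1\nIPADDR=10.0.0.2\nNETMASK=255.0.0.0\n"

ib0 = parse_ifcfg(IFCFG_IB0)
ib1 = parse_ifcfg(IFCFG_IB1)
print("ib0 network:", network_of(ib0))
print("same network:", network_of(ib0) == network_of(ib1))
```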
Thanks in advance
-greg