[ofa-general] installing and running ofed3 on debian

Murray Smigel murray at tradeworx.com
Tue Mar 11 13:00:45 PDT 2008


Hi,
I am running Debian etch on x86-64 hardware with Mellanox ConnectX HCAs.
At the moment I just have two machines connected by a cable for testing.

I extracted the sources from the OFED-3.0 release SRPMs and built them.
I took a fresh 2.6.24 Linux source tree and built a kernel after replacing
the drivers/infiniband and include/rdma directories with those from
ofa_kernel-1.3. The build and install went smoothly.

After booting the new kernel:
nasnu2:/etc/udev/rules.d# dmesg | grep mlx
mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
mlx4_core: Initializing 0000:0b:00.0


I then loaded the user-space access modules by hand:
modprobe ib_uverbs
modprobe ib_umad
modprobe ib_ucm

nasnu2:/usr/local/bin# lsmod | grep mlx
mlx4_core              72608  0

nasnu2:/usr/local/bin# lsmod | grep ib_
ib_ucm                 20040  0
ib_cm                  35240  1 ib_ucm
ib_sa                  25888  1 ib_cm
ib_uverbs              38192  1 ib_ucm
ib_umad                19496  0
ib_mad                 40356  3 ib_cm,ib_sa,ib_umad
ib_core                61504  6 ib_ucm,ib_cm,ib_sa,ib_uverbs,ib_umad,ib_mad
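The lsmod output above could also be compared against a list of expected modules. The following is only a minimal sketch: the function name missing_modules is hypothetical, and the module list in the usage comment is an assumption. In particular, mlx4_ib is not mentioned anywhere above; it is listed on the assumption that the ConnectX InfiniBand driver is built as a module separate from mlx4_core.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and print every
# module named on the command line that does not appear in it.
missing_modules() {
    # First column of lsmod output is the module name.
    loaded=$(awk '{ print $1 }')
    for m in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string.
        printf '%s\n' "$loaded" | grep -qxF "$m" || printf '%s\n' "$m"
    done
}

# Usage on a live system (module list is an assumption, adjust as needed):
#   lsmod | missing_modules mlx4_core mlx4_ib ib_uverbs ib_umad ib_ucm
```

An empty result would mean every listed module is loaded; any name printed is a module that is absent.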

I created a udev rules file, 90-ib.rules, in /etc/udev/rules.d:
KERNEL=="umad*", NAME="infiniband/%k"
KERNEL=="issm*", NAME="infiniband/%k"
KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0666"
KERNEL=="ucm*", NAME="infiniband/%k", MODE="0666"
KERNEL=="rdma_cm", NAME="infiniband/%k", MODE="0666"

and restarted udev.
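To see whether the rules took effect, the device and sysfs directories can be inspected. This is a rough sketch, not a definitive check: report_dirs is a hypothetical helper, and the sysfs paths in the usage comment are my assumption about where the kernel and libibumad expose HCAs and umad ports.

```shell
#!/bin/sh
# Hypothetical helper: for each directory given, report whether it exists
# and how many entries it contains.
report_dirs() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            # tr strips the padding some wc implementations add.
            n=$(ls -A "$d" | wc -l | tr -d ' ')
            echo "$d: $n entries"
        else
            echo "$d: missing"
        fi
    done
}

# Usage on a live system (paths are assumptions):
#   report_dirs /dev/infiniband /sys/class/infiniband /sys/class/infiniband_mad
```

A directory reported as missing or with 0 entries would suggest that no device has been registered at that level.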

When I try to start opensm I get:
Mar 11 15:51:59 034721 [8D3B2320] 0x03 -> OpenSM 3.2.0
Mar 11 15:51:59 034777 [8D3B2320] 0x80 -> OpenSM 3.2.0
Mar 11 15:51:59 035360 [8D3B2320] 0x80 -> Entering DISCOVERING state
Mar 11 15:51:59 035484 [8D3B2320] 0x02 -> osm_vendor_bind: Binding to port 0x0
Mar 11 15:51:59 035519 [8D3B2320] 0x01 -> osm_vendor_open_port: ERR 542A: umad_get_ca() failed
Mar 11 15:51:59 035529 [8D3B2320] 0x01 -> osm_vendor_bind: ERR 5424: Unable to open port 0x0
Mar 11 15:51:59 035538 [8D3B2320] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
Mar 11 15:51:59 035546 [8D3B2320] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR)
Mar 11 15:51:59 035571 [8D3B2320] 0x01 -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind
Mar 11 15:51:59 035873 [8D3B2320] 0x80 -> Exiting SM
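The error lines can be pulled out of a longer log with a small filter. This is a sketch under assumptions: opensm_errors is a hypothetical name, the pattern only covers the "function: ERR code: message" layout seen above, and the log path in the usage comment is assumed to be the OpenSM default.

```shell
#!/bin/sh
# Hypothetical helper: read an OpenSM log on stdin and print, for each
# "ERR" line, the error code, the reporting function and the message.
opensm_errors() {
    sed -n 's/.*-> \([a-z_]*\): ERR \([0-9A-F]*\): \(.*\)$/\2 \1: \3/p'
}

# Usage (log path is an assumption):
#   opensm_errors < /var/log/opensm.log
```

The first line it prints is the earliest error in the log, which in the output above is the umad_get_ca() failure; the later errors follow from it.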

Can you please give me some advice as to what I am missing?
Thanks,
murray smigel







