[ofa-general] installing and running ofed3 on debian
Murray Smigel
murray at tradeworx.com
Tue Mar 11 13:00:45 PDT 2008
Hi,
I am running Debian etch on x86-64 hardware, with Mellanox
ConnectX HCAs.
At the moment I just have two machines connected by a cable for testing.
I extracted the sources from the OFED-3.0 release SRPMS and built them.
I took a fresh 2.6.24 Linux source tree and built a kernel after
replacing the drivers/infiniband and include/rdma directories with
those from ofa_kernel-1.3.
The build and install went smoothly.
After booting the new kernel:
nasnu2:/etc/udev/rules.d# dmesg | grep mlx
mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
mlx4_core: Initializing 0000:0b:00.0
modprobe ib_uverbs
modprobe ib_umad
modprobe ib_uverbs
modprobe ib_ucm
nasnu2:/usr/local/bin# lsmod | grep mlx
mlx4_core 72608 0
nasnu2:/usr/local/bin# lsmod | grep ib_
ib_ucm 20040 0
ib_cm 35240 1 ib_ucm
ib_sa 25888 1 ib_cm
ib_uverbs 38192 1 ib_ucm
ib_umad 19496 0
ib_mad 40356 3 ib_cm,ib_sa,ib_umad
ib_core 61504 6 ib_ucm,ib_cm,ib_sa,ib_uverbs,ib_umad,ib_mad
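A quick way to compare a listing like the one above against an expected set of modules is sketched below. The helper name missing_modules is hypothetical, and which modules to expect is an assumption that the caller supplies.

```shell
#!/bin/sh
# Hypothetical helper: print each expected module name that is NOT present
# in a /proc/modules-style file (module name in the first field of each line).
# Usage: missing_modules MODULES_FILE NAME...
missing_modules() {
    modules_file=$1
    shift
    for name in "$@"; do
        # awk exits 0 only if some line has the module name as its first field.
        if ! awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$name"
        fi
    done
}
```

For example, `missing_modules /proc/modules mlx4_core mlx4_ib ib_umad` prints whichever of those names is not loaded. mlx4_ib is included here on the assumption that it is the InfiniBand-layer driver that pairs with mlx4_core; it does not appear in the lsmod output above.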
I created a udev rules file, 90-ib.rules, in /etc/udev/rules.d:
KERNEL=="umad*", NAME="infiniband/%k"
KERNEL=="issm*", NAME="infiniband/%k"
KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0666"
KERNEL=="ucm*", NAME="infiniband/%k", MODE="0666"
KERNEL=="rdma_cm", NAME="infiniband/%k", MODE="0666"
and restarted udev.
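Whether the rules took effect can be checked by counting the entries they should have produced. The sketch below is only illustrative: count_entries is a hypothetical helper, and the idea that the userspace MAD library looks for adapters under /sys/class/infiniband is an assumption, not something confirmed in this post.

```shell
#!/bin/sh
# Hypothetical helper: count the entries in DIR whose names start with PREFIX.
# Usage: count_entries DIR PREFIX
count_entries() {
    dir=$1
    prefix=$2
    count=0
    for path in "$dir"/"$prefix"*; do
        # An unmatched glob is left as a literal string, so test for existence.
        if [ -e "$path" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running `count_entries /dev/infiniband umad` shows how many umad device nodes udev created, and `count_entries /sys/class/infiniband ""` shows how many InfiniBand devices the kernel has registered. A count of zero in either place would narrow down where the setup stops.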
When I try to start opensm I get:
Mar 11 15:51:59 034721 [8D3B2320] 0x03 -> OpenSM 3.2.0
Mar 11 15:51:59 034777 [8D3B2320] 0x80 -> OpenSM 3.2.0
Mar 11 15:51:59 035360 [8D3B2320] 0x80 -> Entering DISCOVERING state
Mar 11 15:51:59 035484 [8D3B2320] 0x02 -> osm_vendor_bind: Binding to port 0x0
Mar 11 15:51:59 035519 [8D3B2320] 0x01 -> osm_vendor_open_port: ERR 542A: umad_get_ca() failed
Mar 11 15:51:59 035529 [8D3B2320] 0x01 -> osm_vendor_bind: ERR 5424: Unable to open port 0x0
Mar 11 15:51:59 035538 [8D3B2320] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
Mar 11 15:51:59 035546 [8D3B2320] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR)
Mar 11 15:51:59 035571 [8D3B2320] 0x01 -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind
Mar 11 15:51:59 035873 [8D3B2320] 0x80 -> Exiting SM
Can you please give me some advice as to what I am missing?
Thanks,
murray smigel