[ofa-general] Problems with mlx4
Andrey Slepuhin
andrey.slepuhin at t-platforms.ru
Wed Jun 13 07:56:57 PDT 2007
Dear folks,
I just set up a test cluster using ConnectX cards, but I cannot get the link
up. I downloaded the kernel from
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
After inserting the modules I see that the card was initialized:
Jun 13 22:17:23 testnode1 kernel: mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
Jun 13 22:17:23 testnode1 kernel: mlx4_core: Initializing 0000:07:00.0
Jun 13 22:17:23 testnode1 kernel: ACPI: PCI Interrupt 0000:07:00.0[A] -> GSI 16 (level, low) -> IRQ 16
Jun 13 22:17:23 testnode1 kernel: PCI: Setting latency timer of device 0000:07:00.0 to 64
But the link remains in "DOWN" state:
testnode1:~ # /opt/ofed/bin/ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid: fe80:0000:0000:0000:0002:c903:0000:07a1
    base lid: 0x0
    sm lid: 0x0
    state: 1: DOWN
    phys state: 2: Polling
    rate: 20 Gb/sec (4X DDR)

Infiniband device 'mlx4_0' port 2 status:
    default gid: fe80:0000:0000:0000:0002:c903:0000:07a2
    base lid: 0x0
    sm lid: 0x0
    state: 1: DOWN
    phys state: 2: Polling
    rate: 20 Gb/sec (4X DDR)
I tried different ports and cables but without success. Do you have any
idea what's going wrong?
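
In case it helps anyone compare nodes, here is a minimal sketch of how the
ibstatus output above could be checked by script. It assumes the text keeps
the "Infiniband device '...' port N status:" header followed by "key: value"
lines, as in my paste. The function names parse_ibstatus and ports_not_active
are made up for illustration and are not part of OFED.

```python
import re

# Sample text copied from the ibstatus output in this message.
SAMPLE = """\
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:0000:07a1
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 20 Gb/sec (4X DDR)
Infiniband device 'mlx4_0' port 2 status:
default gid: fe80:0000:0000:0000:0002:c903:0000:07a2
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 20 Gb/sec (4X DDR)
"""

# Assumed header format, taken from the paste above.
HEADER = re.compile(r"Infiniband device '([^']+)' port (\d+) status:")


def parse_ibstatus(text):
    """Return {(device, port): {field: value}} from ibstatus-style text."""
    ports = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = (match.group(1), int(match.group(2)))
            ports[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only, so values such as the gid
            # or "1: DOWN" keep their own colons.
            key, _, value = line.partition(":")
            ports[current][key.strip()] = value.strip()
    return ports


def ports_not_active(ports):
    """Return the sorted (device, port) pairs whose state is not ACTIVE."""
    return sorted(
        key for key, fields in ports.items()
        if "ACTIVE" not in fields.get("state", "")
    )


if __name__ == "__main__":
    parsed = parse_ibstatus(SAMPLE)
    for device, port in ports_not_active(parsed):
        fields = parsed[(device, port)]
        print(device, port, fields.get("state"), fields.get("phys state"))
```

To use it on a live node, the output of /opt/ofed/bin/ibstatus could be
passed to parse_ibstatus in place of SAMPLE.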
The node configuration is:
Intel S5000PSL motherboard, 2x Xeon 5345, 8GB RAM
All the nodes are connected to a Flextronics (Mellanox) 24-port DDR switch.
I'm running SLES10 with the kernel from Roland's tree:
testnode1:~ # uname -a
Linux testnode1 2.6.22-rc3 #1 SMP Wed Jun 6 23:56:36 MSD 2007 x86_64 x86_64 x86_64 GNU/Linux
Any help will be much appreciated.
Thanks in advance,
Andrey