[ofa-general] Unable to get IPoIB working
Sasha Khapyorsky
sashak at voltaire.com
Mon Jan 26 06:38:29 PST 2009
Hi Chuck,
On 09:24 Mon 26 Jan, Chuck Hartley wrote:
>
> I just brought up two new machines with ConnectX adapters and have
> been unable to get IPoIB working on them. They are the only two
> hosts on the network and are connected via a Mellanox MTS3600 switch.
> Both are running OFED 1.4 on Fedora 9, with the latest firmware (2.6.0)
> on the HCAs. These are our first ConnectX HCAs; we have a number of
> hosts with InfiniHost HCAs where IPoIB "just worked" without any
> further configuration after the OFED installation. Here, however, I am
> unable to ping from one adapter to the other:
>
> # ping 172.16.0.70
> PING 172.16.0.70 (172.16.0.70) 56(84) bytes of data.
> From 172.16.0.71 icmp_seq=2 Destination Host Unreachable
>
> # arp -n 172.16.0.70
> Address        HWtype  HWaddress     Flags Mask   Iface
> 172.16.0.70            (incomplete)               ib0
>
> # ifconfig ib0
> ib0       Link encap:InfiniBand  HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>           inet addr:172.16.0.71  Bcast:172.16.255.255  Mask:255.255.0.0
>           UP BROADCAST MULTICAST  MTU:65520  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:256
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> The ifconfig output looks ok, but this output from ip seems to contradict it(?):
>
> # ip addr show ib0
> 4: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 65520 qdisc pfifo_fast state DOWN qlen 256
>     link/infiniband 80:00:04:04:fe:80:00:00:00:00:00:00:00:30:48:c6:4c:18:00:01 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>     inet 172.16.0.71/16 brd 172.16.255.255 scope global ib0
>
> # saquery
> NodeRecord dump:
> lid.....................0x5
> reserved................0x0
> base_version............0x1
> class_version...........0x1
> node_type...............Channel Adapter
> num_ports...............0x2
> sys_guid................0x0002c9030003360f
> node_guid...............0x0002c9030003360c
> port_guid...............0x0002c9030003360d
> partition_cap...........0x80
> device_id...............0x673C
> revision................0xA0
> port_num................0x1
> vendor_id...............0x2C9
> NodeDescription.........linux71 HCA-1
> NodeRecord dump:
> lid.....................0x2
> reserved................0x0
> base_version............0x1
> class_version...........0x1
> node_type...............Switch
> num_ports...............0x24
> sys_guid................0x0002c9020040536b
> node_guid...............0x0002c90200405368
> port_guid...............0x0002c90200405368
> partition_cap...........0x8
> device_id...............0xBD36
> revision................0xA0
> port_num................0x1
> vendor_id...............0x2C9
> NodeDescription.........Infiniscale-IV Mellanox Technologies
> NodeRecord dump:
> lid.....................0x4
> reserved................0x0
> base_version............0x1
> class_version...........0x1
> node_type...............Channel Adapter
> num_ports...............0x2
> sys_guid................0x0002c90300032de3
> node_guid...............0x0002c90300032de0
> port_guid...............0x0002c90300032de1
> partition_cap...........0x80
> device_id...............0x673C
> revision................0xA0
> port_num................0x1
> vendor_id...............0x2C9
> NodeDescription.........linux70 HCA-1
>
>
> Any ideas on what is going on here? Thanks for any help you can provide.
Which SM are you using? Are there any errors in dmesg?
Also check the ibnetdiscover output to verify that all links come up in a
good (speed/width) state.
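For example, a rough first pass (assuming the standard OFED/infiniband-diags
tools are installed; adjust device and interface names to your setup):

# sminfo                                # which SM is active, its LID/GUID and state
# dmesg | grep -i -e mlx4 -e ipoib      # driver or IPoIB errors during bring-up
# ibstat                                # local port state, rate and link width
# ibnetdiscover                         # fabric topology with per-link width/speed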
Sasha