[ofa-general] Unable to get IPoIB working

Chuck Hartley hartlch14 at gmail.com
Mon Jan 26 06:24:28 PST 2009


Hello,

I just brought up two new machines with ConnectX adapters and have
been unable to get them to work using IPoIB.  They are the only two
hosts on the network and are connected via a Mellanox MTS3600 switch.
All are running OFED 1.4 on Fedora 9, and the latest firmware (2.6.0)
on the HCAs.  These are our first ConnectX HCAs.  We have a number of
hosts using InfiniHost HCAs, and on those IPoIB "just worked" without
further configuration after OFED installation.  On the new machines,
however, I am unable to ping from one adapter to the other:

# ping 172.16.0.70
PING 172.16.0.70 (172.16.0.70) 56(84) bytes of data.
From 172.16.0.71 icmp_seq=2 Destination Host Unreachable

#  arp -n 172.16.0.70
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.0.70                      (incomplete)                              ib0

# ifconfig ib0
ib0       Link encap:InfiniBand  HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:172.16.0.71  Bcast:172.16.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:65520  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

The ifconfig output looks OK, but this output from ip seems to contradict it:

# ip addr show ib0
4: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 65520 qdisc pfifo_fast state DOWN qlen 256
    link/infiniband 80:00:04:04:fe:80:00:00:00:00:00:00:00:30:48:c6:4c:18:00:01 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.16.0.71/16 brd 172.16.255.255 scope global ib0
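
To make the two outputs easier to compare, here is a minimal Python sketch that pulls the flags, state and link-layer address out of the ip output above and splits the 20-octet address into its parts. It assumes the IPoIB address layout from RFC 4391 (one octet of flags, a 3-octet QPN, then a 16-octet GID made of an 8-octet subnet prefix and an 8-octet interface ID). The function names are hypothetical and the sample text is simply the output pasted from this message. As far as I understand it, ip prints NO-CARRIER when an interface is UP but not RUNNING, which would match the missing RUNNING flag in the ifconfig output rather than contradict it.

```python
import re

# Output of `ip addr show ib0` as pasted in this message (wrapped lines included).
SAMPLE = """4: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 65520 qdisc pfifo_fast
state DOWN qlen 256
    link/infiniband
80:00:04:04:fe:80:00:00:00:00:00:00:00:30:48:c6:4c:18:00:01 brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.16.0.71/16 brd 172.16.255.255 scope global ib0
"""


def parse_ip_link(text):
    """Return (flags, state, link-layer address) from `ip addr show` output."""
    joined = " ".join(text.split())  # collapse whitespace, undoing mail wrapping
    flags = re.search(r"<([^>]*)>", joined).group(1).split(",")
    state = re.search(r"\bstate (\S+)", joined).group(1)
    lladdr = re.search(
        r"link/infiniband ((?:[0-9a-f]{2}:){19}[0-9a-f]{2})", joined
    ).group(1)
    return flags, state, lladdr


def decode_ipoib_address(lladdr):
    """Split a 20-octet IPoIB address (layout assumed from RFC 4391)."""
    octets = bytes(int(part, 16) for part in lladdr.split(":"))
    if len(octets) != 20:
        raise ValueError("expected 20 octets, got %d" % len(octets))
    return {
        "flags": octets[0],                          # first octet
        "qpn": int.from_bytes(octets[1:4], "big"),   # 24-bit queue pair number
        "subnet_prefix": octets[4:12].hex(),         # upper half of the GID
        "interface_id": octets[12:20].hex(),         # lower half of the GID
    }


if __name__ == "__main__":
    flags, state, lladdr = parse_ip_link(SAMPLE)
    print("flags:", flags)
    print("state:", state)
    for name, value in decode_ipoib_address(lladdr).items():
        print(name, "=", value)
```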

# saquery
NodeRecord dump:
        lid.....................0x5
        reserved................0x0
        base_version............0x1
        class_version...........0x1
        node_type...............Channel Adapter
        num_ports...............0x2
        sys_guid................0x0002c9030003360f
        node_guid...............0x0002c9030003360c
        port_guid...............0x0002c9030003360d
        partition_cap...........0x80
        device_id...............0x673C
        revision................0xA0
        port_num................0x1
        vendor_id...............0x2C9
        NodeDescription.........linux71 HCA-1
NodeRecord dump:
        lid.....................0x2
        reserved................0x0
        base_version............0x1
        class_version...........0x1
        node_type...............Switch
        num_ports...............0x24
        sys_guid................0x0002c9020040536b
        node_guid...............0x0002c90200405368
        port_guid...............0x0002c90200405368
        partition_cap...........0x8
        device_id...............0xBD36
        revision................0xA0
        port_num................0x1
        vendor_id...............0x2C9
        NodeDescription.........Infiniscale-IV Mellanox Technologies
NodeRecord dump:
        lid.....................0x4
        reserved................0x0
        base_version............0x1
        class_version...........0x1
        node_type...............Channel Adapter
        num_ports...............0x2
        sys_guid................0x0002c90300032de3
        node_guid...............0x0002c90300032de0
        port_guid...............0x0002c90300032de1
        partition_cap...........0x80
        device_id...............0x673C
        revision................0xA0
        port_num................0x1
        vendor_id...............0x2C9
        NodeDescription.........linux70 HCA-1
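
In case it helps anyone compare the records, the saquery dump above can be reduced to one line per node with a small Python sketch like the following. The parser is hypothetical and only assumes the "name....value" layout shown in this message; the sample is trimmed to a few fields from the three records above.

```python
# A trimmed copy of the saquery output from this message (only some fields kept).
SAMPLE = """NodeRecord dump:
        lid.....................0x5
        node_type...............Channel Adapter
        port_guid...............0x0002c9030003360d
        NodeDescription.........linux71 HCA-1
NodeRecord dump:
        lid.....................0x2
        node_type...............Switch
        port_guid...............0x0002c90200405368
        NodeDescription.........Infiniscale-IV Mellanox Technologies
NodeRecord dump:
        lid.....................0x4
        node_type...............Channel Adapter
        port_guid...............0x0002c90300032de1
        NodeDescription.........linux70 HCA-1
"""


def parse_node_records(text):
    """Turn 'NodeRecord dump:' blocks into a list of field dictionaries."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("NodeRecord dump"):
            records.append({})           # start a new record
        elif records and "." in line:
            # Field name ends at the first dot; the rest is dot padding + value.
            key, _, rest = line.partition(".")
            records[-1][key] = rest.lstrip(".")
    return records


if __name__ == "__main__":
    for rec in parse_node_records(SAMPLE):
        print(rec["lid"], rec["node_type"], rec["port_guid"], rec["NodeDescription"])
```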


Any ideas on what is going on here?  Thanks for any help you can provide.

Chuck


