[ofa-general] Having trouble pingpong between two nodes.
Jeffrey Wong
jwong at datallegro.com
Wed Jun 6 11:29:39 PDT 2007
Hello,
I am trying to run ibv_ud_pingpong between two nodes, but I can't seem
to get them to communicate. The ping command between the IB
interfaces works fine, but when I run ibv_ud_pingpong it prints the
following:
________________________________________________________________________________
[root at centos5:node1 ~]# ibv_ud_pingpong 193.168.10.254
local address: LID 0x0002, QPN 0x0f0406, PSN 0xb067dc
Couldn't connect to 193.168.10.254:18515
____________________________________________________________________________
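The "Couldn't connect" message names a host and a TCP port (18515). ibv_ud_pingpong exchanges its connection details over an ordinary TCP socket before any InfiniBand traffic flows, so a listener (the same program started on the other node with no hostname argument) has to be waiting on that port. The sketch below is one way to check whether anything is accepting connections there. The function name `tcp_port_open` and the 3-second timeout are illustrative choices, not part of any OFED tool.

```python
import socket


def tcp_port_open(host: str, port: int = 18515, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    18515 is the port shown in the error message above; the name of this
    helper and the timeout value are assumptions made for illustration.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `tcp_port_open("193.168.10.254")` from node1 would return False if nothing is listening on node2, which would be consistent with the error above even when ping over the ib0 interface works.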
I have the subnet manager running on node2.
When I run ibchecknet, I get the following errors:
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter LinkDowned = 78 (threshold 10)
#warn: counter RcvSwRelayErrors = 261 (threshold 100)
#warn: counter XmtDiscards = 173 (threshold 100)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port all: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 18: FAILED
# Checked Switch: nodeguid 0x0002c9010d26dc90 with failure
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter LinkDowned = 13 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 16: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter LinkDowned = 13 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 15: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 14: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 13: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter XmtDiscards = 173 (threshold 100)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 17: FAILED
# Checking Ca: nodeguid 0x0002c9020020080c
# Checking Ca: nodeguid 0x0002c902002015c0
# Checking Ca: nodeguid 0x0002c9020020590c
## Summary: 4 nodes checked, 0 bad nodes found
## 12 ports checked, 0 bad ports found
## 6 ports have errors beyond threshold
_____________________________________________________________________________
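To make the ibchecknet report above easier to scan, the warnings can be grouped by the port they belong to. This is a minimal sketch that assumes each run of "#warn" lines applies to the next "port N: FAILED" line, which is inferred from the layout of the output above rather than from any documented format. The names `group_warnings`, `WARN_RE`, and `PORT_RE` are hypothetical.

```python
import re

# Patterns inferred from the output above; the real tool may pad with tabs.
WARN_RE = re.compile(r"#warn: counter (\w+)\s*=\s*(\d+)\s*\(threshold (\d+)\)")
PORT_RE = re.compile(r"port (\w+):\s*FAILED")


def group_warnings(text: str) -> dict:
    """Map each failed port to the {counter: value} warnings printed before it."""
    result = {}
    pending = {}
    for line in text.splitlines():
        warn = WARN_RE.search(line)
        if warn:
            # Remember the counter until the port line that follows it.
            pending[warn.group(1)] = int(warn.group(2))
            continue
        port = PORT_RE.search(line)
        if port:
            result[port.group(1)] = pending
            pending = {}
    return result


sample = """\
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter LinkDowned = 13 (threshold 10)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 16: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10)
#warn: counter XmtDiscards = 173 (threshold 100)
Error check on lid 5 (MT47396 Infiniscale-III Mellanox Technologies)
port 17: FAILED
"""

for port, counters in group_warnings(sample).items():
    print(port, counters)
```

Note that 65535 is the largest value a 16-bit counter can hold, so the SymbolErrors readings may be saturated rather than exact counts.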
I am trying to ping from node 1 to node 2.
Node 1 configuration:
ib0 Link encap:InfiniBand HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:193.168.10.1 Bcast:193.168.10.255 Mask:255.255.255.0
inet6 addr: fe80::202:c902:20:80d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:150 errors:0 dropped:0 overruns:0 frame:0
TX packets:37 errors:0 dropped:9 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:35356 (34.5 KiB) TX bytes:7624 (7.4 KiB)
ib1 Link encap:InfiniBand HWaddr 80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:194.168.10.1 Bcast:194.168.10.255 Mask:255.255.255.0
inet6 addr: fe80::202:c902:20:80e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:148 errors:0 dropped:0 overruns:0 frame:0
TX packets:34 errors:0 dropped:9 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:35156 (34.3 KiB) TX bytes:7496 (7.3 KiB)
____________________________________________________________________
[root at centos5:node1 ~]# ibstat
CA 'mthca0'
CA type: MT25208
Number of ports: 2
Firmware version: 5.0.1
Hardware version: a0
Node GUID: 0x0002c9020020080c
System image GUID: 0x0002c9020020080f
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 2
LMC: 0
SM lid: 1
Capability mask: 0x00510a68
Port GUID: 0x0002c9020020080d
Port 2:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 3
LMC: 0
SM lid: 1
Capability mask: 0x00510a68
Port GUID: 0x0002c9020020080e
___________________________________________________________
Node 2
ib0 Link encap:InfiniBand HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:193.168.10.254 Bcast:193.168.10.255 Mask:255.255.255.0
inet6 addr: fe80::202:c902:20:590d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:102 errors:0 dropped:0 overruns:0 frame:0
TX packets:42 errors:0 dropped:9 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:23750 (23.1 KiB) TX bytes:8048 (7.8 KiB)
ib1 Link encap:InfiniBand HWaddr 80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:194.168.10.254 Bcast:194.168.10.255 Mask:255.255.255.0
inet6 addr: fe80::202:c902:20:590e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:94 errors:0 dropped:0 overruns:0 frame:0
TX packets:31 errors:0 dropped:9 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:23286 (22.7 KiB) TX bytes:7260 (7.0 KiB)
_________________________________________________________
[root at centos5:master /opt/CA]# ibstat
CA 'mthca0'
CA type: MT25208
Number of ports: 2
Firmware version: 5.1.0
Hardware version: a0
Node GUID: 0x0002c9020020590c
System image GUID: 0x0002c9020020590f
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 1
LMC: 0
SM lid: 1
Capability mask: 0x02510a6a
Port GUID: 0x0002c9020020590d
Port 2:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 4
LMC: 0
SM lid: 1
Capability mask: 0x02510a68
Port GUID: 0x0002c9020020590e
Thanks in advance,
Jeff