[ofa-general] GPFS node loses IB-connection
SEGERS Koen
Koen.SEGERS at VRT.BE
Tue May 22 06:43:59 PDT 2007
I ran the iperf tests on servers with OFED-1.2-RC3 and got the same
result. Actually, it is even worse: when the interface dies, it enters
the PORT_INIT state and does not return to PORT_ACTIVE, at least not
within 10 minutes.
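To put a number on how long a port stays out of the ACTIVE state, a small polling helper along these lines could be used. This is only a sketch: it assumes the port state is readable from a file such as the Linux sysfs entry /sys/class/infiniband/<device>/ports/<port>/state (which reports values like "2: INIT" or "4: ACTIVE"), and the function name wait_for_active and the device name in the usage note are hypothetical.

```shell
#!/bin/bash
# Sketch: poll a port-state file until it reports ACTIVE or a timeout expires.
# Assumption: the file contains a line such as "4: ACTIVE" or "2: INIT".
# Usage: wait_for_active STATE_FILE TIMEOUT_SECONDS [INTERVAL_SECONDS]
wait_for_active() {
    local state_file=$1 timeout=$2 interval=${3:-5} waited=0
    while [ "$waited" -le "$timeout" ]; do
        if grep -q 'ACTIVE' "$state_file" 2>/dev/null; then
            echo "port ACTIVE after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    # Report the last state seen so the log shows where the port got stuck.
    echo "port not ACTIVE after ${timeout}s: $(cat "$state_file" 2>/dev/null)"
    return 1
}
```

For example, `wait_for_active /sys/class/infiniband/mthca0/ports/1/state 600 5` would watch a port for the 10 minutes mentioned above; mthca0 is a placeholder for whatever HCA is actually installed.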
I'll give you the test script I ran:
ssh 10.224.158.114 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 5001 &
ssh 10.224.158.114 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 5002 &
ssh 10.224.158.114 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 5003 &
ssh 10.224.158.115 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 6001 &
ssh 10.224.158.115 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 6002 &
ssh 10.224.158.115 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 6003 &
ssh 10.224.158.116 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 7001 &
ssh 10.224.158.116 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 7002 &
ssh 10.224.158.116 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 7003 &
ssh 10.224.158.117 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 8001 &
ssh 10.224.158.117 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 8002 &
ssh 10.224.158.117 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -s -p 8003 &
sleep 5
for i in 14 15 16 17
do
    ssh 10.224.158.111 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -c 192.168.2.$i -p $((i-9))001 -t 120 -d -P 5 &
    ssh 10.224.158.112 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -c 192.168.2.$i -p $((i-9))002 -t 120 -d -P 5 &
    ssh 10.224.158.113 LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK iperf -c 192.168.2.$i -p $((i-9))003 -t 120 -d -P 5 &
done
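The port numbering in the script follows a pattern: server 10.224.158.1NN listens on ports (NN-9)001 to (NN-9)003, and client 10.224.158.11K connects to the port ending in K. The sketch below makes that mapping explicit by printing the same 24 commands from nested loops instead of running them. The function names port_for, server_cmds and client_cmds are hypothetical helpers introduced only for this illustration.

```shell
#!/bin/bash
# Sketch: regenerate the ssh/iperf command lines of the test script (dry run).
# Nothing is executed remotely; the commands are only printed for review.
ENV_PREFIX="LD_PRELOAD=libsdp.so SIMPLE_LIBSDP=OK"

# port_for HOST_SUFFIX N  ->  e.g. "14 1" gives 5001, "17 3" gives 8003
port_for() { echo "$(( $1 - 9 ))00$2"; }

# One iperf server per (server host, port index) pair: 4 hosts x 3 ports.
server_cmds() {
    local host n
    for host in 14 15 16 17; do
        for n in 1 2 3; do
            echo "ssh 10.224.158.1$host $ENV_PREFIX iperf -s -p $(port_for "$host" "$n")"
        done
    done
}

# Client 10.224.158.11N connects to every server on the port ending in N.
client_cmds() {
    local i n
    for i in 14 15 16 17; do
        for n in 1 2 3; do
            echo "ssh 10.224.158.11$n $ENV_PREFIX iperf -c 192.168.2.$i -p $(port_for "$i" "$n") -t 120 -d -P 5"
        done
    done
}

server_cmds
client_cmds
```

Printing first makes it easy to check that every client port has a matching listener before starting a 120-second bidirectional run.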
Any ideas?
Regards,
Koen
________________________________
From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of SEGERS Koen
Sent: Tuesday, 22 May 2007 10:55
To: Ami Perlmutter; Shirley Ma
CC: general-bounces at lists.openfabrics.org; general at lists.openfabrics.org
Subject: RE: [ofa-general] GPFS node loses IB-connection
GPFS keeps its connection constantly open.
We did some more tests with iperf:
If we don't run bidirectional tests, all connections keep running
smoothly. If we add bidirectional tests, it becomes unstable, especially
when this is done on multiple nodes. Is this normal?
The failed iperf tests give the same error in the switch log:
May 22 08:14:59 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Generate SM OUT_OF_SERVICE trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:08:a8:71
May 22 08:14:59 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=ff:12:60:1b:ff:ff:00:00:00:00:00:01:ff:08:a8:71
May 22 08:14:59 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by discovering removed ports
May 22 08:15:00 topspin-120sc ib_sm.x[621]: %IB-6-INFO: Program switch port state to down, node=00:05:ad:00:00:0b:a2:cc, port= 6, due to non-responding CA
May 22 08:15:00 topspin-120sc port_mgr.x[497]: %PORT-6-INFO: port down - port=1/6, type=ib4xTXP
May 22 08:15:00 topspin-120sc diag_mgr.x[508]: %DIAG-6-INFO: in portTblFindEntry() - IfIndex=70(1/6)
May 22 08:15:00 topspin-120sc diag_mgr.x[508]: %DIAG-6-INFO: cannot find entry - IfIndex=70(1/6)
May 22 08:15:04 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by discovering new ports
May 22 08:15:04 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change
May 22 08:15:04 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Generate SM IN_SERVICE trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:08:a8:71
May 22 08:15:05 topspin-120sc port_mgr.x[497]: %PORT-6-INFO: port up - port=1/6, type=ib4xTXP
May 22 08:15:07 topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=ff:12:60:1b:ff:ff:00:00:00:00:00:01:ff:08:a8:71
May 22 08:15:08 topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change
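When comparing several failed runs, it can help to reduce such a log to just the port up/down transitions. The filter below is a rough sketch that assumes each syslog entry sits on a single line and starts with a 15-character timestamp, as in the port_mgr.x messages shown here; the function name port_events is hypothetical.

```shell
#!/bin/bash
# Sketch: extract "timestamp port state" from %PORT-6-INFO up/down messages.
# Reads syslog text on stdin; lines that do not match are dropped.
port_events() {
    sed -nE 's/^(.{15}) .*%PORT-6-INFO: port (up|down) - port=([^,]*),.*/\1 \3 \2/p'
}
```

Fed the switch log from this test, for instance with `port_events < switch.log` (the file name is a placeholder), it would list the 1/6 port going down at 08:15:00 and coming back up at 08:15:05.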
RC3 has just been installed. Results will follow soon.
Regards,
Koen
________________________________
From: Ami Perlmutter [mailto:amip at dev.mellanox.co.il]
Sent: Tuesday, 22 May 2007 10:33
To: Shirley Ma
CC: SEGERS Koen; general-bounces at lists.openfabrics.org;
general at lists.openfabrics.org
Subject: Re: [ofa-general] GPFS node loses IB-connection
Does the application constantly open and close connections?
*** Disclaimer ***
Vlaamse Radio- en Televisieomroep
Auguste Reyerslaan 52, 1043 Brussel
nv van publiek recht
BTW BE 0244.142.664
RPR Brussel
http://www.vrt.be/disclaimer