[ofa-general] GPFS node loses IB-connection

SEGERS Koen Koen.SEGERS at VRT.BE
Mon May 21 11:50:31 PDT 2007


The same as in dmesg.
 
The output for the failing node:
May 18 13:02:51 gpfswhbe1s1 mmfs: Error=MMFS_PHOENIX, ID=0xAB429E38, Tag=4997901:   Reason code 668 Failure Reason Lost membership in cluster enterprise.universe. Unmounting file systems.
May 18 13:02:51 gpfswhbe1s1 mmfs: Error=MMFS_PHOENIX, ID=0xAB429E38, Tag=4997901:   
May 18 13:03:36 gpfswhbe1s1 kernel: GPFS Deadman Switch timer [0] has expired; IOs in progress: 0
May 18 13:04:11 gpfswhbe1s1 kernel: Badness in do_exit at kernel/exit.c:807
May 18 13:04:11 gpfswhbe1s1 kernel: 
May 18 13:04:11 gpfswhbe1s1 kernel: Call Trace: <ffffffff80133370>{do_exit+80} <ffffffff80133c17>{sys_exit_group+0}
May 18 13:04:11 gpfswhbe1s1 kernel:        <ffffffff8010a7be>{system_call+126}
May 18 13:04:11 gpfswhbe1s1 kernel: Badness in do_exit at kernel/exit.c:807
May 18 13:04:11 gpfswhbe1s1 kernel: 
May 18 13:04:11 gpfswhbe1s1 kernel: Call Trace: <ffffffff80133370>{do_exit+80} <ffffffff80133c17>{sys_exit_group+0}
May 18 13:04:11 gpfswhbe1s1 kernel:        <ffffffff8010a7be>{system_call+126}
May 18 13:18:57 gpfswhbe1s1 sshd[15090]: Accepted publickey for root from 192.168.1.1 port 52281 ssh2
May 18 13:25:12 gpfswhbe1s1 syslog-ng[3705]: STATS: dropped 0
 
Today we also ran some tests with iperf over SDP. The tests worked fine as long as we didn't use the parallel option (-P <number>), which starts multiple client threads that connect to the server. As soon as we started the command with that option, the interface died.
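
For anyone who wants to try something similar without iperf, here is a minimal sketch of what the parallel option does, as far as we understand it: several client threads each open their own connection to the same server and send data at the same time. It uses plain TCP sockets, not SDP, and the function names, payload size and timeout are assumptions made up for illustration, so it is not a reproduction of iperf itself.

```python
import socket
import threading


def send_stream(host, port, payload, results, index):
    """Open one connection, send the payload, and record the bytes sent."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
    results[index] = len(payload)


def run_parallel_clients(host, port, num_streams, payload=b"x" * 65536):
    """Start num_streams client threads that all connect to the same server."""
    results = [0] * num_streams
    threads = [
        threading.Thread(
            target=send_stream, args=(host, port, payload, results, i)
        )
        for i in range(num_streams)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # A stream that failed to connect or send leaves its slot at 0.
    return results
```

Calling run_parallel_clients(server, port, 4) against a listening server is roughly comparable to running the iperf client with -P 4, where server and port stand for whatever the listener is using. A 0 in the returned list marks a stream that did not complete.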
 
I find this very strange. Has anyone else seen this problem? Is it still present in RC3?
 
Tomorrow we will do more tests to pinpoint the problem even further.
We will also build RPMs for RC3. Hopefully that helps.
 
Regards,
 
Koen

________________________________

From: Shirley Ma [mailto:xma at us.ibm.com]
Sent: Mon 21/05/2007 17:41
To: SEGERS Koen
CC: general at lists.openfabrics.org; general-bounces at lists.openfabrics.org; Tziporet Koren
Subject: RE: [ofa-general] GPFS node loses IB-connection



Hello,

What's the output of /var/log/messages when you hit this problem?

Shirley Ma

