<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.5730.11" name=GENERATOR></HEAD>
<BODY text=#000000 bgColor=#ffffff>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff size=2>Hi
David,</FONT></SPAN></DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff size=2>If you
are using the OFED-1.1 stack and the OSU MVAPICH provided with the OFED-1.1 package as
your MPI layer,</FONT></SPAN></DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff size=2>the
attached patch should solve your problem.</FONT></SPAN></DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2>Please let me know if that helps.</FONT></SPAN></DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2>Regards,</FONT></SPAN></DIV>
<DIV><SPAN class=721402422-01122006><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Boris Shpolyansky</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Application
Engineer</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Mellanox Technologies
Inc.</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>2900 Stender Way</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Santa Clara, CA
95054</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Tel.: (408) 916
0014</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Fax: (408) 970 3403</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>Cell: (408) 834
9365</FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial
size=2>www.mellanox.com</FONT></DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> openib-general-bounces@openib.org
[mailto:openib-general-bounces@openib.org] <B>On Behalf Of </B>David
Costa<BR><B>Sent:</B> Friday, December 01, 2006 2:21 PM<BR><B>To:</B>
openib-general@openib.org; David.Costa@Sun.COM; Robert Houk; Anthony
Vinciguerra; Thomas Babbit<BR><B>Subject:</B> [openib-general] HPCC benchmark
aborts at MPIRandomAccess test<BR></FONT><BR></DIV>
<DIV></DIV>Hello all,<BR><BR>I am running the HPCC benchmark on a Sun Blade 8000
blade server. I have two blades, running RHEL4U3 and SLESSP3 respectively, each with 32
GBytes of memory. The HPCC benchmark is running on a Sun-developed IB
module that uses the Mellanox 25204 chips. When it gets to the MPIRandomAccess
test, it fails immediately and I see the messages listed
below.<BR><BR>Does anyone know what the messages mean and what the
underlying cause might be?  Please reply to me directly, as I am not subscribed to
this list.<BR><BR>Thank you,<BR><BR>Dave Costa<BR><A
class=moz-txt-link-abbreviated
href="mailto:david.costa@sun.com">david.costa@sun.com</A><BR><FONT
face="Courier New, Courier, monospace"><BR><BR>[root@an1-bl0 ~]# mpirun_rsh -rsh
-np 32 -hostfile /root/hostfile /usr/local/bin/hpcc<BR>24 - MPI_CANCEL :
Internal MPI error!<BR>[24] [] Aborting Program!<BR>mpirun_rsh: Abort signaled
from [24]<BR>26 - MPI_CANCEL : Internal MPI error!<BR>[26] [] Aborting
Program!<BR>15 - MPI_CANCEL : Internal MPI error!<BR>[15] [] Aborting
Program!<BR>18 - MPI_CANCEL : Internal MPI error!<BR>[18] [] Aborting
Program!<BR>22 - MPI_CANCEL : Internal MPI error!<BR>[22] [] Aborting
Program!<BR>4 - MPI_CANCEL : Internal MPI error!<BR>[4] [] Aborting
Program!<BR>13 - MPI_CANCEL : Internal MPI error!<BR>[13] [] Aborting
Program!<BR>11 - MPI_CANCEL : Internal MPI error!<BR>16 - MPI_CANCEL : Internal
MPI error!<BR>[16] [] Aborting Program!<BR>[11] [] Aborting Program!<BR>28 -
MPI_CANCEL : Internal MPI error!<BR>[28] [] Aborting Program!<BR>[19] Abort:
[an1-bl1:19] Got completion with error, code=12<BR> at line 2365 in file
viacheck.c<BR>[23] Abort: [an1-bl1:23] Got completion with error,
code=12<BR> at line 2365 in file viacheck.c<BR>[17] Abort: [an1-bl1:17] Got
completion with error, code=12<BR> at line 2365 in file
viacheck.c<BR>done.</FONT> </BODY></HTML>