<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3059" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=766094522-10032007><FONT face=Arial
color=#0000ff size=2>What version of Intel MPI are you
using?</FONT></SPAN></DIV><BR>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> general-bounces@lists.openfabrics.org
[mailto:general-bounces@lists.openfabrics.org] <B>On Behalf Of </B>Boris
Shpolyansky<BR><B>Sent:</B> Friday, March 09, 2007 8:40 PM<BR><B>To:</B>
general@lists.openfabrics.org<BR><B>Subject:</B> [ofa-general] uDAPL
question<BR></FONT><BR></DIV>
<DIV><SPAN class=965051803-10032007><FONT face=Arial><FONT size=2>Hi<SPAN
class=546583204-10032007><FONT
color=#0000ff>, </FONT></SPAN></FONT></FONT></SPAN></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=965051803-10032007></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>I'm trying to get
a simple Intel MPI benchmark running over IB (uDAPL) using the OFED-1.1
stack.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>I'm consistently
getting the following error:</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=965051803-10032007></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>[root@ibd005 ~]#
./runjob_I_MPI.boris 2<BR>Task 0 of 2 tasks started on host
ibd005.ibd.mti.com<BR>clock_resolution = 1.00e-06 s<BR>Task 1 of 2 tasks
started on host ibd006.ibd.mti.com<BR>[0:ibd005] unexpected DAPL event 4006
from 1:ibd006<BR>[1:ibd006] unexpected DAPL event 4006 from 0:ibd005<BR>rank 0
in job 14 ibd005_36193 caused collective abort of all
ranks<BR> exit status of rank 0: return code 254
<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>I did some digging
and found that event 4006 (actually 0x4006) is
DAT_CONNECTION_EVENT_BROKEN,</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>and that it is returned
by the function dat_rmr_bind. </SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>So my question is:
why does this function consistently fail?</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>I'm using the standard
dat.conf file:</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=965051803-10032007></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>OpenIB-cma u1.2
nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so mv_dapl.1.2 "ib0 0"
""<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=965051803-10032007>Appreciate your
help,</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=965051803-10032007>&nbsp;</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial><FONT size=2>Boris
Shpolyansky</FONT></FONT><FONT face=Arial><FONT size=2><SPAN
class=546583204-10032007> </SPAN></FONT></FONT></DIV></BLOCKQUOTE></BODY></HTML>