[libfabric-users] intel mpi with libfabric

Mohammed Shaheen m_shaheen1984 at yahoo.com
Mon Dec 3 04:44:15 PST 2018


 Hi Dmitry,
the problem appeared only once. It did not appear again.
Since you brought it up, do I have to set the FI_PROVIDER variable, or is sourcing the mpivars script not enough on its own? If I source the mpivars script and set the debug level to 3, I get the right provider, verbs;ofi_rxm, as seen below.


lce62:~ # mpirun -np 8 -perhost 4 --hostfile hosts ./test.e
[0] MPI startup(): libfabric version: 1.7.0a1-impi
[0] MPI startup(): libfabric provider: verbs;ofi_rxm
Hello world from process 7 of 8
Hello world from process 3 of 8
Hello world from process 5 of 8
Hello world from process 1 of 8
Hello world from process 4 of 8
Hello world from process 6 of 8
Hello world from process 2 of 8
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       11242    lce63      {0,1,2,3,4,5,24,25,26,27,28,29}
[0] MPI startup(): 1       11243    lce63      {6,7,8,9,10,11,30,31,32,33,34,35}
[0] MPI startup(): 2       11244    lce63      {12,13,14,15,16,17,36,37,38,39,40,41}
[0] MPI startup(): 3       11245    lce63      {18,19,20,21,22,23,42,43,44,45,46,47}
[0] MPI startup(): 4       29238    lce62      {0,1,2,3,4,5,24,25,26,27,28,29}
[0] MPI startup(): 5       29239    lce62      {6,7,8,9,10,11,30,31,32,33,34,35}
[0] MPI startup(): 6       29240    lce62      {12,13,14,15,16,17,36,37,38,39,40,41}
[0] MPI startup(): 7       29241    lce62      {18,19,20,21,22,23,42,43,44,45,46,47}
Hello world from process 0 of 8
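
For completeness, this is roughly the setup behind the run above; a minimal sketch (the install path is only an example, and pinning the provider explicitly is optional):

# example path -- adjust to the actual Intel MPI install
source /opt/intel/impi/2019.1/intel64/bin/mpivars.sh
export I_MPI_DEBUG=3        # prints the libfabric version and provider at startup
export FI_PROVIDER=verbs    # optional: force the verbs provider instead of relying on auto-selection
mpirun -np 8 -perhost 4 --hostfile hosts ./test.e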


Regards,
Mohammed Shaheen


    On Monday, December 3, 2018, 13:37:20 CET, Gladkov, Dmitry <dmitry.gladkov at intel.com> wrote:
 
Hi Mohammed,
 
  
 
Could we ask you to submit a ticket on IDZ (https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology)?
 
Did you use the verbs provider in the test? Please set FI_PROVIDER=verbs to make sure of it. Also, if possible, please send an FI_LOG_LEVEL=debug log; it can help with further investigation.
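
For example, something along these lines (a sketch; the log file name is just an illustration, and libfabric writes its debug output to stderr):

export FI_PROVIDER=verbs
export FI_LOG_LEVEL=debug
mpirun -np 8 -perhost 4 --hostfile hosts ./test.e 2> fi_debug.log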
 
--
 
Dmitry
 
  
 
From: Mohammed Shaheen [mailto:m_shaheen1984 at yahoo.com]
Sent: Thursday, November 29, 2018 2:34 PM
To: libfabric-users at lists.openfabrics.org; ofiwg at lists.openfabrics.org; Ilango, Arun <arun.ilango at intel.com>; Gladkov, Dmitry <dmitry.gladkov at intel.com>; Hefty, Sean <sean.hefty at intel.com>
Subject: Re: [libfabric-users] intel mpi with libfabric
 
  
 
Hi,
 
  
 
Now I am trying to use Intel MPI U1 with the libfabric that comes bundled with it, and I get the following error:
 
lce62:~ # mpirun -np 8 -perhost 4 --hostfile hosts ./test.e
Hello world from process 6 of 8
Hello world from process 4 of 8
Hello world from process 7 of 8
Hello world from process 5 of 8
Hello world from process 0 of 8
Hello world from process 1 of 8
Hello world from process 2 of 8
Hello world from process 3 of 8
Abort(809595151) on node 4 (rank 4 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(356).............: MPI_Finalize failed
PMPI_Finalize(266).............:
MPID_Finalize(959).............:
MPIDI_NM_mpi_init_hook(1299)...:
MPIR_Reduce_intra_binomial(157):
MPIC_Send(149).................:
MPID_Send(256).................:
MPIDI_OFI_send_normal(429).....:
MPIDI_OFI_send_handler(733)....: OFI tagged send failed (ofi_impl.h:733:MPIDI_OFI_send_handler:Invalid argument)
[cli_4]: readline failed
 
  
 
However, when I set I_MPI_DEBUG to any value, the error disappears and the run completes successfully. Any thoughts?
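
Concretely, the two invocations differ only in the debug variable (a sketch of what I run):

# fails in MPI_Finalize with the OFI tagged send error
mpirun -np 8 -perhost 4 --hostfile hosts ./test.e

# works -- the only change is enabling any I_MPI_DEBUG level
I_MPI_DEBUG=1 mpirun -np 8 -perhost 4 --hostfile hosts ./test.e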
 
  
 
Regards
 
Mohammed Shaheen
 
  
 
On Monday, November 26, 2018, 18:10:58 CET, Hefty, Sean <sean.hefty at intel.com> wrote:
 
  
 
  
 
> I have tried with the development version from the master branch. I
> get the following errors while building the library (make)
> prov/verbs/src/verbs_ep.c: In function
> 'fi_ibv_msg_xrc_ep_atomic_write':
> prov/verbs/src/verbs_ep.c:1770: error: unknown field 'qp_type'
> specified in initializer

These errors are coming from having an older verbs.h file.  The qp_type field was added as part of XRC support, about 5 years ago.  I think this maps to v13.

> prov/verbs/src/verbs_ep.c:1770: error: unknown field 'qp_type'
> specified in initializer

This is the same error, which suggests that it is still picking up an old verbs.h file. Maybe Mellanox ships a different verbs.h file than what's upstream, but I doubt it would remove fields.
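
One quick way to see which verbs.h the build resolves, and whether it already has the XRC additions (a sketch; the /usr/include path is just the usual default):

# show which infiniband/verbs.h the compiler actually picks up
echo '#include <infiniband/verbs.h>' | cc -E -x c - | grep 'verbs\.h' | head -n 1

# check for the XRC-era declarations (struct ibv_qp_open_attr with a qp_type
# field, IBV_QPT_XRC_SEND in enum ibv_qp_type)
grep -nE 'ibv_qp_open_attr|IBV_QPT_XRC_SEND' /usr/include/infiniband/verbs.h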
 


- Sean
 

  