[libfabric-users] intel mpi with libfabric

Gladkov, Dmitry dmitry.gladkov at intel.com
Mon Dec 3 04:37:12 PST 2018


Hi Mohammed,

Could we ask you to submit a ticket in IDZ? The forum is at https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology
Did you use the verbs provider in the test? Please set FI_PROVIDER=verbs to make sure it is selected. Also, if possible, please send a log captured with FI_LOG_LEVEL=debug; it can help with further investigation.
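
A minimal sketch of such a run, reusing the launch line from the report. It assumes the launcher forwards the exported environment to all ranks; the log file name fi_debug.log is an arbitrary choice, not something the tools require.

```shell
# Force the verbs provider and turn on libfabric debug logging.
export FI_PROVIDER=verbs
export FI_LOG_LEVEL=debug

# Same launch line as in the report. stdout and stderr are captured together
# in fi_debug.log (hypothetical file name) so the log can be attached to a ticket.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -np 8 -perhost 4 --hostfile hosts ./test.e > fi_debug.log 2>&1 \
        || echo "mpirun exited with status $?"
else
    echo "mpirun not found on PATH"
fi
```

The debug log can get large, so it is probably easiest to compress it before sending.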
--
Dmitry

From: Mohammed Shaheen [mailto:m_shaheen1984 at yahoo.com]
Sent: Thursday, November 29, 2018 2:34 PM
To: libfabric-users at lists.openfabrics.org; ofiwg at lists.openfabrics.org; Ilango, Arun <arun.ilango at intel.com>; Gladkov, Dmitry <dmitry.gladkov at intel.com>; Hefty, Sean <sean.hefty at intel.com>
Subject: Re: [libfabric-users] intel mpi with libfabric

Hi,

Now I am trying to use Intel MPI U1 with the libfabric that ships with it. I get the following error:
lce62:~ # mpirun -np 8 -perhost 4 --hostfile hosts ./test.e
Hello world from process 6 of 8
Hello world from process 4 of 8
Hello world from process 7 of 8
Hello world from process 5 of 8
Hello world from process 0 of 8
Hello world from process 1 of 8
Hello world from process 2 of 8
Hello world from process 3 of 8
Abort(809595151) on node 4 (rank 4 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(356).............: MPI_Finalize failed
PMPI_Finalize(266).............:
MPID_Finalize(959).............:
MPIDI_NM_mpi_init_hook(1299)...:
MPIR_Reduce_intra_binomial(157):
MPIC_Send(149).................:
MPID_Send(256).................:
MPIDI_OFI_send_normal(429).....:
MPIDI_OFI_send_handler(733)....: OFI tagged send failed (ofi_impl.h:733:MPIDI_OFI_send_handler:Invalid argument)
[cli_4]: readline failed

However, when I set I_MPI_DEBUG to anything, the error disappears and it works successfully. Any thoughts?

Regards
Mohammed Shaheen

On Monday, 26 November 2018, 18:10:58 CET, Hefty, Sean <sean.hefty at intel.com> wrote:


> I have tried with the development version from the master branch. I
> get the following errors while building the library (make)
> prov/verbs/src/verbs_ep.c: In function
> 'fi_ibv_msg_xrc_ep_atomic_write':
> prov/verbs/src/verbs_ep.c:1770: error: unknown field 'qp_type'
> specified in initializer

These errors are coming from having an older verbs.h file.  The qp_type field was added as part of XRC support, about 5 years ago.  I think this maps to v13.

> prov/verbs/src/verbs_ep.c:1770: error: unknown field 'qp_type'
> specified in initializer

This is the same error, which suggests that it is still picking up an old verbs.h file.  Maybe Mellanox ships a different verbs.h file than what's upstream, but I doubt it would remove fields.
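
One way to see which verbs.h the build actually resolves is to ask the preprocessor. This is only a sketch: header_mentions is a hypothetical helper, and it assumes a gcc- or clang-compatible compiler reachable as cc (or via $CC). A match on qp_type by itself proves little, so the matching lines are printed for manual inspection.

```shell
# Hypothetical helper: succeed if the header file $1 mentions identifier $2.
header_mentions() {
    grep -qw -- "$2" "$1"
}

# Ask the preprocessor which infiniband/verbs.h it resolves.
# The result is empty if the header (or the compiler) is not found.
resolved=$(echo '#include <infiniband/verbs.h>' | ${CC:-cc} -E -x c - 2>/dev/null \
    | sed -n 's/^# [0-9]* "\(.*infiniband\/verbs\.h\)".*/\1/p' | head -n 1)

if [ -n "$resolved" ] && header_mentions "$resolved" qp_type; then
    echo "resolved header: $resolved"
    # Show where qp_type appears so the declarations can be checked by eye.
    grep -nw -- qp_type "$resolved"
else
    echo "no verbs.h mentioning qp_type was resolved by ${CC:-cc}"
fi
```

If the printed path is not the header you expected, the include path used by the build is the first thing to check.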


- Sean
