[libfabric-users] interoperability between Verbs provider and MPI (MVAPICH)
Phil Carns
carns at mcs.anl.gov
Mon Nov 6 18:37:25 PST 2017
On 11/06/2017 04:48 PM, Hefty, Sean wrote:
>> I have been able to use the verbs provider successfully for some
>> simple test programs, but it does not seem to work when I try to use
>> it from within an MPI program using MVAPICH.
>>
>> I can come back later with more details if this is something that
>> warrants deeper investigation, but before I do that: are there any
>> known conflicts between how OFI and MVAPICH use the verbs library, or
>> any particular options that I need to set to make this work?
>>
>> To be clear, I have not tried MVAPICH over OFI yet. This is a normal,
>> oldish MVAPICH build. I was trying some quick tests with what's
>> already available on a test system.
> I'm not following the test setup here. When you say 'verbs provider' what do you mean? If you're not running MVAPICH over OFI, then this sounds like a problem with libibverbs and its related provider libraries. If that's the case, then the linux-rdma or mvapich mailing lists may be able to provide better help.
>
> If you are attempting to run MVAPICH over libfabric, this is the right mailing list. But, AFAIK, MVAPICH has not been ported to run over libfabric.
>
> - Sean
Thanks Sean. I just meant that I have an MPI program that (in addition
to its MPI calls) tries to use OFI directly to contact an external
service. The MVAPICH library is using libibverbs, not libfabric. The
libfabric library is also using libibverbs.
I was hoping there might be a magic environment variable that makes MPI
and OFI get along on verbs before I dug too far into the problem :) I
know that sounds like a random thing to ask about, but the reason I even
thought of it is that the PSM2 provider has an FI_PSM2_NAME_SERVER
parameter that is automatically toggled if libfabric detects the
presence of OpenMPI or MPICH (see
https://ofiwg.github.io/libfabric/v1.5.0/man/fi_psm2.7.html). I was
wondering whether the verbs provider does anything similar, subtly
changing its behavior when it thinks MPI is being used, which would
explain why my OFI test code works OK in a standalone program but not
within an MPI program.
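For the PSM2 case, one way to take that auto-detection out of the picture
would be to pin the variable explicitly before MPI or libfabric is
initialized, so that standalone and MPI runs start from the same setting.
This is only a minimal sketch, assuming POSIX setenv() is available and
that "1" means "name server on" as the fi_psm2 man page describes;
pin_env_default() and pin_ofi_settings() are hypothetical helper names,
not part of libfabric or MPI, and nothing here is known to apply to the
verbs provider.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: set an environment variable only if the user has
 * not already set it. Returns 0 on success, -1 on failure (the same
 * convention as POSIX setenv). */
static int pin_env_default(const char *name, const char *value)
{
    /* overwrite = 0 preserves any value the user exported beforehand */
    return setenv(name, value, 0);
}

/* Hypothetical helper: call this before MPI_Init() and before the first
 * libfabric call, so the provider reads an explicit value instead of
 * choosing a default based on whether it detects an MPI launcher.
 * Assumption: "1" keeps the PSM2 name server enabled. */
static int pin_ofi_settings(void)
{
    return pin_env_default("FI_PSM2_NAME_SERVER", "1");
}
```

Calling pin_ofi_settings() as the first statement in main() would be
enough; a value already exported in the shell still takes precedence.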
At any rate, it was just a shot in the dark. I'll isolate the problem
properly and come back (or ask on the rdma or mvapich mailing lists as
appropriate) if I get stuck.
thanks!
-Phil