[ofiwg] noob questions
Byrne, John (Labs)
john.l.byrne at hpe.com
Wed Nov 13 11:26:21 PST 2019
You mention only the dgram and msg endpoint types, but the mtl_ofi component requires rdm. If your provider doesn't support rdm, I would have expected your getinfo routine to return error -61 (-FI_ENODATA). You can try layering the ofi_rxm utility provider over your provider to add rdm support: take "--mca mtl_ofi_provider_include verbs;ofi_rxm" and replace verbs with your provider's name.
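To make the expected getinfo behavior concrete, here is a minimal sketch in Python. It is only a model: the names mirror libfabric concepts but are plain stand-ins, not the libfabric API, and the supported set simply reflects the dgram and msg types mentioned in the original message. The value 61 is what ENODATA is on Linux.

```python
# Illustrative model only: these names mirror libfabric concepts but are
# plain Python stand-ins, not calls into libfabric.
FI_ENODATA = 61  # the "error -61" mentioned above (ENODATA on Linux)

# Assumption: the provider supports only the two endpoint types from the question.
SUPPORTED_EP_TYPES = {"FI_EP_MSG", "FI_EP_DGRAM"}


def getinfo(requested_ep_type=None):
    """Return (status, ep_types): 0 plus the matching types, or -FI_ENODATA and []."""
    if requested_ep_type is None:
        # No endpoint-type hint: report everything the provider supports.
        return 0, sorted(SUPPORTED_EP_TYPES)
    if requested_ep_type in SUPPORTED_EP_TYPES:
        return 0, [requested_ep_type]
    # Nothing matches the hint, so report "no data" rather than an unusable entry.
    return -FI_ENODATA, []


status, matches = getinfo("FI_EP_RDM")
print(status, matches)
```

In this model a request for rdm fails cleanly, which is the situation where a layered provider such as ofi_rxm would be needed to supply rdm on top of msg.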
Open MPI transport selection is complex. Adding insane levels of verbosity can help you understand what is happening. I tend to use: --mca mtl_base_verbose 100 --mca btl_base_verbose 100 --mca pml_base_verbose 100
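Putting the two suggestions together, a small Python helper can assemble the full mpirun command line. This is a sketch under assumptions: the provider name "myprov" and the test program "./pingpong" are hypothetical placeholders, and the process count of 2 is just a typical ping-pong setup. The MCA parameters are the ones quoted in this message.

```python
import shlex


def build_mpirun_command(provider, program, nprocs=2, verbosity=100):
    """Build an mpirun argument list with the provider filter and verbose flags."""
    cmd = ["mpirun", "-np", str(nprocs)]
    # Restrict mtl_ofi to the given provider layered under ofi_rxm.
    cmd += ["--mca", "mtl_ofi_provider_include", f"{provider};ofi_rxm"]
    # Turn up verbosity for the mtl, btl, and pml frameworks.
    for framework in ("mtl", "btl", "pml"):
        cmd += ["--mca", f"{framework}_base_verbose", str(verbosity)]
    cmd.append(program)
    return cmd


# "myprov" and "./pingpong" are hypothetical placeholders.
cmd = build_mpirun_command("myprov", "./pingpong")
print(shlex.join(cmd))
```

shlex.join (Python 3.8+) quotes the provider list when printing, which matters because an unquoted semicolon would otherwise be treated as a command separator by the shell.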
From: ofiwg [mailto:ofiwg-bounces at lists.openfabrics.org] On Behalf Of Don Fry
Sent: Wednesday, November 13, 2019 10:54 AM
To: ofiwg at lists.openfabrics.org
Subject: [ofiwg] noob questions
I have written a libfabric provider for our hardware and it passes all the fabtests I expect it to (dgram and msg). I am trying to run some MPI tests using libfabric under Open MPI (4.0.2). When I run a simple ping-pong test using mpirun, it sends and receives the messages over TCP/IP. It does call my fi_getinfo routine, but it doesn't use my provider's send/receive routines. I have rebuilt the libfabric library with the sockets provider disabled, then also with --disable-tcp, then --disable-udp, and fi_info reports fewer and fewer providers until it lists only my provider, but each time I run the MPI test, it still uses IP to exchange messages.
When I configured Open MPI I specified --with-libfabric=/usr/local/ and the libfabric library is being loaded and executed.
I am probably doing something obviously wrong, but I don't know enough about MPI or maybe libfabric, so need some help. If this is the wrong list, redirect me.