[libfabric-users] subtle difference between sockets and tcp; ofi_rxm
sean.hefty at intel.com
Fri Apr 30 09:24:58 PDT 2021
> When I use tcp;ofi_rxm if I start the client node first and send a message to the
> master, the message fails with (ret == -FI_EAGAIN) and unfortunately, if I keep
> retrying whilst starting the master node, the message does not ever complete.
This is a bug. It's supposed to work.
> Is there anything I can do to make the tcp version behave the same way as the sockets
We need to figure out what the problem is and why the connection isn't retried.
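
For reference, the retry pattern being discussed looks roughly like the sketch below. It is only an illustration of the application side and does not address the provider bug. Everything in it is a stand-in: DEMO_EAGAIN substitutes for FI_EAGAIN from <rdma/fi_errno.h>, and send_with_retry, fake_try_send, fake_progress and struct fake_peer are hypothetical names. In a real application the two callbacks would presumably wrap fi_send() and a completion queue read such as fi_cq_read(), so that the provider gets a chance to make progress on the connection between attempts.

```c
#include <stddef.h>

/* Stand-in for libfabric's FI_EAGAIN (assumption: real code includes
 * <rdma/fi_errno.h> and compares against -FI_EAGAIN instead). */
#define DEMO_EAGAIN 11

/* Hypothetical callback types: one posts the send, one drives progress. */
typedef int (*try_send_fn)(void *state);
typedef void (*progress_fn)(void *state);

/* Retry the send while it reports "try again", driving progress between
 * attempts. Returns the last result from try_send: 0 on success, a
 * negative value on failure, or -DEMO_EAGAIN if the attempts run out. */
static int send_with_retry(try_send_fn try_send, progress_fn progress,
                           void *state, int max_attempts)
{
    int ret = -DEMO_EAGAIN;

    for (int i = 0; i < max_attempts; i++) {
        ret = try_send(state);
        if (ret != -DEMO_EAGAIN)
            return ret;      /* success or a hard error */
        progress(state);     /* give the connection a chance to advance */
    }
    return ret;
}

/* Fake peer used only to exercise the loop: the send is accepted once
 * progress has been driven polls_until_up times. */
struct fake_peer {
    int polls_until_up;
    int attempts;
};

static int fake_try_send(void *s)
{
    struct fake_peer *p = s;

    p->attempts++;
    return p->polls_until_up > 0 ? -DEMO_EAGAIN : 0;
}

static void fake_progress(void *s)
{
    struct fake_peer *p = s;

    if (p->polls_until_up > 0)
        p->polls_until_up--;
}
```

Bounding the number of attempts is a design choice for the sketch, so that a peer that never comes up shows up as an error the caller can report instead of an endless loop.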