[libfabric-users] Failure of fi_getinfo with verbs backend
Jörn Schumacher
joern.schumacher at cern.ch
Thu Jun 23 07:39:47 PDT 2016
Hello,
I am trying to run a libfabric program with the verbs provider on a new
system. The program works on a local test machine, but on the new system
fi_getinfo fails. When I run fi_info on the new machine, the verbs
provider is there and seems to have the same configuration as on my
local testbed. The only difference is that addr_format on the new
machine is FI_SOCKADDR_IN, whereas locally it is FI_SOCKADDR_IN6, but I
use IPv4 addressing anyway.
The output of fi_info and a small test program that reproduces the
error are available here:
https://gist.github.com/joerns/c0631a32840b96d25380cc1e91a1e7a0
Here is the relevant code snippet:
> hints = fi_allocinfo();
> hints->ep_attr->type = FI_EP_MSG;
> hints->caps = FI_MSG;
> hints->mode = FI_LOCAL_MR;
>
> if ((ret = fi_getinfo(FI_VERSION(1, 1), "127.0.0.1", "12345", 0, hints, &fi)))
> {
>     ERROR("fi_getinfo failed: %d '%s'", ret, fi_strerror(-ret));
> }
This fails with:
> error: fi_getinfo failed: -61 'No data available'
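For reference, here is a minimal sketch of how the return code can be decoded without libfabric. As far as I understand, -61 is the negated ENODATA errno value on Linux, which libfabric exposes as FI_ENODATA and which fi_getinfo returns when no provider matches the hints. The helper names describe_getinfo_ret and is_no_match are hypothetical and not part of libfabric, and the value 61 is specific to Linux.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper, not part of libfabric: turn a negative
 * fi_getinfo-style return code into the C library's message text.
 * Assumes the code is a negated errno value. */
const char *describe_getinfo_ret(int ret)
{
    return strerror(-ret);
}

/* Hypothetical helper: true if the return code is the negated
 * ENODATA value (61 on Linux). */
int is_no_match(int ret)
{
    return ret == -ENODATA;
}
```

On Linux, describe_getinfo_ret(-61) should give the same 'No data available' text as the error above, and is_no_match(-61) should be true.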
The output of fi_info suggests the verbs provider has the capabilities I
request. Oddly enough, the call works on the server side, where the only
difference is that FI_SOURCE is passed in the flags argument of
fi_getinfo.
What could cause fi_getinfo to fail here?
Thanks a lot,
Jörn