[libfabric-users] Can only use one NIC port in libfabric 1.6.1

Jörn Schumacher joern.schumacher at cern.ch
Tue Aug 21 02:17:04 PDT 2018


Dear libfabric developers,

I recently updated to libfabric 1.6.1 (from 1.4). It looks like in this 
release we can only use one port of our NIC (Mellanox ConnectX-5 with RoCE).

On the receiving side we listen for RC (reliable connection) requests. 
We monitor the event queue with a file descriptor + epoll. On one port 
of the NIC this works fine, but if the request comes in on the second 
port (on a different IP subnet) it fails: we get an epoll notification, 
but the subsequent fi_eq_sread(...) call returns -FI_EAGAIN.

I open a single domain. This worked fine with the earlier libfabric 
release (1.4). From a quick read of the documentation, I understand that 
a domain is tied to a port. Does this mean I need to open one domain per 
port?


Thanks and best regards,
Jörn

