[libfabric-users] Verbs provider not permitting FI_EP_MSG

Steve Welch swelch at systemfabricworks.com
Thu Jan 16 09:28:07 PST 2020



> On Jan 16, 2020, at 11:03 AM, Philip Davis <philip.e.davis at rutgers.edu> wrote:
> 
> Hi Steve,
> 
> Thanks for the quick response.
> 
> I am expecting to use the rxm provider for verbs, but in fi_info I do not see an FI_EP_MSG-type verbs provider.

Could you provide the output for “ibv_devinfo -v” and “lsmod | grep ib”?

Steve

> 
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: tcp;ofi_rxm
>     fabric: TCP-IP
>     domain: tcp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXM
> provider: verbs;ofi_rxd
>     fabric: IB-0xfe80000000000000
>     domain: mlx4_0-dgram
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: UDP;ofi_rxd
>     fabric: UDP-IP
>     domain: udp
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_RXD
> provider: verbs
>     fabric: IB-0xfe80000000000000
>     domain: mlx4_0-dgram
>     version: 1.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_IB_UD
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: UDP
>     fabric: UDP-IP
>     domain: udp
>     version: 1.1
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_UDP
> provider: sockets
>     fabric: 10.1.0.0/16
>     domain: em1
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 10.1.0.0/16
>     domain: em1
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 10.1.0.0/16
>     domain: em1
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 10.157.14.0/24
>     domain: em2
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 10.157.14.0/24
>     domain: em2
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 10.157.14.0/24
>     domain: em2
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em1
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em1
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em1
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em2
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em2
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: fe80::/64
>     domain: em2
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 127.0.0.0/8
>     domain: lo
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 127.0.0.0/8
>     domain: lo
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: 127.0.0.0/8
>     domain: lo
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: ::1/128
>     domain: lo
>     version: 2.0
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: ::1/128
>     domain: lo
>     version: 2.0
>     type: FI_EP_DGRAM
>     protocol: FI_PROTO_SOCK_TCP
> provider: sockets
>     fabric: ::1/128
>     domain: lo
>     version: 2.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: tcp
>     fabric: TCP-IP
>     domain: tcp
>     version: 0.1
>     type: FI_EP_MSG
>     protocol: FI_PROTO_SOCK_TCP
> provider: shm
>     fabric: shm
>     domain: shm
>     version: 1.0
>     type: FI_EP_RDM
>     protocol: FI_PROTO_SHM
> 
> Thanks,
> Philip
> 
>> On Jan 16, 2020, at 11:03 AM, Steve Welch <swelch at systemfabricworks.com> wrote:
>> 
>> Hi Philip,
>> 
>> Since you are specifying FI_EP_RDM in your hints, I assume you want to use the RXM provider on top of the Verbs core provider (i.e. ofi_rxm;verbs). The Verbs provider does not offer native FI_EP_RDM support. To use either XRC or an FI_EP_RDM endpoint you will have to use RXM, but I am unaware of any IB provider that supported XRC that did not support RC.
>> 
>> If you issue a 'fi_info -p verbs -v’ it will list all the verbs domains supported and the underlying protocol and you could verify if RC should be supported (via RXM for FI_EP_RDM). If you issue 'fi_info -p “ofi_rxm;verbs”', you should see multiple domains for the “ofi_rxm;verbs” provider combination. XRC domains have the “-xrc” suffix.
>> 
>> If you must use XRC and the RXM/Verbs combination then you will need to set the environment variable FI_OFI_RXM_USE_SRX=1 and RXM will handle the shared RX details.
>> 
>> Steve
>> 
>> 
>>> On Jan 16, 2020, at 8:56 AM, Philip Davis <philip.e.davis at rutgers.edu> wrote:
>>> 
>>> Hello,
>>> 
>>> I am working with a user who is running on an older InfiniBand cluster. Using libfabric with the following hints:
>>> 
>>>    hints->caps = FI_MSG | FI_SEND | FI_RECV | FI_REMOTE_READ |
>>>                  FI_REMOTE_WRITE | FI_RMA | FI_READ | FI_WRITE;
>>>    hints->mode = FI_CONTEXT | FI_LOCAL_MR | FI_CONTEXT2 | FI_MSG_PREFIX |
>>>                  FI_ASYNC_IOV | FI_RX_CQ_DATA;
>>>    hints->domain_attr->mr_mode = FI_MR_BASIC;
>>>    hints->domain_attr->control_progress = FI_PROGRESS_AUTO;
>>>    hints->domain_attr->data_progress = FI_PROGRESS_AUTO;
>>>    hints->ep_attr->type = FI_EP_RDM;
>>> 
>>> 
>>> No verbs providers are found. Looking through the debug output, I suspect this is the crucial line:
>>> 
>>> libfabric:verbs:fabric:fi_ibv_get_matching_info():1213<info> hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints
>>> 
>>> I take it that the underlying hardware is only compatible with FI_PROTO_RDMA_CM_IB_XRC protocol for MSG endpoints, and it looks like I need to have FI_SHARED_CONTEXT enabled for these endpoints to be supported. I’m having some trouble understanding the implications of using FI_SHARED_CONTEXT. If I only ever use one endpoint, is there any functional or performance impact to setting this? I’d rather not change to using shared contexts unconditionally, so is there a good way for me to detect this situation other than to do a maximally permissive fi_getinfo and iterate through the verbs results?
>>> 
>>> Thanks,
>>> Philip
>>> _______________________________________________
>>> Libfabric-users mailing list
>>> Libfabric-users at lists.openfabrics.org
>>> https://lists.openfabrics.org/mailman/listinfo/libfabric-users
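The fi_info listing quoted in this thread is long enough that checking by eye for a verbs entry of type FI_EP_MSG is error-prone. The following is a minimal sketch of a text scan over output in that format. The function names (skip_prefix, field_value, count_entries) are hypothetical helpers written for this illustration, not part of libfabric, and the scan assumes each entry has a "provider:" line followed by a "type:" line, as in the listing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Skip leading mail quote markers ('>'), spaces and tabs. */
static const char *skip_prefix(const char *s)
{
    while (*s == '>' || *s == ' ' || *s == '\t')
        s++;
    return s;
}

/* If the line starts with key (after its prefix), copy the first
 * whitespace-delimited token that follows into out and return true. */
static bool field_value(const char *line, const char *key,
                        char *out, size_t outsz)
{
    const char *p = skip_prefix(line);
    size_t klen = strlen(key);

    if (strncmp(p, key, klen) != 0)
        return false;
    p += klen;
    while (*p == ' ' || *p == '\t')
        p++;
    size_t n = strcspn(p, " \t\r\n");
    if (n >= outsz)
        n = outsz - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return true;
}

/* Count entries whose provider name is exactly prov and whose endpoint
 * type is exactly ep_type. Layered names such as "verbs;ofi_rxd" do not
 * match a query for plain "verbs". */
int count_entries(const char *text, const char *prov, const char *ep_type)
{
    char line[256], cur[128] = "", val[128];
    int count = 0;

    assert(text != NULL && prov != NULL && ep_type != NULL);
    while (*text != '\0') {
        size_t len = strcspn(text, "\n");
        size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

        memcpy(line, text, n);
        line[n] = '\0';
        text += len;
        if (*text == '\n')
            text++;

        if (field_value(line, "provider:", val, sizeof(val))) {
            /* Remember which provider the following fields belong to. */
            snprintf(cur, sizeof(cur), "%s", val);
        } else if (field_value(line, "type:", val, sizeof(val))) {
            if (strcmp(cur, prov) == 0 && strcmp(val, ep_type) == 0)
                count++;
        }
    }
    return count;
}
```

Applied to the listing quoted above, count_entries(output, "verbs", "FI_EP_MSG") would be expected to return 0, because the only plain verbs entry there has type FI_EP_DGRAM. That matches the symptom described in the thread.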

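The FI_OFI_RXM_USE_SRX=1 setting mentioned in the thread is normally exported in the shell or job script. If it has to be set from inside the application instead, something like the sketch below may work. It assumes that libfabric reads the variable when its providers are initialized, so the call would need to happen before the first libfabric call such as fi_getinfo. That ordering is an assumption here, not something confirmed in the thread. The helper name enable_rxm_srx is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: request shared RX contexts in RXM by setting
 * FI_OFI_RXM_USE_SRX=1, unless the variable is already present in the
 * environment. Returns 0 on success, -1 on failure (see errno). */
int enable_rxm_srx(void)
{
    /* overwrite = 0 keeps any value the user or job script exported. */
    return setenv("FI_OFI_RXM_USE_SRX", "1", 0);
}
```

Passing 0 as the overwrite argument is a deliberate choice in this sketch: a user who exported FI_OFI_RXM_USE_SRX=0 to disable the feature keeps that setting.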