[libfabric-users] questions on verbs provider for RDMA

Xiong, Jianxin jianxin.xiong at intel.com
Wed Jun 26 11:07:24 PDT 2024


Connectionless is the semantics the application sees. The provider may need to make connections internally in order to support a connectionless API. That’s the case for ofi_rxm -- because it is RDM over MSG, exactly as you said.

From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Sent: Wednesday, June 26, 2024 10:50 AM
To: Xiong, Jianxin <jianxin.xiong at intel.com>; libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: Re: questions on verbs provider for RDMA

Hi,
A follow-up question: when we run a test for RDM, it will be a reliable, unconnected endpoint. Does that mean there will be no ConnectRequest/ConnectReply, just a direct send?
When using ofi_rxm, we see these connection requests. Is this because it is RDM over MSG?





From: Xiong, Jianxin <jianxin.xiong at intel.com>
Date: Tuesday, June 25, 2024 at 6:13 PM
To: Niyaz Murshed <Niyaz.Murshed at arm.com>, libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: RE: questions on verbs provider for RDMA

That’s right. The RDM support in early versions of the verbs provider was separated to become the rxm provider. Today, verbs+ofi_rxm is the preferred choice for RDMA.

From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Sent: Tuesday, June 25, 2024 4:08 PM
To: Xiong, Jianxin <jianxin.xiong at intel.com>; libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: Re: questions on verbs provider for RDMA

Thanks a lot. I can see the RRoCE packets now with rxm.

The fi_verbs(7) man page <https://ofiwg.github.io/libfabric/v1.6.1/man/fi_verbs.7.html> says:

"New change in libfabric v1.6: FI_EP_RDM is supported through the OFI RxM utility provider. This is done automatically when the app requests FI_EP_RDM endpoint. Please refer the man page for RxM provider to learn more. The provider’s internal support for RDM endpoints is deprecated and would be removed from libfabric v1.7 onwards."

Does this mean that RDM is no longer available from the verbs provider on its own?
Can we assume that RXM is now the preferred method for RDMA applications?

Regards,
Niyaz

From: Xiong, Jianxin <jianxin.xiong at intel.com>
Date: Tuesday, June 25, 2024 at 5:55 PM
To: Niyaz Murshed <Niyaz.Murshed at arm.com>, libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: RE: questions on verbs provider for RDMA

You explicitly asked for dgram (the domain name ‘mlx5_1-dgram’ only supports FI_EP_DGRAM). Use “-d mlx5_1” instead to enable rxm.

From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Sent: Tuesday, June 25, 2024 3:22 PM
To: Xiong, Jianxin <jianxin.xiong at intel.com>; libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: Re: questions on verbs provider for RDMA

Thank you Jianxin for the reply. Really appreciate it.

I ran the test below and took a packet capture.

S: fi_rdm_tagged_bw -s 192.168.1.100 -d mlx5_1-dgram -I 1 -S 1024

C: fi_rdm_tagged_bw -s 192.168.1.200 192.168.1.100 -d mlx5_3-dgram -I 1 -S 1024




The protocol I see is RoCE (not RRoCE). Is this expected? Pcap attached.
Also, there are no connection messages anymore, just Send Only. Looking at https://www.youtube.com/watch?v=8Cp2KBS4Q4g&t=412s , I am assuming that is expected? Are all communications done over “send”?

On some servers, I don’t see ofi_rxm, only ofi_rxd. Are there any additional dependencies required for rxm?

From: Xiong, Jianxin <jianxin.xiong at intel.com>
Date: Tuesday, June 25, 2024 at 4:48 PM
To: Niyaz Murshed <Niyaz.Murshed at arm.com>, libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: RE: questions on verbs provider for RDMA

All three have Verbs underneath and work over IB or RoCE network. Which provider to use depends on the OFI endpoint type requested by the application.

The verbs provider supports FI_EP_MSG and FI_EP_DGRAM. FI_EP_MSG maps directly to Verbs RC. Each endpoint maps to an RC QP. Explicit connection setup is needed before communication can happen. Multiple endpoints are needed in order to talk to different peers. FI_EP_DGRAM maps directly to Verbs UD. It’s connectionless, but is unreliable, and message size is limited by MTU size.

ofi_rxm is a utility provider that runs on top of the verbs provider (using the FI_EP_MSG type) and provides connectionless semantics (FI_EP_RDM). Under the covers, each ofi_rxm endpoint maps to verbs endpoints, and connections are established automatically on demand. You don’t need to ask for the combination explicitly. When the application asks for the “FI_EP_RDM” endpoint type, “verbs;ofi_rxm” is selected automatically.
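
The on-demand connection behavior described above can be sketched with a toy model. This is not libfabric code: the class names, the wire_log list, and the addresses are hypothetical, and the ConnectRequest/ConnectReply labels are only borrowed from the packet captures discussed in this thread. It shows an RDM-style send() that looks connectionless to the caller while a connection is set up underneath on first use of each peer.

```python
class MsgConnection:
    """Stand-in for one connected FI_EP_MSG endpoint (hypothetical model)."""

    def __init__(self, peer):
        self.peer = peer
        self.sent = []

    def send(self, payload):
        self.sent.append(payload)


class RdmEndpoint:
    """Toy RDM-over-MSG layer: connectionless API, connections made lazily."""

    def __init__(self):
        self.connections = {}  # peer address -> MsgConnection
        self.wire_log = []     # rough picture of what a capture might show

    def send(self, peer, payload):
        conn = self.connections.get(peer)
        if conn is None:
            # First message to this peer: connect without involving the caller.
            self.wire_log.append(("ConnectRequest", peer))
            self.wire_log.append(("ConnectReply", peer))
            conn = MsgConnection(peer)
            self.connections[peer] = conn
        conn.send(payload)
        self.wire_log.append(("Send", peer))


ep = RdmEndpoint()
ep.send("192.168.1.100", b"hello")  # triggers a connection, then sends
ep.send("192.168.1.100", b"again")  # reuses the existing connection
ep.send("192.168.1.200", b"hi")     # new peer, new connection
```

In this model the application never calls a connect function, yet the log still contains one connection handshake per peer, which matches the connection requests seen on the wire when ofi_rxm is in use.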

ofi_rxd is a utility provider that runs on top of the verbs provider (using the FI_EP_DGRAM type) and presents FI_EP_RDM support. Its functionality is limited compared to ofi_rxm, so it is usually not the first choice.

In summary, an application only needs to ask for the “verbs” provider for RDMA. If an application wants to manage the connection setup by itself, it can ask for ep_type FI_EP_MSG and get the “bare” verbs provider. The application is then responsible for setting up the connections between endpoints by calling fi_passive_ep(), fi_listen(), fi_connect(), and fi_accept(). If an application doesn’t want to manage connection setup, it can ask for ep_type FI_EP_RDM and get the “verbs;ofi_rxm” provider. The application then needs to obtain the endpoint address with fi_getname(), exchange the addresses with peers using an out-of-band mechanism, and insert the addresses into an address vector. Future communication will use addresses from the address vector as the identifier for the peer.
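
As a compact restatement of the mapping described in this message, here is a small lookup table. It is only an illustration: the endpoint-type and provider strings are the ones used above, while the Stack tuple, stacks_for(), and app_manages_connections() are hypothetical helpers and not part of libfabric.

```python
from collections import namedtuple

# provider: what gets selected; app_ep_type: what the application asks for;
# core_ep_type: the verbs endpoint type used underneath; verbs_transport: RC or UD.
Stack = namedtuple("Stack", "provider app_ep_type core_ep_type verbs_transport")

# Listed in the order of preference given in this message.
STACKS = [
    Stack("verbs", "FI_EP_MSG", "FI_EP_MSG", "RC"),
    Stack("verbs", "FI_EP_DGRAM", "FI_EP_DGRAM", "UD"),
    Stack("verbs;ofi_rxm", "FI_EP_RDM", "FI_EP_MSG", "RC"),
    Stack("verbs;ofi_rxd", "FI_EP_RDM", "FI_EP_DGRAM", "UD"),
]


def stacks_for(ep_type):
    """Return the stacks that can serve an endpoint type, preferred first."""
    return [s for s in STACKS if s.app_ep_type == ep_type]


def app_manages_connections(stack):
    """Only a bare FI_EP_MSG endpoint leaves connection setup to the application."""
    return stack.app_ep_type == "FI_EP_MSG"


for ep_type in ("FI_EP_MSG", "FI_EP_DGRAM", "FI_EP_RDM"):
    first = stacks_for(ep_type)[0]
    print(ep_type, "->", first.provider, "over Verbs", first.verbs_transport)
```

Reading the table row by row gives the same answer as the prose: ask for FI_EP_RDM and the first match is verbs;ofi_rxm, which sits on RC connections that the application does not manage itself.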

-Jianxin

From: Libfabric-users <libfabric-users-bounces at lists.openfabrics.org> On Behalf Of Niyaz Murshed
Sent: Tuesday, June 25, 2024 2:08 PM
To: libfabric-users at lists.openfabrics.org
Cc: nd <nd at arm.com>
Subject: [libfabric-users] questions on verbs provider for RDMA

Hi all,

I am trying to understand the verbs provider, especially the difference between the following:



  1.  Verbs
  2.  Verbs;ofi_rxd
  3.  Verbs;ofi_rxm


They seem to work on three different protocols:




FI_PROTO_RDMA_CM_IB_RC
The protocol runs over Infiniband reliable-connected queue pairs, using the RDMA CM protocol for connection establishment.

FI_PROTO_RXM
Reliable-datagram protocol implemented over message endpoints. RXM is a libfabric utility component that adds RDM endpoint semantics over MSG endpoint semantics.

FI_PROTO_RXD
Reliable-datagram protocol implemented over datagram endpoints. RXD is a libfabric utility component that adds RDM endpoint semantics over DGRAM endpoint semantics.




From my test of FI_PROTO_RDMA_CM_IB_RC, I see in Wireshark that it uses the RoCEv2 protocol when I test an RDMA application.
What protocol do RXM and RXD use? I see TCP packets on the wire. Does that mean they use TCP?
Is it possible to use RoCEv2 for RXM and RXD?
What is the best provider to use for RDMA?

Regards,
Niyaz