[libfabric-users] dpdk mlx5drv vs. openfabric

Hefty, Sean sean.hefty at intel.com
Tue Mar 28 09:38:48 PDT 2023


> Consider a Mellanox ConnectX-5 NIC on a bare-metal Linux Xeon E-2378G with 64 GB RAM. This
> configuration uses the mlx5 driver. DPDK can move Ethernet UDP packets in well under 1 us
> each. Openfabric and libibverbs' perftest seem to plateau around ~1.5 us each. Mlx5 can be
> more performant than a 100% classic ibverbs pass-through by directly accessing the
> hardware.
> 
> I’m new here. Is there some perspective out there on openfabric v. DPDK w.r.t.
> performance and goals for its future?

DPDK targets a very different use case than libfabric.  DPDK focuses on *packet* processing -- think of implementing a router or firewall.  You're usually dealing with processes running in a privileged environment, and NICs configured for DPDK traffic frequently cannot be used outside of the DPDK environment (e.g., for TCP/IP sockets).  I know this is the case for Intel NICs; I don't know about Nvidia NICs.
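To make the contrast concrete, here's a rough sketch of the DPDK receive model.  This is only an illustration, and it assumes the EAL arguments, mempool, and port/queue configuration (rte_eth_dev_configure and friends) are done elsewhere -- the point is that the application sees raw frames and nothing more:

/* Minimal sketch of DPDK-style packet handling.  Assumes a single port/queue
 * has already been configured and started elsewhere; the app sees individual
 * rte_mbuf buffers and owns everything above the Ethernet/UDP layer itself. */
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

int main(int argc, char **argv)
{
	if (rte_eal_init(argc, argv) < 0)
		return -1;

	uint16_t port_id = 0;	/* assumption: first DPDK-bound port */

	for (;;) {
		struct rte_mbuf *pkts[BURST_SIZE];
		uint16_t n = rte_eth_rx_burst(port_id, 0, pkts, BURST_SIZE);

		for (uint16_t i = 0; i < n; i++) {
			/* Inspect/route/forward the raw frame here; the
			 * framework provides no reliability, ordering, or
			 * segmentation on top of it. */
			rte_pktmbuf_free(pkts[i]);
		}
	}
	return 0;
}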

Libfabric is a communication API targeting HPC/AI middleware and storage applications.  It exposes transport level *message* semantics to applications, not raw packets.  It's hiding flow control, reliability, segmentation and reassembly of large messages, packet reordering, and other details.
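As a (hedged) example of what that API surface looks like, a minimal libfabric program that asks for reliable-datagram message endpoints might start like this.  The FI_EP_RDM/FI_MSG hints are the important part; the provider it discovers is responsible for the transport details:

/* Minimal sketch of the libfabric model, assuming a provider that supports
 * reliable datagram (FI_EP_RDM) endpoints.  The application asks for message
 * semantics; reliability, ordering, and segmentation/reassembly of large
 * messages are the provider's problem, not the application's. */
#include <stdio.h>
#include <rdma/fabric.h>

int main(void)
{
	struct fi_info *hints, *info;
	int ret;

	hints = fi_allocinfo();
	if (!hints)
		return -1;

	hints->ep_attr->type = FI_EP_RDM;	/* reliable, unconnected messages */
	hints->caps = FI_MSG;			/* send/recv message semantics */

	ret = fi_getinfo(FI_VERSION(1, 9), NULL, NULL, 0, hints, &info);
	if (ret) {
		fprintf(stderr, "fi_getinfo failed: %d\n", ret);
		fi_freeinfo(hints);
		return ret;
	}

	/* From here an app would open fabric/domain/endpoint objects and post
	 * fi_send()/fi_recv() operations of arbitrary size; the provider
	 * handles flow control and reassembly underneath. */
	printf("provider: %s\n", info->fabric_attr->prov_name);

	fi_freeinfo(info);
	fi_freeinfo(hints);
	return 0;
}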

Libfabric targets doing this at scale for large parallel applications.  So, 2 node ping-pong results are not a realistic measurement of application level performance.  With verbs devices, applications using classic libibverbs APIs do have direct access to the hardware.  But apps commonly use reliable-connected queue pairs, which provide the transport features of reliability, packet ordering, segmentation and reassembly, etc.  Apps using UDP packets would need to implement all of those features themselves in SW if running over DPDK...

...unless the app doesn't need those features, which is true for DPDK apps that are simply trying to handle packets individually.
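For comparison on the verbs side, this is roughly where those transport features come from -- a sketch that assumes the device context, PD, and CQ are already opened (ibv_open_device, ibv_alloc_pd, ibv_create_cq); requesting IBV_QPT_RC is what puts reliability, ordering, and segmentation in the hardware transport rather than in the app:

/* Minimal sketch: create a reliable-connected (RC) queue pair with libibverbs.
 * With IBV_QPT_RC the transport provides reliability, ordering, and
 * segmentation; a UDP/DPDK design would have to rebuild those in software. */
#include <infiniband/verbs.h>

struct ibv_qp *create_rc_qp(struct ibv_pd *pd, struct ibv_cq *cq)
{
	struct ibv_qp_init_attr attr = {
		.send_cq = cq,
		.recv_cq = cq,
		.qp_type = IBV_QPT_RC,		/* reliable connected transport */
		.cap = {
			.max_send_wr  = 64,
			.max_recv_wr  = 64,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};

	return ibv_create_qp(pd, &attr);	/* NULL on failure */
}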

- Sean
