[ewg] [PATCH 0/8 v3] RDMAoE support
Eli Cohen
eli at dev.mellanox.co.il
Mon Jul 13 11:13:10 PDT 2009
RDMA over Ethernet (RDMAoE) allows running the IB transport protocol
over Ethernet frames, enabling the deployment of IB semantics on
lossless Ethernet fabrics. RDMAoE packets are standard Ethernet frames
with an IEEE-assigned Ethertype, a GRH, unmodified IB transport
headers and payload. Aside from the considerations pointed out below,
RDMAoE ports are functionally equivalent to regular IB ports from the
RDMA stack perspective.
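As a rough illustration of the framing described above, the sketch below
lays out the headers in wire order and computes where each one starts.
The struct and function names are hypothetical and do not come from this
series. The sizes are the standard ones: a 14-byte untagged Ethernet
header, a 40-byte GRH and a 12-byte BTH. The Ethertype value is
deliberately left out, and any extended transport headers that follow
the BTH for a given opcode are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ethernet II header without a VLAN tag: 14 bytes. */
struct eth_hdr {
	uint8_t  dst[6];
	uint8_t  src[6];
	uint16_t ethertype;	/* big-endian on the wire; the IEEE-assigned RDMAoE value */
};

/* IB Global Route Header: 40 bytes, laid out like an IPv6 header. */
struct ib_grh_hdr {
	uint32_t version_tclass_flow;	/* IPVer, TClass, FlowLabel */
	uint16_t paylen;
	uint8_t  next_hdr;
	uint8_t  hop_limit;
	uint8_t  sgid[16];
	uint8_t  dgid[16];
};

/* IB Base Transport Header: 12 bytes. */
struct ib_bth_hdr {
	uint8_t  opcode;
	uint8_t  flags;		/* SE, M, PadCnt, TVer */
	uint16_t pkey;
	uint32_t dest_qp;	/* 8 reserved bits + 24-bit destination QP */
	uint32_t ack_psn;	/* A bit, 7 reserved bits, 24-bit PSN */
};

/* Fail the build if the compiler inserted padding. */
static_assert(sizeof(struct eth_hdr) == 14, "unexpected Ethernet header size");
static_assert(sizeof(struct ib_grh_hdr) == 40, "unexpected GRH size");
static_assert(sizeof(struct ib_bth_hdr) == 12, "unexpected BTH size");

/* Byte offset of the GRH from the start of the frame. */
size_t rdmaoe_grh_offset(void)
{
	return sizeof(struct eth_hdr);
}

/* Byte offset of the BTH from the start of the frame. */
size_t rdmaoe_bth_offset(void)
{
	return rdmaoe_grh_offset() + sizeof(struct ib_grh_hdr);
}

/* Byte offset of whatever follows the BTH (extended headers or payload). */
size_t rdmaoe_after_bth_offset(void)
{
	return rdmaoe_bth_offset() + sizeof(struct ib_bth_hdr);
}
```

A VLAN-tagged frame would shift every offset by 4 bytes; that case is
left out to keep the sketch short.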
IB subnet management and SA services are not required for RDMAoE
operation; Ethernet management practices are used instead. On
Ethernet, applications commonly refer to nodes by IP address. RDMAoE
encodes the IP addresses that were assigned to the corresponding
Ethernet port into its GIDs, and makes use of the IP stack to bind a
destination address to the corresponding netdevice (just as the CMA
does today for IB and iWARP) and to obtain its L2 MAC address.
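This cover letter does not spell out how an IP address is encoded into
a GID. The user-space sketch below shows one plausible layout only: it
assumes the IPv4-mapped IPv6 form (::ffff:a.b.c.d) placed in the 16-byte
GID. The struct gid type and the ipv4_to_gid() helper are hypothetical
names for illustration, and the encoding actually used by the patches
may differ.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A GID is 16 bytes, the same size as an IPv6 address. */
struct gid {
	uint8_t raw[16];
};

/*
 * ASSUMPTION: IPv4-mapped IPv6 layout (::ffff:a.b.c.d).
 * Returns 0 on success, -1 if 'ip' is not a valid dotted-quad string.
 */
int ipv4_to_gid(const char *ip, struct gid *out)
{
	struct in_addr addr;

	if (inet_pton(AF_INET, ip, &addr) != 1)
		return -1;

	memset(out->raw, 0, sizeof(out->raw));
	out->raw[10] = 0xff;
	out->raw[11] = 0xff;
	/* s_addr is already in network byte order, so copy it as-is. */
	memcpy(&out->raw[12], &addr.s_addr, sizeof(addr.s_addr));
	return 0;
}
```

With this layout, 192.168.1.10 ends up in the last four bytes of the
GID as c0 a8 01 0a, preceded by ff ff and ten zero bytes.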
The RDMA Verbs API is syntactically unmodified. For RDMAoE ports,
address handles must contain GIDs, and the L2 address fields in the
API are ignored. The Ethernet L2 information is then obtained by the
vendor-specific driver (in both kernel and user space) when modifying
QPs to RTR and creating address handles.
In order to maximize transparency for applications, RDMAoE implements
a dedicated API that provides services equivalent to some of those
provided by the IB-SA. The current approach is strictly local but may
evolve in the future. This API is implemented in a separate source
file, which allows the code to evolve without affecting the native
IB SA interfaces. We have successfully
tested MPI, SDP, RDS, and native Verbs applications over RDMAoE.
To enable RDMAoE with the mlx4 driver stack, both the mlx4_en and
mlx4_ib drivers must be loaded, and the netdevice for the
corresponding RDMAoE port must be running. Individual ports of a
multi-port HCA can be independently configured as Ethernet (with
support for RDMAoE) or IB, as is already the case.
Following is a series of 8 patches based on version 2.6.30 of the
Linux kernel. This new series reflects changes based on feedback from
the community on the previous set of patches. The whole series is
tagged v3.
Signed-off-by: Eli Cohen <eli at mellanox.co.il>
 drivers/infiniband/core/Makefile          |    2
 drivers/infiniband/core/addr.c            |   20
 drivers/infiniband/core/agent.c           |   12
 drivers/infiniband/core/cma.c             |  124 +++
 drivers/infiniband/core/mad.c             |   48 +
 drivers/infiniband/core/multicast.c       |   43 -
 drivers/infiniband/core/multicast.h       |   79 ++
 drivers/infiniband/core/rdmaoe_sa.c       |  942 ++++++++++++++++++++++++++++++
 drivers/infiniband/core/sa.h              |   24
 drivers/infiniband/core/sa_query.c        |   26
 drivers/infiniband/core/ud_header.c       |  111 +++
 drivers/infiniband/core/uverbs.h          |    1
 drivers/infiniband/core/uverbs_cmd.c      |   33 +
 drivers/infiniband/core/uverbs_main.c     |    1
 drivers/infiniband/core/verbs.c           |   17
 drivers/infiniband/hw/mlx4/ah.c           |  228 ++++++-
 drivers/infiniband/hw/mlx4/main.c         |  276 +++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h      |   30
 drivers/infiniband/hw/mlx4/qp.c           |  253 ++++++--
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    3
 drivers/net/mlx4/cmd.c                    |    6
 drivers/net/mlx4/en_main.c                |   15
 drivers/net/mlx4/en_port.c                |    4
 drivers/net/mlx4/en_port.h                |    3
 drivers/net/mlx4/intf.c                   |   20
 drivers/net/mlx4/main.c                   |    6
 drivers/net/mlx4/mlx4.h                   |    1
 include/linux/mlx4/cmd.h                  |    1
 include/linux/mlx4/device.h               |   31
 include/linux/mlx4/driver.h               |   16
 include/linux/mlx4/qp.h                   |    8
 include/rdma/ib_addr.h                    |   53 +
 include/rdma/ib_pack.h                    |   26
 include/rdma/ib_user_verbs.h              |   21
 include/rdma/ib_verbs.h                   |   22
 include/rdma/rdmaoe_sa.h                  |   66 ++
 36 files changed, 2333 insertions(+), 239 deletions(-)