[ewg] [PATCH 0/9] RDMAoE - RDMA over Ethernet

Eli Cohen eli at dev.mellanox.co.il
Mon Jun 15 00:34:08 PDT 2009


RDMA over Ethernet (RDMAoE) allows running the IB transport protocol over
Ethernet, providing IB capabilities for Ethernet fabrics. The packets are
standard Ethernet frames with an Ethertype, an IB GRH, unmodified IB transport
headers and payload. HCA RDMAoE ports are no different than regular IB ports
from the RDMA stack perspective.
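
As a rough illustration of the frame layout described above, the sketch below
computes where each header would start in such a frame. It is not taken from
the patches. The names (SKETCH_*, rdmaoe_*) are hypothetical, and the Ethertype
is left as a caller-supplied parameter because its value is not stated here.
The header sizes are the usual ones: 14 bytes for an untagged Ethernet header,
40 bytes for the GRH and 12 bytes for the BTH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_HDR_LEN 14 /* dst MAC (6) + src MAC (6) + Ethertype (2) */
#define SKETCH_GRH_LEN     40 /* IB Global Route Header */
#define SKETCH_BTH_LEN     12 /* IB Base Transport Header */

/* The GRH directly follows the (untagged) Ethernet header. */
static size_t rdmaoe_grh_offset(void)
{
	return SKETCH_ETH_HDR_LEN;
}

/* The unmodified IB transport headers start right after the GRH. */
static size_t rdmaoe_bth_offset(void)
{
	return SKETCH_ETH_HDR_LEN + SKETCH_GRH_LEN;
}

/* Smallest header footprint: Ethernet + GRH + BTH, before any
 * extended transport headers or payload. */
static size_t rdmaoe_min_frame_hdr_len(void)
{
	return SKETCH_ETH_HDR_LEN + SKETCH_GRH_LEN + SKETCH_BTH_LEN;
}

/* Hypothetical helper: write the Ethernet header into buf, which must
 * hold at least SKETCH_ETH_HDR_LEN bytes. The Ethertype is stored in
 * network byte order. Returns the number of bytes written. */
static size_t rdmaoe_put_eth_hdr(uint8_t *buf, const uint8_t dmac[6],
				 const uint8_t smac[6], uint16_t ethertype)
{
	assert(buf && dmac && smac);
	memcpy(buf, dmac, 6);
	memcpy(buf + 6, smac, 6);
	buf[12] = (uint8_t)(ethertype >> 8);
	buf[13] = (uint8_t)(ethertype & 0xff);
	return SKETCH_ETH_HDR_LEN;
}
```

A VLAN tag, if present, would shift the GRH and BTH offsets by 4 bytes; the
sketch ignores that case.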
    
IB subnet management and SA services are not required for RDMAoE operation;
Ethernet management practices are used instead. In Ethernet, nodes are commonly
referred to by applications by means of an IP address. RDMAoE treats IP
addresses that were assigned to the corresponding Ethernet port as GIDs, and
makes use of the IP stack to bind a destination address to the corresponding
netdevice (just as the CMA does today for IB and iWARP) and to obtain its L2
MAC addresses.
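
To make the "IP address as GID" idea concrete, here is a minimal sketch of one
possible way to place an IPv4 address into a 16-byte GID, using the standard
IPv4-mapped IPv6 form (::ffff:a.b.c.d). This encoding is an assumption made for
illustration only; the patches may represent the address differently. The type
and function names (gid_sketch, ipv4_to_gid_sketch) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 128-bit GID; not the kernel's union ib_gid. */
struct gid_sketch {
	uint8_t raw[16];
};

/* Assumed encoding: IPv4-mapped IPv6 address.
 * Bytes 0..9 are zero, bytes 10..11 are 0xff, bytes 12..15 carry the
 * IPv4 address in network byte order. */
static void ipv4_to_gid_sketch(const uint8_t ipv4[4], struct gid_sketch *gid)
{
	assert(ipv4 && gid);
	memset(gid->raw, 0, sizeof(gid->raw));
	gid->raw[10] = 0xff;
	gid->raw[11] = 0xff;
	memcpy(&gid->raw[12], ipv4, 4);
}
```

An IPv6 address already has the same width as a GID, so under this assumption
it could be copied over byte for byte.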
    
The RDMA Verbs API is syntactically unmodified. When referring to RDMAoE ports,
address handles are required to contain GIDs and the L2 address fields in the
API are ignored. The Ethernet L2 information is then obtained by the
vendor-specific driver (both in kernel- and user-space) while modifying QPs to
RTR and creating address handles.
    
In order to maintain application compatibility, RDMAoE implements a SA_Query
API that locally returns path records with the corresponding GIDs and the other
relevant parameters. Consequently, any CMA or native Verbs application, in
kernel or user-space, that uses path queries to obtain its address information,
will run transparently over RDMAoE with no changes. We have successfully tested
MPI, SDP, RDS, and native Verbs applications over RDMAoE without *any* changes.
    
In the mlx4 driver stack, mlx4_en must be loaded and the corresponding eth
Ethernet (with support for RDMAoE) or IB, as was already the case.

Following is a series of 9 patches based on version 2.6.30 of the
Linux kernel.

Signed-off-by: Eli Cohen <eli at mellanox.co.il>
    

 drivers/infiniband/core/addr.c            |   20 ++-
 drivers/infiniband/core/agent.c           |   16 +-
 drivers/infiniband/core/cma.c             |   39 ++++-
 drivers/infiniband/core/mad.c             |   48 ++++--
 drivers/infiniband/core/multicast.c       |  153 ++++++++++++++--
 drivers/infiniband/core/sa_query.c        |  167 ++++++++++++++----
 drivers/infiniband/core/ud_header.c       |  111 ++++++++++++
 drivers/infiniband/core/uverbs.h          |    1 +
 drivers/infiniband/core/uverbs_cmd.c      |   34 ++++
 drivers/infiniband/core/uverbs_main.c     |    1 +
 drivers/infiniband/core/verbs.c           |   18 ++
 drivers/infiniband/hw/mlx4/ah.c           |  228 ++++++++++++++++++++----
 drivers/infiniband/hw/mlx4/main.c         |  276 ++++++++++++++++++++++++++---
 drivers/infiniband/hw/mlx4/mlx4_ib.h      |   30 +++-
 drivers/infiniband/hw/mlx4/qp.c           |  253 +++++++++++++++++++++-----
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    3 +
 drivers/net/mlx4/cmd.c                    |    6 +
 drivers/net/mlx4/en_main.c                |   15 ++-
 drivers/net/mlx4/en_port.c                |    4 +-
 drivers/net/mlx4/en_port.h                |    3 +-
 drivers/net/mlx4/intf.c                   |   20 ++
 drivers/net/mlx4/main.c                   |    6 +
 drivers/net/mlx4/mlx4.h                   |    1 +
 include/linux/mlx4/cmd.h                  |    1 +
 include/linux/mlx4/device.h               |   31 +++-
 include/linux/mlx4/driver.h               |   16 ++-
 include/linux/mlx4/qp.h                   |    8 +-
 include/rdma/ib_addr.h                    |   51 ++++++
 include/rdma/ib_pack.h                    |   26 +++
 include/rdma/ib_user_verbs.h              |   21 ++-
 include/rdma/ib_verbs.h                   |   22 +++
 31 files changed, 1419 insertions(+), 210 deletions(-)


