[ofa-general] [PATCHv5 0/10] RDMAoE support

Eli Cohen eli at dev.mellanox.co.il
Mon Aug 24 05:13:07 PDT 2009


Roland,

What about this series of patches? Would you like me to re-create them
over your xrc branch or would you rather take them before xrc?

On Wed, Aug 19, 2009 at 08:19:35PM +0300, Eli Cohen wrote:
> RDMA over Ethernet (RDMAoE) allows running the IB transport protocol using
> Ethernet frames, enabling the deployment of IB semantics on lossless Ethernet
> fabrics. RDMAoE packets are standard Ethernet frames with an IEEE assigned
> Ethertype, a GRH, unmodified IB transport headers and payload.  IB subnet
> management and SA services are not required for RDMAoE operation; Ethernet
> management practices are used instead. RDMAoE encodes IP addresses into its
> GIDs and resolves MAC addresses using the host IP stack. For multicast GIDs,
> standard IP to MAC mappings apply.
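
As a rough illustration of the "standard IP to MAC mappings" mentioned above for multicast, here is a minimal user-space sketch. It assumes the usual Ethernet mappings from RFC 2464 (IPv6) and RFC 1112 (IPv4). The function names are hypothetical and are not taken from the patches, which may implement this differently.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not from the patch set.
 * IPv6 multicast (RFC 2464): 33:33 followed by the low 32 bits
 * of the 128-bit address.
 */
static void ipv6_mc_to_mac(const uint8_t addr[16], uint8_t mac[6])
{
    mac[0] = 0x33;
    mac[1] = 0x33;
    mac[2] = addr[12];
    mac[3] = addr[13];
    mac[4] = addr[14];
    mac[5] = addr[15];
}

/*
 * IPv4 multicast (RFC 1112): 01:00:5e followed by the low 23 bits
 * of the group address (the top bit of the second octet is dropped).
 */
static void ipv4_mc_to_mac(const uint8_t addr[4], uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = addr[1] & 0x7f;
    mac[4] = addr[2];
    mac[5] = addr[3];
}
```

Under these rules the all-nodes group ff02::1 maps to 33:33:00:00:00:01 and 224.0.0.251 maps to 01:00:5e:00:00:fb.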
> 
> To support RDMAoE, a new transport protocol was added to the IB core. An RDMA
> device can have ports with different transports, which are identified by a port
> transport attribute.  The RDMA Verbs API is syntactically unmodified. When
> referring to RDMAoE ports, address handles are required to contain GIDs while
> LID fields are ignored. The Ethernet L2 information is subsequently obtained by
> the vendor-specific driver (both in kernel- and user-space) while modifying QPs
> to RTR and creating address handles.  As there is no SA in RDMAoE, the CMA code
> is modified to fill the necessary path record attributes locally before sending
> CM packets. Similarly, the CMA provides to the user the required address handle
> attributes when processing SIDR requests and joining multicast groups.
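
To make the address handle rule above concrete (GIDs required and LIDs ignored on RDMAoE ports), here is a small stand-alone sketch. All type and function names are hypothetical and do not mirror the structures in the patches. The rule shown for IB ports (a nonzero destination LID) is a simplifying assumption added only for contrast.

```c
#include <stdint.h>

/* Hypothetical per-port transport attribute. */
enum port_transport {
    PORT_TRANSPORT_IB,
    PORT_TRANSPORT_RDMAOE
};

/* Hypothetical, trimmed-down address handle attributes. */
struct ah_attr_sketch {
    int      has_grh;   /* nonzero if dgid is valid */
    uint16_t dlid;      /* destination LID, meaningful on IB ports only */
    uint8_t  dgid[16];  /* destination GID */
};

/* Returns 1 if the attributes are usable on a port of the given transport. */
static int ah_attr_usable(enum port_transport transport,
                          const struct ah_attr_sketch *attr)
{
    if (transport == PORT_TRANSPORT_RDMAOE)
        return attr->has_grh != 0;   /* GID is mandatory, dlid is ignored */

    /* Assumption for this sketch: an IB port needs a nonzero DLID. */
    return attr->dlid != 0;
}
```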
> 
> In this patch set, an RDMAoE port is currently assigned a single GID, encoding
> the IPv6 link-local address of the corresponding netdev; the CMA RDMAoE code
> temporarily uses IPv6 link-local addresses as GIDs instead of the IP address
> provided by the user, thereby supporting any IP address.
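
For readers unfamiliar with how a link-local address relates to the port's MAC, the following sketch builds a 16-byte GID from a MAC address. It assumes the netdev's link-local address was formed with the standard modified EUI-64 procedure (RFC 4291, Appendix A). The patches may simply read the address from the netdev instead, and the function name here is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fe80::/64 prefix plus a modified EUI-64
 * interface identifier derived from the 48-bit MAC.
 */
static void mac_to_link_local_gid(const uint8_t mac[6], uint8_t gid[16])
{
    memset(gid, 0, 16);
    gid[0]  = 0xfe;           /* link-local prefix fe80::/64 */
    gid[1]  = 0x80;
    gid[8]  = mac[0] ^ 0x02;  /* flip the universal/local bit */
    gid[9]  = mac[1];
    gid[10] = mac[2];
    gid[11] = 0xff;           /* ff:fe inserted in the middle */
    gid[12] = 0xfe;
    gid[13] = mac[3];
    gid[14] = mac[4];
    gid[15] = mac[5];
}
```

With this rule, MAC 00:02:c9:01:02:03 yields the GID fe80::202:c9ff:fe01:203.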
> 
> To enable RDMAoE with the mlx4 driver stack, both the mlx4_en and mlx4_ib
> drivers must be loaded, and the netdevice for the corresponding RDMAoE port
> must be running. Individual ports of a multi-port HCA can be independently
> configured as Ethernet (with support for RDMAoE) or IB, as is already the case.
> We have successfully tested MPI, SDP, RDS, and native Verbs applications over
> RDMAoE.
> 
> Following is a series of 10 patches based on version 2.6.30 of the Linux
> kernel. This new series reflects changes based on feedback from the community
> on the previous set of patches, and is tagged v5.
> 
> Changes from v4:
> 1. Added rdma_is_transport_supported() and used it to simplify conditionals
> throughout the code.
> 2. ib_register_mad_agent() for QP0 is only called for IB ports.
> 3. PATCH 5/10 changed from "Enable support for RDMAoE ports" to "Enable
> support only for IB ports".
> 4. MAD services from userspace currently not supported for RDMAoE ports.
> 5. Add kref to struct cma_multicast to aid in maintaining reference count on
> the object. This is to avoid freeing the object while the worker thread is
> still using it.
> 6. Return immediate error for invalid MTU when resolving an RDMAoE path.
> 7. Don't fail resolve path if rate is 0 since this value stands for
> IB_RATE_PORT_CURRENT.
> 8. In cma_rdmaoe_join_multicast(), fail immediately if mtu is zero.
> 9. Add ucma_copy_rdmaoe_route() instead of modifying ucma_copy_ib_route().
> 10. Bug fix: in PATCH 10/10, call flush_workqueue after unregistering netdev
> notifiers.
> 11. Multicast no longer uses the broadcast MAC.
> 12. No changes to patches 2, 7 and 8 from the v4 series.
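
Item 5 in the list above describes the usual reference-counting pattern: the object must outlive whichever of its users finishes last. The sketch below is a user-space analogue using C11 atomics rather than the kernel's kref_init()/kref_get()/kref_put(). The struct and function names are hypothetical and do not reproduce struct cma_multicast.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an object shared with a worker thread. */
struct mc_ref {
    atomic_int refcount;
    void (*release)(struct mc_ref *ref);
};

/* The creator holds the initial reference. */
static void mc_ref_init(struct mc_ref *ref, void (*release)(struct mc_ref *))
{
    atomic_init(&ref->refcount, 1);
    ref->release = release;
}

/* Taken before handing the object to another user, e.g. queued work. */
static void mc_ref_get(struct mc_ref *ref)
{
    atomic_fetch_add(&ref->refcount, 1);
}

/* Drops one reference; returns 1 if this call released the object. */
static int mc_ref_put(struct mc_ref *ref)
{
    if (atomic_fetch_sub(&ref->refcount, 1) == 1) {
        ref->release(ref);
        return 1;
    }
    return 0;
}

/* Demo release callback that only counts invocations. */
static int released_count;

static void count_release(struct mc_ref *ref)
{
    (void)ref;
    released_count++;
}
```

In this scheme the owner can drop its reference while work is still pending, and the release callback runs only when the worker drops the last one.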
> 
> Signed-off-by: Eli Cohen <eli at mellanox.co.il>
> ---
> 
>  b/drivers/infiniband/core/agent.c           |   38 ++-
>  b/drivers/infiniband/core/cm.c              |   25 +-
>  b/drivers/infiniband/core/cma.c             |   54 ++--
>  b/drivers/infiniband/core/mad.c             |   41 ++-
>  b/drivers/infiniband/core/multicast.c       |    4 
>  b/drivers/infiniband/core/sa_query.c        |   39 ++-
>  b/drivers/infiniband/core/ucm.c             |    8 
>  b/drivers/infiniband/core/ucma.c            |    2 
>  b/drivers/infiniband/core/ud_header.c       |  111 ++++++++++
>  b/drivers/infiniband/core/user_mad.c        |    6 
>  b/drivers/infiniband/core/uverbs.h          |    1 
>  b/drivers/infiniband/core/uverbs_cmd.c      |   32 ++
>  b/drivers/infiniband/core/uverbs_main.c     |    1 
>  b/drivers/infiniband/core/verbs.c           |   25 ++
>  b/drivers/infiniband/hw/mlx4/ah.c           |  187 +++++++++++++---
>  b/drivers/infiniband/hw/mlx4/mad.c          |   32 +-
>  b/drivers/infiniband/hw/mlx4/main.c         |  309 +++++++++++++++++++++++++---
>  b/drivers/infiniband/hw/mlx4/mlx4_ib.h      |   19 +
>  b/drivers/infiniband/hw/mlx4/qp.c           |  172 ++++++++++-----
>  b/drivers/infiniband/ulp/ipoib/ipoib_main.c |   12 -
>  b/drivers/net/mlx4/en_main.c                |   15 +
>  b/drivers/net/mlx4/en_port.c                |    4 
>  b/drivers/net/mlx4/en_port.h                |    3 
>  b/drivers/net/mlx4/fw.c                     |    3 
>  b/drivers/net/mlx4/intf.c                   |   20 +
>  b/drivers/net/mlx4/main.c                   |    6 
>  b/drivers/net/mlx4/mlx4.h                   |    1 
>  b/include/linux/mlx4/cmd.h                  |    1 
>  b/include/linux/mlx4/device.h               |   31 ++
>  b/include/linux/mlx4/driver.h               |   16 +
>  b/include/linux/mlx4/qp.h                   |    8 
>  b/include/rdma/ib_addr.h                    |   92 ++++++++
>  b/include/rdma/ib_pack.h                    |   26 ++
>  b/include/rdma/ib_user_verbs.h              |   21 +
>  b/include/rdma/ib_verbs.h                   |   11 
>  b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c   |    3 
>  b/net/sunrpc/xprtrdma/svc_rdma_transport.c  |    2 
>  drivers/infiniband/core/cm.c                |    5 
>  drivers/infiniband/core/cma.c               |  207 ++++++++++++++++++
>  drivers/infiniband/core/mad.c               |   37 ++-
>  drivers/infiniband/core/ucm.c               |   12 -
>  drivers/infiniband/core/ucma.c              |   31 ++
>  drivers/infiniband/core/user_mad.c          |   15 -
>  drivers/infiniband/core/verbs.c             |   10 
>  include/rdma/ib_verbs.h                     |   15 +
>  45 files changed, 1440 insertions(+), 273 deletions(-)