[ofa-general] Re: [PATCH 0/8 v3] RDMAoE support
Or Gerlitz
ogerlitz at voltaire.com
Wed Jul 15 01:29:11 PDT 2009
Liran Liss wrote:
> GRH is required for all packets; it keeps things simple and provides length information
yes, makes sense
> There is no reason to restrict RDMAoE to RDMACM apps: all existing IB applications can work over RDMAoE
Considering the fact that you've put away IPoIB and the non-existence of
user space SA queries, the only remaining potential user is SRP which
anyway uses sophisticated QP1 and SA services even before going to H.A
and multi-pathing - so I don't see anyone using SRP with RDMAoE. As for
MPIs which don't use rdma-cm, the reason for that has been the problem
or perception of SA scalability which doesn't exist over Ethernet.
So all-in-all (or maybe all-to-all...) the only remaining consumers are
those MPIs which will insist on exchanging <GID, MAC> pairs the way they
exchange <GID, LID> pairs today (note that with IB GRH isn't a must so
you should now go and patch all MPIs which don't exchange GIDs). These
guys will simply create Adress handles with the <GID, MAC> pairs, and as
such there's no need to have any layer below the rdma-cm and addr doing
MAC or GID resolution.
> we already have a great CM protocol that runs over QP1 and is a perfect fit for RDMAoE; we don't see any reason to change this
sounds good, lets see what the other reviewers think, so you didn't
change a bit in the CM?
> The rmdaoa "sa" code does more than just path resolution - it provides SL, MTU, rate, packet lifetime, and other params. The same thing is also true for multicast joins.
mmm, interesting, what is the criteria / logic to assign SL and rate?
looking in rdmaoe_sa.c I didn't find any sign to such code, please point
me to the function/s that implement this. This is before going to the
way you've implemented the multicast joins, which I'll get to later.
> In terms of the code, it is very simple: - move useful definitions that were previously in multicast.c/sa_query.c to header files. - implement only what is needed for rdmaoe in rdmaoe_sa.c. Perhaps we can make better code reuse (calling the SA's cmp_rec(), for example). We will look into it; thanks! I would just like to note that there is a tradeoff here
what ever, however, going your way (i.e claim that there's a need to
rdmaoe_sa.c, I still don't see why rdmaoe_sa.c is needed) I don't see
how it can be reviewed this since you made all the above changes in one
patch - if you want review, separate to at least two patches, the first
moves a way code, the second add the rdmaoe_sa.c.
BTW - same goes for the API changes, you should have one patch that
relates to kernel only changes, and another patch which relates to the
user space changes, and this patch should come in a series with changes
to libibverbs so people can review both sides of the kernel/user channel.
> Yes, we intend to use standard IPv4-mapped addresses (::0xffff<a.b.c.d>)
okay, any reason not to do this from day one? is there anyone in the IB
stack or in your firmware that assumes a specific GID would always be in
index 0?
> on the contrary, there is *nothing* in the architecture that is HW-specific. This is intentional --- SW rdmaoe will be straighforward, efficient (as is SW FCoE), and simple (this is rdma transport after all!) to implement over any nic. We already know of people that are looking into that.
I'm not convinced... e.g we'll get to this more when doing a detailed
review on the multicast thing. Also, I haven't seen any comment from
anyone on your API changes, can be nice if you act to make the sw rdmaoe
developers comment here.
Or.
More information about the general
mailing list