[ofa-general] How to establish IB communication more effectively?

Or Gerlitz or.gerlitz at gmail.com
Tue May 12 14:50:02 PDT 2009


Davis, Arlin R <arlin.r.davis at intel.com> wrote:
>>Just to make sure we're on the same page: both IPoIB and the RDMA-CM
>>use SA path queries (ipoib for the unicast arp reply, and rdma-cm for
>>rdma_resolve_route), going into details, things look like:

> I am running IPoIB connected so I assume there is no path query
> and I see no difference in IPoIB unconnected mode so I also assume
> it caches path records during ARP processing. Can someone confirm?

Arlin,

Both the datagram and the connected mode issue a path query (it's the
way IB works). The datagram mode uses the IB UD (Unreliable Datagram)
transport, and once the path is resolved it creates an IB AH (Address
Handle), which is used in conjunction with the UD QP. The connected
mode uses the IB RC (Reliable Connection) transport, so the path info
is used to establish its connection through the IB CM.

> ARP cache is also hit in all these cases so you can take ARP request/reply out.

I am not with you: by "ARP cache" I assume you refer to the networking
stack neighbour table, correct? So this cache already has the entries
because the IPoIB network was also used to spawn the job?

> However, with rdma_cm we actually have to pick up the ADDR_RESOLVED (arp)
> event before moving on to the rdma_resolve_route (path record), and then wait for
> ROUTE_RESOLVED event before moving on to the rdma_connect call, and then
> finally wait for ESTABLISHED. You start to get the picture of where my time goes?
> Not only do we have path record query delays we have a 3 step event
> processing (waiting/waking on each) just to get connected.

Yes, this sounds like a potentially big difference from the TCP case.
Let's see how many kernel --> user events we have in both methods:

rdma-cm active side
-----------------------
addr-resolved
route-resolved
established

rdma-cm passive side
--------------------------
connection-request
established

scm active side
------------------
connected

scm passive side
--------------------
connection request
connected

In the rdma-cm framework there are three kernel --> user
transitions/events for the active side and two for the passive side,
whereas in the scm framework there are two for the passive side but
only one for the active side. Also counting user --> kernel
transitions, on the rdma-cm active side there are three vs. only one
in the scm. This is probably where the difference comes from. I
believe it could be fairly easy to have the kernel rdma ucm module do
two successive calls (route resolve and connect) once the local
address is resolved, since at that point the user space consumer can
create their QP, etc.
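
To make the counting concrete, here is a rough Python tally of the
transitions listed above. It is only a sketch: the event names are the
ones from the lists in this mail, the rdma-cm call names are the
librdmacm calls that trigger them, "connect" for scm is a placeholder
for its single step, and the "fused" variant is a hypothetical model of
the kernel chaining route resolve and connect, not existing code.

```python
# Kernel --> user events per connection setup, as listed in this mail.
EVENTS = {
    ("rdma-cm", "active"): ["addr-resolved", "route-resolved", "established"],
    ("rdma-cm", "passive"): ["connection-request", "established"],
    ("scm", "active"): ["connected"],
    ("scm", "passive"): ["connection-request", "connected"],
}

# User --> kernel calls made by the active side.
# "connect" is a placeholder name for the single scm step (assumption).
ACTIVE_CALLS = {
    "rdma-cm": ["rdma_resolve_addr", "rdma_resolve_route", "rdma_connect"],
    "scm": ["connect"],
}


def fused_active_side():
    """Hypothetical rdma-cm active side where the kernel issues route
    resolve and connect back to back, so user space never sees the
    route-resolved event and makes one call instead of two."""
    events = [e for e in EVENTS[("rdma-cm", "active")] if e != "route-resolved"]
    calls = ["rdma_resolve_addr", "resolve_route_and_connect"]  # second name is made up
    return events, calls


def transition_counts():
    """Return {label: (kernel->user events, user->kernel calls)} for the active side."""
    counts = {
        name: (len(EVENTS[(name, "active")]), len(ACTIVE_CALLS[name]))
        for name in ACTIVE_CALLS
    }
    fused_events, fused_calls = fused_active_side()
    counts["rdma-cm (fused, hypothetical)"] = (len(fused_events), len(fused_calls))
    return counts


if __name__ == "__main__":
    for label, (events, calls) in transition_counts().items():
        print(f"{label}: {events} kernel->user, {calls} user->kernel")
```

Under these assumptions the fused variant would still need one more
round trip than scm on the active side, since the consumer has to see
addr-resolved before it can create its QP.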

> Not only do we have path record query delays

So we agree that it's path query --delays-- and that for one rank per
node it's the same number of path queries? (Sean)

Or.


