[ofa-general] How to establish IB communication more effectively?

Davis, Arlin R arlin.r.davis at intel.com
Tue May 12 14:23:37 PDT 2009


 
>Davis, Arlin R <arlin.r.davis at intel.com> wrote:
>> For a connection (socket connect, exchanging QP info, private data,
>> qp modify) using uDAPL socket cm versus rdma_cm I get:
>> socket_cm on 1Ge == ~900us
>> socket_cm on IPoIB (mlx4 ddr) == ~400us
>> rdma_cm on IB (mlx4 ddr) == ~2200us
>> As you can see, the path record queries via rdma_cm add a
>> substantial penalty.
>
>Hi Arlin,
>
>Just to make sure we're on the same page: both IPoIB and the RDMA-CM
>use SA path queries (ipoib for the unicast arp reply, and rdma-cm for
>rdma_resolve_route), going into details, things look like:

I am running IPoIB in connected mode, so I assume there is no path
query. I see no difference in IPoIB unconnected mode either, so I also
assume it caches path records during ARP processing. Can someone confirm?

The ARP cache is also hit in all of these cases, so you can take the
ARP request/reply out of the picture. However, with rdma_cm we have
to pick up the RDMA_CM_EVENT_ADDR_RESOLVED (ARP) event before
moving on to rdma_resolve_route (path record), then wait for the
RDMA_CM_EVENT_ROUTE_RESOLVED event before moving on to the
rdma_connect call, and finally wait for RDMA_CM_EVENT_ESTABLISHED.
Do you start to get the picture of where my time goes? Not only do
we have path record query delays, we also have three-step event
processing (waiting and waking on each event) just to get connected.
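
To make the three waits concrete, here is a minimal sketch in plain C
that models the client-side sequence as a small state machine. It does
not call librdmacm; the EV_* and STEP_* names and both functions are
hypothetical stand-ins for the RDMA_CM_EVENT_* events and rdma_* calls
named above, and it only counts how many times the client would have
to block before the connection is up.

```c
#include <assert.h>
#include <stddef.h>

/* Client-side connect steps, in the order described in the text. */
enum cm_step {
    STEP_RESOLVE_ADDR,   /* rdma_resolve_addr() issued, awaiting ADDR_RESOLVED   */
    STEP_RESOLVE_ROUTE,  /* rdma_resolve_route() issued, awaiting ROUTE_RESOLVED */
    STEP_CONNECT,        /* rdma_connect() issued, awaiting ESTABLISHED          */
    STEP_DONE,
    STEP_FAILED
};

/* Hypothetical local names mirroring the RDMA_CM_EVENT_* values. */
enum cm_event { EV_ADDR_RESOLVED, EV_ROUTE_RESOLVED, EV_ESTABLISHED, EV_OTHER };

/* Advance one step; any event other than the expected one is a failure. */
static enum cm_step cm_next_step(enum cm_step cur, enum cm_event ev)
{
    switch (cur) {
    case STEP_RESOLVE_ADDR:
        return ev == EV_ADDR_RESOLVED ? STEP_RESOLVE_ROUTE : STEP_FAILED;
    case STEP_RESOLVE_ROUTE:
        return ev == EV_ROUTE_RESOLVED ? STEP_CONNECT : STEP_FAILED;
    case STEP_CONNECT:
        return ev == EV_ESTABLISHED ? STEP_DONE : STEP_FAILED;
    default:
        return cur;  /* STEP_DONE and STEP_FAILED are terminal */
    }
}

/* Feed an event sequence through the model. Returns the number of
 * blocking waits consumed to reach STEP_DONE, or -1 if the sequence
 * fails or ends before the connection is established. */
static int cm_waits_to_connect(const enum cm_event *events, size_t n)
{
    enum cm_step step = STEP_RESOLVE_ADDR;
    int waits = 0;

    assert(events != NULL || n == 0);
    for (size_t i = 0; i < n && step != STEP_DONE && step != STEP_FAILED; i++) {
        step = cm_next_step(step, events[i]);
        waits++;  /* each event costs one wait/wake cycle */
    }
    return step == STEP_DONE ? waits : -1;
}
```

Feeding it the sequence ADDR_RESOLVED, ROUTE_RESOLVED, ESTABLISHED
returns 3, one wait/wake per event. The socket cm path has no
equivalent of the first two waits.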

My measurements are all taken on top of uDAPL, so everything else is
equal. I simply added some timers to dtest around the connect call and
the wait for the connection event:

start_timer
dat_ep_connect()
dat_evd_wait()
stop_timer
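
For anyone who wants to reproduce this, a rough sketch of that timer
pattern in C follows. It is not the actual dtest change: the helper
names are hypothetical, the callback stands in for the
dat_ep_connect() + dat_evd_wait() pair, and CLOCK_MONOTONIC is my
assumption for the clock source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type: wraps whatever connect-and-wait pair
 * is being measured and returns 0 on success. */
typedef int (*connect_fn)(void *ctx);

/* Elapsed microseconds between two timespec samples. */
static int64_t timespec_diff_us(struct timespec start, struct timespec end)
{
    int64_t ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
               + (int64_t)(end.tv_nsec - start.tv_nsec);
    return ns / 1000;
}

/* start_timer; connect + wait; stop_timer.
 * Returns -1 if the clock cannot be read, else the callback's result;
 * the elapsed time is stored in *elapsed_us. */
static int time_connect_us(connect_fn fn, void *ctx, int64_t *elapsed_us)
{
    struct timespec t0, t1;
    int rc;

    assert(fn != NULL && elapsed_us != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;
    rc = fn(ctx);                       /* connect and wait for the event */
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1;
    *elapsed_us = timespec_diff_us(t0, t1);
    return rc;
}

/* Placeholder that does no work, so the sketch runs without uDAPL. */
static int fake_connect_and_wait(void *ctx)
{
    (void)ctx;
    return 0;
}
```

Calling time_connect_us(fake_connect_and_wait, NULL, &us) exercises
the timing path; in a real test the callback would issue the connect
and block on the connection event.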
	
For example (client side):
		
eth0 socket_cm:  dtest -P ofa-v2-mlx4_0-1 -h cst-55-eth0 -t 
IPoIB socket_cm: dtest -P ofa-v2-mlx4_0-1 -h cst-55-ib0 -t
rdma_cm:         dtest -P ofa-v2-ib0 -h cst-55-ib0 -t


-arlin

