[ofa-general] IPoIB CM connectivity issue.

Alex Estrin alex.estrin at qlogic.com
Thu Oct 2 17:19:24 PDT 2008


Hello,

It seems there is a problem with the connection algorithm in IPoIB CM.

When the connection is initiated by the remote node (B), the OFED node (A)
accepts the connection and then sends its own connect request to the same
node (B) using a different RC QP!
Here is what I see on the wire:

A -> ARP REQ                     ->B
B -> CREQ                        ->A 
A -> CREP (local RC QPN1)        ->B 
B -> RTU                         ->A
B -> ARP REP                     ->A unicast packet delivered over RC QP
A -> CREQ (local RC QPN2)        ->B  !!!!!!!!!!!!!!!!!!!

How many RC queue pairs can OFED nodes use to connect to each other?

When the connection is initiated from the OFED node (A) FIRST, node (B)
accepts the request and the connection stays alive until the ARP table
entry for (B) expires and (A) sends a DREQ to (B).
Only one RC QP is used to communicate.

B -> ARP REQ                     ->A
A -> ARP REP                     ->B  !! (questionable, please see Note2 below)
A -> CREQ (local RC QPN1)        ->B 
B -> CREP                        ->A
A -> RTU                         ->B
....... 
.... followed by data delivered over the RC QP
....
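
To make the difference between the two traces concrete, here is a minimal
Python sketch that simply tallies the RC QPNs node A names in each trace.
It is only a bookkeeping illustration under stated assumptions, not IPoIB
code: the function and variable names are made up for this note, the QPN
labels are the placeholders used in the traces, and B's QPNs are left out
because the captures above do not list them.

```python
from collections import defaultdict

# Each entry: (sender, message, local RC QPN named in the trace, or None).
# Transcribed from the two traces above; QPN labels are placeholders.
TRACE_B_CONNECTS_FIRST = [
    ("A", "ARP REQ", None),
    ("B", "CREQ", None),
    ("A", "CREP", "QPN1"),
    ("B", "RTU", None),
    ("B", "ARP REP", None),   # delivered over the RC QP
    ("A", "CREQ", "QPN2"),    # second connect request from A
]

TRACE_A_CONNECTS_FIRST = [
    ("B", "ARP REQ", None),
    ("A", "ARP REP", None),
    ("A", "CREQ", "QPN1"),
    ("B", "CREP", None),
    ("A", "RTU", None),
]


def rc_qpns_by_sender(trace):
    """Collect the distinct RC QPNs each sender names in its CM messages."""
    qpns = defaultdict(set)
    for sender, _message, qpn in trace:
        if qpn is not None:
            qpns[sender].add(qpn)
    return {sender: sorted(found) for sender, found in qpns.items()}


if __name__ == "__main__":
    print(rc_qpns_by_sender(TRACE_B_CONNECTS_FIRST))  # {'A': ['QPN1', 'QPN2']}
    print(rc_qpns_by_sender(TRACE_A_CONNECTS_FIRST))  # {'A': ['QPN1']}
```

In the first trace A ends up naming two RC QPs toward B; in the second it
names only one.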

Note1:
Initial assumption: both hosts start with clean ARP tables.
Note2:
It looks like a violation of the RFC 4755 convention to send unicast data
over a connected QP.

Please let me know if I missed anything.

Thanks,
Alex.


