[ofa-general] Multiports single HCA uDAPL program problem
Davis, Arlin R
arlin.r.davis at intel.com
Mon Feb 2 10:01:32 PST 2009
>One more problem occurs when trying to establish one connection per
>rail, as illustrated in the diagram below.
>
>       node0                   node1
>rail0: psp0 <----------------> ep0 (port 0 on hca)
>rail1: psp1 <----------------> ep1 (port 1 on hca)
>
>rail0 connects first, and its connection is always stable and correct.
>However, rail1 sometimes connects properly and sometimes doesn't.
>Following is the error message:
>
>11836 Waiting for connect response
>11836 Error unexpected conn event : DAT_CONNECTION_EVENT_NON_PEER_REJECTED
>11836 Error connect_ep: DAT_ABORT
>
>The program establishes the connection for both rails in exactly the
>same way. What may have caused this?
rdma_cm is rejecting the connect request. Turn on warnings for more information:
export DAPL_DBG_TYPE=0x0003
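
For reference, 0x0003 is a bitmask. A minimal sketch of how it can be composed, assuming the bit assignments in uDAPL's dapl_debug.h are 0x0001 for errors and 0x0002 for warnings (verify against your installed headers; the names DBG_ERR and DBG_WARN are illustrative only and not part of uDAPL):

```shell
# Assumed bit values; check dapl_debug.h in your uDAPL release.
DBG_ERR=0x0001    # error messages (assumption)
DBG_WARN=0x0002   # warning messages (assumption)

# OR the bits together and format the result as a 4-digit hex mask.
DAPL_DBG_TYPE=$(printf '0x%04x' $(( $DBG_ERR | $DBG_WARN )))
export DAPL_DBG_TYPE

echo "DAPL_DBG_TYPE=$DAPL_DBG_TYPE"
```

Start the application from the same shell so that it inherits the variable. The additional uDAPL messages should then indicate why the connect request on rail1 was rejected.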
-arlin