[openib-general] Re: uDAPL again

Arlin Davis ardavis at ichips.intel.com
Wed Nov 2 14:08:21 PST 2005


Aniruddha Bohra wrote:

> Arlin Davis wrote:
>
>> Aniruddha Bohra wrote:
>>
>>> cq_object_wait: RET evd 0x8083ca0 ibv_cq 0x8083da0 ibv_ctx (nil) Success
>>>         >>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<
>>>         dapl_evd_dto_callback : CQE
>>>                 work_req_id 134771572
>>>                 status 12
>>>         >>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<
>>> DTO completion ERROR: 12: op 0xff
>>> disconnect(ep 0x8087110, conn 0x808a008, id 134774528 flags 0)
>>> destroy_cm_id: conn 0x808a008 id 134774528
>>> dapli_evd_post_event: Called with event # 4006
>>>
>>>
>>> Any ideas on how to even start debugging this?
>>
>> Are you using the uDAPL provider with socket CM (VERBS=openib_scm) or 
>> the default one that uses uCM and uAT? For the socket_CM version 
>> the timeout is set to 14 (~67ms) and the retries are set to 7, so the 
>> receiving node would have to be delayed beyond ~469ms to get this 
>> failure. For the default uCM/uAT version the retries are set to 7 and 
>> the timeout is set to pktlifetime+1, so you would have to look at the 
>> path record for the timeout value for the connection.
>>
> I am using the default one. Actually, even the dapl_ep_connect() takes 
> a long time.

How long does it typically take to process your dapl_ep_connect? Your 
time is most likely being spent resolving the remote IP address to a GID 
and then resolving the path record. Both require SA queries.

> I am not sure, but aren't uCM and uAT simply for connection establishment?
>
Yes, but they also set up many of the transfer attributes of the 
connected QP. The uCM/uAT version uses path records from the SA query, 
but the socket_CM version just builds them by hand, similar to the way 
ibv_rc_pingpong does. You would have to look at 
pathrecord->pktlifetime to see the actual timeout value being used.
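
For reference, here is a rough sketch of the arithmetic behind the 
numbers above. It assumes the usual IB encoding of the local ACK 
timeout, 4.096 usec * 2^value, with 0 meaning no timeout. The function 
names are made up for illustration and are not part of uDAPL or 
libibverbs, and the "retries * timeout" window is only the estimate 
used in this mail, not an exact model of what the HCA does.

```c
#include <assert.h>
#include <stdint.h>

/* Decode an encoded local ACK timeout into nanoseconds.
 * Assumed encoding: 4.096 usec * 2^encoded (4096 ns << encoded).
 * The field is 5 bits wide; 0 means "no timeout" and is returned as 0. */
uint64_t ack_timeout_ns(unsigned encoded)
{
    assert(encoded <= 31);
    if (encoded == 0)
        return 0;
    return (uint64_t)4096 << encoded;
}

/* Rough delay window before the sender gives up, estimated as
 * retries * timeout, which is how the ~469ms figure was derived. */
uint64_t retry_window_ns(unsigned encoded, unsigned retries)
{
    return ack_timeout_ns(encoded) * retries;
}
```

With the socket_CM settings, ack_timeout_ns(14) is 67108864 ns (~67ms) 
and retry_window_ns(14, 7) is 469762048 ns (~469ms). For the uCM/uAT 
version you would pass pktlifetime + 1 as the encoded value instead.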

>
>> Can you successfully run the IB verbs ibv_rc_pingpong test suite?  
>
>
> Between the two OpenIB nodes, I can run the ibv_rc_pingpong.

I would suggest that you try the socket CM version and see if you get 
different results. Just build with "make VERBS=openib_scm".

-arlin



