[openib-general] [PATCH] Re: uDAPL again

Arlin Davis ardavis at ichips.intel.com
Thu Nov 3 09:51:14 PST 2005


Arlin Davis wrote:

> Aniruddha Bohra wrote:
>
>> I am not sure, but aren't uCM and uAT simply for connection 
>> establishment?
>>
> Yes, but they also set up many of the transfer attributes of the 
> connected QP. The uCM/uAT version uses path_records from the SA query 
> but the socket_CM version just builds them by hand similar to the way 
> ibv_rc_pingpong does. You would have to look at the 
> pathrecord->pktlifetime to see the actual timeout value being used.
>
OK, I added some debug output, and the path record returned from uAT 
looks suspect. Here are the results from tuAT and opensm running on 
my cluster. The path record pktlife is 0 (uCM adds 1), so the ACK timeout 
value for this connection will be very short.

 path_comp_handler: ctxt 0x525fa0, req_id 90 rec_num 1
 path_comp_handler: SRC GID subnet fe80000000000000 id 0002c9020000409d
 path_comp_handler: DST GID subnet fe80000000000000 id 0002c90200004071
 path_comp_handler: slid 5 dlid 2 mtu 120203(2) pktlife 0(0)   <<<  ???
 path_comp_handler: hops 0 npaths 0 pkey ffff tclass 0 rate 0(0)   <<<  ???

Hal, can you take a look at uAT and see if the copy to user space is 
working correctly?

Aniruddha, can you apply the following patch and send us the output from 
your run?

-arlin

Signed-off-by: Arlin Davis <ardavis at ichips.intel.com>

Index: dapl/openib/dapl_ib_cm.c
===================================================================
--- dapl/openib/dapl_ib_cm.c    (revision 3951)
+++ dapl/openib/dapl_ib_cm.c    (working copy)
@@ -136,14 +136,27 @@

        dapl_dbg_log(DAPL_DBG_TYPE_CM,
                " path_comp_handler: SRC GID subnet %016llx id %016llx\n",
-               (unsigned long long)cpu_to_be64(conn->dapl_rt.sgid.global.subnet_prefix),
-               (unsigned long long)cpu_to_be64(conn->dapl_rt.sgid.global.interface_id) );
+               (unsigned long long)cpu_to_be64(conn->dapl_path.sgid.global.subnet_prefix),
+               (unsigned long long)cpu_to_be64(conn->dapl_path.sgid.global.interface_id) );

        dapl_dbg_log(DAPL_DBG_TYPE_CM,
                " path_comp_handler: DST GID subnet %016llx id %016llx\n",
-               (unsigned long long)cpu_to_be64(conn->dapl_rt.dgid.global.subnet_prefix),
-               (unsigned long long)cpu_to_be64(conn->dapl_rt.dgid.global.interface_id) );
+               (unsigned long long)cpu_to_be64(conn->dapl_path.dgid.global.subnet_prefix),
+               (unsigned long long)cpu_to_be64(conn->dapl_path.dgid.global.interface_id) );

+       dapl_dbg_log(DAPL_DBG_TYPE_CM,
+               " path_comp_handler: slid %x dlid %x mtu %x(%x) pktlife %x(%x)\n",
+               ntohs(conn->dapl_path.slid), ntohs(conn->dapl_path.dlid),
+               conn->dapl_path.mtu, conn->dapl_path.mtu_selector,
+               conn->dapl_path.packet_life_time,
+               conn->dapl_path.packet_life_time_selector );
+
+       dapl_dbg_log(DAPL_DBG_TYPE_CM,
+               " path_comp_handler: hops %x npaths %x pkey %x tclass %x rate %x(%x)\n",
+               conn->dapl_path.hop_limit, conn->dapl_path.numb_path,
+               conn->dapl_path.pkey, conn->dapl_path.traffic_class,
+               conn->dapl_path.rate, conn->dapl_path.rate_selector);
+
        if (rec_num <= 0) {
                dapl_dbg_log(DAPL_DBG_TYPE_CM,
                             " path_comp_handler: ERR %d retry %d\n",





