uDAPL Problem : [Was Re: [openib-general] OpenSM crash with today's trunk]

Arlin Davis ardavis at ichips.intel.com
Fri Oct 28 16:07:56 PDT 2005


Aniruddha Bohra wrote:

>
> Now, I have a problem with udapl :
>
> The following is a code snippet from :
> dapl_ib_dto.h
>
> for (i = 0; i < segments; i++ ) {
>                if ( !local_iov[i].segment_length )
>                        continue;
>
>                ds_array_p->addr  = (uint64_t) local_iov[i].virtual_address;
>                ds_array_p->length = local_iov[i].segment_length;
>                ds_array_p->lkey  = local_iov[i].lmr_context;
>
>                dapl_dbg_log (  DAPL_DBG_TYPE_EP,
>                                " post_snd: lkey 0x%x va %p len %d \n",
>                                ds_array_p->lkey, ds_array_p->addr,
>                                ds_array_p->length );
>
>                total_len += ds_array_p->length;
>                wr.num_sge++;
>                ds_array_p++;
>        }
>
> The following is the relevant part of the log with DAPL_DBG_TYPE=0xffff
>
> dapl_ep_post_send (0x8087110, 2, 0x80f9910, %P, b5f395bc)
> post_snd: ep 0x8087110 op 2 ck 0x8087374 sgs 2 l_iov 0x80f9910 r_iov 0xbfc29060 f 0
> post_snd: ep 0x8087110 cookie 0x8087374 segs 2 l_iov 0x80f9910
> post_snd: lkey 0x10de003b va 0xb5f3976c len 0
> post_snd: lkey 0x10de003b va 0xb5f39924 len 0
>                                          ^^^^^
>
> From the above loop, how is this possible :
> If local_iov[i].segment_length == 0, it should not be printed. And
> if the assignment is successful, len must not be 0.
>
> Any ideas? Of course following this, the ep is disconnected in the 
> next step :(

The local_iov (LMR) length is 64 bits and the ibv_sge (ds_array) length is
32 bits, so it truncates.
Sounds like you set up a transfer greater than 4GB-1?
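
To make the truncation concrete, here is a minimal sketch in plain C. It
does not use the uDAPL or verbs headers; sge_length_from() and
truncation_demo() are hypothetical names that only mimic assigning a
64-bit segment length to a 32-bit length field. Note that the narrowed
value is exactly 0 only when the 64-bit length is a multiple of 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for "ds_array_p->length = segment_length":
 * a 64-bit value stored into a 32-bit field keeps only the low 32 bits. */
static uint32_t sge_length_from(uint64_t segment_length)
{
    return (uint32_t)segment_length;
}

/* Self-check showing which lengths survive the narrowing. */
static void truncation_demo(void)
{
    const uint64_t four_gb = UINT64_C(1) << 32;

    assert(sge_length_from(4096) == 4096);        /* fits, unchanged */
    assert(sge_length_from(four_gb) == 0);        /* multiple of 2^32 becomes 0 */
    assert(sge_length_from(four_gb + 16) == 16);  /* high bits silently dropped */
}
```

The non-zero test in the loop runs on the 64-bit value, so a segment can
pass that test and still end up with a 32-bit length of 0.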

 If you query the device via uDAPL you will see the max limits (2GB):

 query_hca: (a0.0) ep 64512 ep_q 65535 evd 65408 evd_q 131071
 query_hca: msg 2147483648 rdma 2147483648 iov 59 lmr 131056 rmr 0
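
A caller could guard against this before posting. The following is only a
sketch under assumptions: total_length_ok() and MAX_MSG_SIZE are
hypothetical names, the limit is hardcoded from the "msg" value in the
query_hca output above rather than queried from the device, and the
segment lengths are passed as a plain array instead of a real local_iov.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limit, copied from the "msg 2147483648" field shown above. */
#define MAX_MSG_SIZE UINT64_C(2147483648)

/* Returns 1 if the summed segment lengths stay within max_size, else 0.
 * The comparison is written so the running total cannot overflow. */
static int total_length_ok(const uint64_t *seg_len, int segments,
                           uint64_t max_size)
{
    uint64_t total = 0;

    for (int i = 0; i < segments; i++) {
        if (seg_len[i] > max_size - total)
            return 0;               /* this segment would exceed the limit */
        total += seg_len[i];
    }
    return 1;
}

/* Self-check with two small segments and one oversized segment. */
static void limit_demo(void)
{
    const uint64_t small[2] = { 4096, 8192 };
    const uint64_t huge[1]  = { UINT64_C(1) << 32 };

    assert(total_length_ok(small, 2, MAX_MSG_SIZE) == 1);
    assert(total_length_ok(huge, 1, MAX_MSG_SIZE) == 0);
}
```

Rejecting the request up front gives the caller an error at post time
instead of a truncated length further down.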

-arlin

>
> Also a minor patch, you can see that %P is printed as %P and not used as
> a format character.
>
> Index: common/dapl_ep_post_rdma_write.c
> ===================================================================
> --- common/dapl_ep_post_rdma_write.c    (revision 3892)
> +++ common/dapl_ep_post_rdma_write.c    (working copy)
> @@ -78,7 +78,7 @@
>     DAT_RETURN         dat_status;
>
>     dapl_dbg_log (DAPL_DBG_TYPE_API,
> -                 "dapl_ep_post_rdma_write (%p, %d, %p, %P, %p, %x)\n",
> +                 "dapl_ep_post_rdma_write (%p, %d, %p, %p, %p, %x)\n",
>                  ep_handle,
>                  num_segments,
>                  local_iov,
> Index: common/dapl_ep_post_send.c
> ===================================================================
> --- common/dapl_ep_post_send.c  (revision 3892)
> +++ common/dapl_ep_post_send.c  (working copy)
> @@ -75,7 +75,7 @@
>     DAT_RETURN         dat_status;
>
>     dapl_dbg_log (DAPL_DBG_TYPE_API,
> -                 "dapl_ep_post_send (%p, %d, %p, %P, %x)\n",
> +                 "dapl_ep_post_send (%p, %d, %p, %p, %x)\n",
>                  ep_handle,
>                  num_segments,
>                  local_iov,
> Index: common/dapl_srq_post_recv.c
> ===================================================================
> --- common/dapl_srq_post_recv.c (revision 3892)
> +++ common/dapl_srq_post_recv.c (working copy)
> @@ -79,7 +79,7 @@
>     DAT_RETURN         dat_status;
>
>     dapl_dbg_log (DAPL_DBG_TYPE_API,
> -                 "dapl_srq_post_recv (%p, %d, %p, %P)\n",
> +                 "dapl_srq_post_recv (%p, %d, %p, %p)\n",
>                  srq_handle,
>                  num_segments,
>                  local_iov,
> Index: common/dapl_ep_post_recv.c
> ===================================================================
> --- common/dapl_ep_post_recv.c  (revision 3892)
> +++ common/dapl_ep_post_recv.c  (working copy)
> @@ -79,7 +79,7 @@
>     DAT_RETURN         dat_status;
>
>     dapl_dbg_log (DAPL_DBG_TYPE_API,
> -                 "dapl_ep_post_recv (%p, %d, %p, %P, %x)\n",
> +                 "dapl_ep_post_recv (%p, %d, %p, %p, %x)\n",
>                  ep_handle,
>                  num_segments,
>                  local_iov,
>
> Thanks
> Aniruddha