uDAPL Problem : [WasRe: [openib-general] OpenSM crash with today's trunk
Arlin Davis
ardavis at ichips.intel.com
Fri Oct 28 16:07:56 PDT 2005
Aniruddha Bohra wrote:
>
> Now, I have a problem with udapl :
>
> The following is a code snippet from :
> dapl_ib_dto.h
>
> for (i = 0; i < segments; i++ ) {
> if ( !local_iov[i].segment_length )
> continue;
>
> ds_array_p->addr = (uint64_t)
> local_iov[i].virtual_address;
> ds_array_p->length = local_iov[i].segment_length;
> ds_array_p->lkey = local_iov[i].lmr_context;
>
> dapl_dbg_log ( DAPL_DBG_TYPE_EP,
> " post_snd: lkey 0x%x va %p len %d \n",
> ds_array_p->lkey, ds_array_p->addr,
> ds_array_p->length );
>
> total_len += ds_array_p->length;
> wr.num_sge++;
> ds_array_p++;
> }
>
> The following is the relevant part of the log with DAPL_DBG_TYPE=0xffff
>
> dapl_ep_post_send (0x8087110, 2, 0x80f9910, %P, b5f395bc)
> post_snd: ep 0x8087110 op 2 ck 0x8087374 sgs 2 l_iov 0x80f9910 r_iov
> 0xbfc29060 f 0
> post_snd: ep 0x8087110 cookie 0x8087374 segs 2 l_iov 0x80f9910
> post_snd: lkey 0x10de003b va 0xb5f3976c len 0
> post_snd: lkey 0x10de003b va 0xb5f39924 len 0
>
> ^^^^^^^^
>
> From the above loop, how is this possible :
> If local_iov[i].segment_length == 0, it should not be printed. And
> if the assignment is successful, len must not be 0.
>
> Any ideas? Of course following this, the ep is disconnected in the
> next step :(
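
One thing worth ruling out when reading that log: the quoted dapl_dbg_log call passes ds_array_p->addr, which the snippet assigns as a uint64_t, to a %p conversion. %p expects a void *, so if the build is 32-bit (the addresses in the log look 32-bit, but that is an assumption) the mismatch is undefined behavior and the "len" column may not show the real length. A minimal sketch of width-matched logging follows; struct sge_like and format_sge are hypothetical names for illustration, not part of uDAPL or libibverbs.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the three fields the quoted loop fills in. */
struct sge_like {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Formats one entry using conversions that match each field's width.
 * Returns whatever snprintf returns (characters that would be written). */
int format_sge(char *buf, size_t size, const struct sge_like *sge)
{
    assert(buf != NULL && sge != NULL);
    return snprintf(buf, size,
                    " post_snd: lkey 0x%" PRIx32 " va 0x%" PRIx64 " len %" PRIu32,
                    sge->lkey, sge->addr, sge->length);
}
```

With the PRI* macros from <inttypes.h>, each argument is read with its own width, so the printed length is the value actually stored in the 32-bit field regardless of pointer size.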
The local_iov (LMR) length is 64 bits and the ibv_sge (ds_array) length is
32 bits, so it truncates.
Sounds like you set up a transfer greater than 4GB-1?
If you query the device via uDAPL you will see the max limits (2GB):
query_hca: (a0.0) ep 64512 ep_q 65535 evd 65408 evd_q 131071
query_hca: msg 2147483648 rdma 2147483648 iov 59 lmr 131056 rmr 0
-arlin
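
The truncation described in the reply can be illustrated with a small sketch. It assumes only what the reply states (a 64-bit source length, a 32-bit destination length, and the "msg 2147483648" limit from the query_hca output); narrow_length, length_is_postable and check_examples are hypothetical helpers, not uDAPL or verbs functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: narrows a 64-bit segment length the way an
 * assignment into a 32-bit SGE length field would. */
uint32_t narrow_length(uint64_t segment_length)
{
    return (uint32_t)segment_length;   /* only the low 32 bits survive */
}

/* Hypothetical pre-post check: 1 if the length fits in 32 bits and does
 * not exceed max_msg, 0 otherwise. */
int length_is_postable(uint64_t segment_length, uint64_t max_msg)
{
    return segment_length <= UINT32_MAX && segment_length <= max_msg;
}

/* Expected results for a few lengths. */
void check_examples(void)
{
    const uint64_t max_msg = 2147483648ULL;  /* "msg" value in the query_hca output */

    assert(narrow_length(4096ULL) == 4096u);                 /* small lengths are unchanged */
    assert(narrow_length(1ULL << 32) == 0u);                 /* exactly 4 GiB becomes 0 */
    assert(narrow_length((1ULL << 32) + 4096ULL) == 4096u);  /* high bits are dropped */
    assert(length_is_postable(4096ULL, max_msg));
    assert(!length_is_postable(1ULL << 32, max_msg));
}
```

Note that a narrowed length comes out as exactly 0 only when the 64-bit value is a multiple of 4 GiB; other oversized lengths would show up as a smaller non-zero value. Checking the length against the queried limit before posting avoids both cases.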
>
> Also a minor patch, you can see that %P is printed as %P and not used as
> a format character.
>
> Index: common/dapl_ep_post_rdma_write.c
> ===================================================================
> --- common/dapl_ep_post_rdma_write.c (revision 3892)
> +++ common/dapl_ep_post_rdma_write.c (working copy)
> @@ -78,7 +78,7 @@
> DAT_RETURN dat_status;
>
> dapl_dbg_log (DAPL_DBG_TYPE_API,
> - "dapl_ep_post_rdma_write (%p, %d, %p, %P, %p, %x)\n",
> + "dapl_ep_post_rdma_write (%p, %d, %p, %p, %p, %x)\n",
> ep_handle,
> num_segments,
> local_iov,
> Index: common/dapl_ep_post_send.c
> ===================================================================
> --- common/dapl_ep_post_send.c (revision 3892)
> +++ common/dapl_ep_post_send.c (working copy)
> @@ -75,7 +75,7 @@
> DAT_RETURN dat_status;
>
> dapl_dbg_log (DAPL_DBG_TYPE_API,
> - "dapl_ep_post_send (%p, %d, %p, %P, %x)\n",
> + "dapl_ep_post_send (%p, %d, %p, %p, %x)\n",
> ep_handle,
> num_segments,
> local_iov,
> Index: common/dapl_srq_post_recv.c
> ===================================================================
> --- common/dapl_srq_post_recv.c (revision 3892)
> +++ common/dapl_srq_post_recv.c (working copy)
> @@ -79,7 +79,7 @@
> DAT_RETURN dat_status;
>
> dapl_dbg_log (DAPL_DBG_TYPE_API,
> - "dapl_srq_post_recv (%p, %d, %p, %P)\n",
> + "dapl_srq_post_recv (%p, %d, %p, %p)\n",
> srq_handle,
> num_segments,
> local_iov,
> Index: common/dapl_ep_post_recv.c
> ===================================================================
> --- common/dapl_ep_post_recv.c (revision 3892)
> +++ common/dapl_ep_post_recv.c (working copy)
> @@ -79,7 +79,7 @@
> DAT_RETURN dat_status;
>
> dapl_dbg_log (DAPL_DBG_TYPE_API,
> - "dapl_ep_post_recv (%p, %d, %p, %P, %x)\n",
> + "dapl_ep_post_recv (%p, %d, %p, %p, %x)\n",
> ep_handle,
> num_segments,
> local_iov,
>
> Thanks
> Aniruddha