[openib-general] kdapltest regression? failing now...

James Lentini jlentini at netapp.com
Thu May 19 18:43:44 PDT 2005


I committed a fix for this in revision 2420. The problem turned out to 
be that DAPL wasn't initializing the max_inline_data value of the QP 
attr's cap structure.
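
For illustration, here is a minimal userspace sketch of this class of bug. It is not the DAPL or mthca code: the struct names, function names, and the inline-data limit check are all hypothetical stand-ins, assumed only to show how one unset capability field can make an otherwise valid request fail with -EINVAL (-22).

```c
#include <string.h>

/* Hypothetical stand-in for the QP capability fields (not the kernel struct). */
struct qp_cap_model {
    unsigned int max_send_sge;
    unsigned int max_recv_sge;
    unsigned int max_inline_data;
};

/* Hypothetical device limits, assumed for this sketch only. */
struct dev_limits_model {
    unsigned int max_sg;
    unsigned int max_inline;
};

#define MODEL_EINVAL 22

/* Reject any request whose capabilities exceed the device limits. */
static int model_check_caps(const struct qp_cap_model *cap,
                            const struct dev_limits_model *lim)
{
    if (cap->max_send_sge > lim->max_sg ||
        cap->max_recv_sge > lim->max_sg ||
        cap->max_inline_data > lim->max_inline)
        return -MODEL_EINVAL;
    return 0;
}

/* Buggy pattern: sets the SGE fields but never touches max_inline_data,
 * so that field keeps whatever bytes were already in memory. */
static void model_fill_caps_buggy(struct qp_cap_model *cap, unsigned int sge)
{
    cap->max_send_sge = sge;
    cap->max_recv_sge = sge;
}

/* Fixed pattern: zero the whole structure first, then set what is needed. */
static void model_fill_caps_fixed(struct qp_cap_model *cap, unsigned int sge)
{
    memset(cap, 0, sizeof(*cap));
    cap->max_send_sge = sge;
    cap->max_recv_sge = sge;
}
```

To see the difference, pre-fill a struct qp_cap_model with 0xff bytes (simulating stack garbage) before calling each fill function. With limits of 4 SGEs and 64 inline bytes, the buggy version is rejected and the fixed version passes.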

Let me know if you still have any problems.

There is a patch in the pipeline that will remove the IBAT printout 
you mentioned.

james

On Thu, 19 May 2005, James Lentini wrote:

>
> I think I figured this out. DAPL was assuming a particular maximum 
> scatter-gather list size. I'm going to change it to query for this value. 
> Hopefully I'll have a fix shortly.
>
> james
>
> On Thu, 19 May 2005, James Lentini wrote:
>
>> 
>> For what it's worth, this is the check that we are "failing":
>> 
>> qp->sq.max_gs > dev->limits.max_sg
>> 
>> ( qp->sq.max_gs + 2 > dev->limits.max_sg is also true but
>>  qp->transport == MLX is not).
>> 
>> On Thu, 19 May 2005, James Lentini wrote:
>> 
>>> 
>>> I'm looking into this, Tom.
>>> 
>>> The following code was added to hw/mthca/mthca_qp.c on Friday
>>> (starting on line 1233):
>>> 
>>> 
>>> if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
>>>     qp->sq.max_gs > dev->limits.max_sg ||
>>>     qp->rq.max_gs > dev->limits.max_sg)
>>>         return -EINVAL;
>>> 
>>> If anyone knows what we have set incorrectly, please let me know.
>>> 
>>> Thanks,
>>> james
>>> 
>>> On Thu, 19 May 2005, Tom Duffy wrote:
>>> 
>>> tduffy> I am not sure when this started, but after updating to top of trunk*, I
>>> tduffy> can no longer get kdapltest to work properly.  Both ipoib and sdp are
>>> tduffy> working.
>>> tduffy>
>>> tduffy> Both server and client are returning an error: DAT_INVALID_HANDLE.  This
>>> tduffy> is coming from ib_create_qp().  With debugging turned on:
>>> tduffy>
>>> tduffy> [root@flopteron2 ~]# ./kdapltest -T S -D mthca0a -d
>>> tduffy> kDAPL: dapl_ia_open (mthca0a, 8, ffff81000b806308, ffff81000b8062d8)
>>> tduffy> kDAPL: dapl_ia_open () returns 0x0
>>> tduffy> kDAPL: dapl_pz_create (ffff81001ba165c8, ffff81000b8062e0)
>>> tduffy> kDAPL: dapl_evd_kcreate (ffff81001ba165c8, 8, 1, upcall, 0x20, ffff81000b8062e8)
>>> tduffy> kDAPL: dapl_evd_kcreate (ffff81001ba165c8, 8, 1, upcall, 0xa0, ffff81000b8062f0)
>>> tduffy> kDAPL: dapl_evd_kcreate (ffff81001ba165c8, 8, 1, upcall, 0x10, ffff81000b806300)
>>> tduffy> kDAPL: dapl_evd_kcreate (ffff81001ba165c8, 8, 1, upcall, 0x40, ffff81000b8062f8)
>>> tduffy> kDAPL: dapl_ep_create (ffff81001ba165c8, ffff81001b9442c8, ffff81001ba164b0, ffff81001ba166e0, ffff81001ba22050, 0000000000000000, ffff81000b806318)
>>> tduffy> kDAPL:  dapl_ib_qp_alloc: ib_create_qp failed = -22
>>> tduffy> kDAPL: dapl_evd_free (ffff81001ba22050)
>>> tduffy> kDAPL: dapl_evd_free () returns 0x0
>>> tduffy> kDAPL: dapl_evd_free (ffff81001ba22168)
>>> tduffy> kDAPL: dapl_evd_free () returns 0x0
>>> tduffy> kDAPL: dapl_evd_free (ffff81001ba166e0)
>>> tduffy> kDAPL: dapl_evd_free () returns 0x0
>>> tduffy> kDAPL: dapl_evd_free (ffff81001ba164b0)
>>> tduffy> kDAPL: dapl_evd_free () returns 0x0
>>> tduffy> kDAPL: dapl_pz_free (ffff81001b9442c8)
>>> tduffy> kDAPL: dapl_ia_query (ffff81001ba165c8, 0000000000000000, 0000000000000000, ffff81001bba7b28)
>>> tduffy> kDAPL: dapl_ia_query () returns 0x0
>>> tduffy> kDAPL: dapl_ia_close (ffff81001ba165c8, 1)
>>> tduffy> kDAPL: dapl_evd_free (ffff81001ba167f8)
>>> tduffy> kDAPL: dapl_evd_free () returns 0x0
>>> tduffy> Server_Cmd.debug:       1
>>> tduffy> Server_Cmd.dapl_name: mthca0a
>>> tduffy> DT_cs_Server: IA mthca0a opened
>>> tduffy> DT_cs_Server: PZ created
>>> tduffy> DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
>>> tduffy> DT_cs_Server: Waiting for clients to all go away...
>>> tduffy> DT_cs_Server: Cleaning up ...
>>> tduffy> DT_cs_Server: IA mthca0a closed
>>> tduffy> DT_cs_Server (mthca0a):  Exiting.
>>> tduffy> TEST INSTANCE 0
>>> tduffy> TEST return code = 1
>>> tduffy>
>>> tduffy> Also, the ib_at module prints this out now when you ping (after running
>>> tduffy> kdapltest)...
>>> tduffy>
>>> tduffy> ib_at: ib_at_arp_work: Process IB ARP ip <192.168.0.26> gid <0xfe800000000000000002c9010a99e031>
>>> tduffy>
>>> tduffy> -tduffy
>>> tduffy>
>>> tduffy> * running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414, 2 machines back-2-back
>>> tduffy>
>>> 
>> 
>


