[ofw][mlx4] SRQ attached QP issue

Smith, Stan stan.smith at intel.com
Mon Oct 6 14:21:00 PDT 2008


Alex Estrin wrote:
> Hi Stan,
>
> SRQ patch for mlx4 driver does not introduces a new feature, just
> fixes a bug.
>
> Thanks,
> Alex.

OK, thanks for the clarification. Please commit to the WOF2-0 branch.

>
>> -----Original Message-----
>> From: Smith, Stan [mailto:stan.smith at intel.com]
>> Sent: Sunday, October 05, 2008 5:41 PM
>> To: Leonid Keller; Alex Estrin
>> Cc: ofw at lists.openfabrics.org
>> Subject: RE: [ofw][mlx4] SRQ attached QP issue
>>
>> Leonid Keller wrote:
>>>  Hi Stan,
>>> What is the policy wrt RC2 branch ?
>>> Do we have to add new patches to it or just to the trunk ?
>>
>> Hello,
>>   Thanks for asking. The WOF2-0 'branch' @
>> gen1\branches\WOF2-0\ is functionality frozen and open only for bug
>> fixes.
>>
>> Stan.
>>>
>>>> -----Original Message-----
>>>> From: Alex Estrin [mailto:alex.estrin at qlogic.com]
>>>> Sent: Thursday, October 02, 2008 3:33 PM
>>>> To: Leonid Keller
>>>> Cc: ofw at lists.openfabrics.org
>>>> Subject: RE: [ofw][mlx4] SRQ attached QP issue
>>>>
>>>> Do you think it makes sense to propagate the fix to 2.0 branch?
>>>> Thanks, Alex.
>>>>
>>>>> -----Original Message-----
>>>>> From: Leonid Keller [mailto:leonid at mellanox.co.il]
>>>>> Sent: Thursday, October 02, 2008 5:27 AM
>>>>> To: Alex Estrin
>>>>> Cc: ofw at lists.openfabrics.org
>>>>> Subject: RE: [ofw][mlx4] SRQ attached QP issue
>>>>>
>>>>> Right.
>>>>> Returned to your variant.
>>>>> Fixed in 1627, thank you.
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Alex Estrin [mailto:alex.estrin at qlogic.com]
>>>>>> Sent: Monday, September 29, 2008 3:01 PM
>>>>>> To: Leonid Keller
>>>>>> Cc: ofw at lists.openfabrics.org
>>>>>> Subject: RE: [ofw][mlx4] SRQ attached QP issue
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Your version of the patch won't work because for the SRQ case
>>>>>> this will always be 'TRUE':
>>>>>>> -               if (!qp->sq.wrid || !qp->rq.wrid) {
>>>>>>> -                       err = -ENOMEM;
>>>>>>> -                       goto err_wrid;
>>>>>>
>>>>>> Thanks,
>>>>>> Alex.
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Leonid Keller [mailto:leonid at mellanox.co.il]
>>>>>>> Sent: Sunday, September 28, 2008 11:13 AM
>>>>>>> To: Alex Estrin
>>>>>>> Cc: ofw at lists.openfabrics.org
>>>>>>> Subject: RE: [ofw][mlx4] SRQ attached QP issue
>>>>>>>
>>>>>>> Applied with some changes in 1625, thank you.
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Alex Estrin [mailto:alex.estrin at qlogic.com]
>>>>>>>> Sent: Saturday, September 27, 2008 1:20 AM
>>>>>>>> To: Leonid Keller
>>>>>>>> Cc: ofw at lists.openfabrics.org
>>>>>>>> Subject: [ofw][mlx4] SRQ attached QP issue
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> This patch seem fixed the create-QP-attached-to-SRQ issue.
>>>>>>>> I didn't verify though, if code allowes send queue zero size.
>>>>>>>> Included check anyway. Please review.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alex.
>>>>>>>>
>>>>>>>> Index: mlx4/kernel/bus/ib/qp.c
>>>>>>>>
>>>>>>
>>>> ===================================================================
>>>>>>>> --- mlx4/kernel/bus/ib/qp.c   (revision 1622)
>>>>>>>> +++ mlx4/kernel/bus/ib/qp.c   (working copy)
>>>>>>>> @@ -411,14 +411,20 @@
>>>>>>>>               err = mlx4_buf_write_mtt(dev->dev, &qp->mtt,
>>>>>>>>                       &qp->buf);               if (err) goto
>>>>>>>> err_mtt; -
>>>>>>>> -             qp->sq.wrid  = kmalloc(qp->sq.wqe_cnt * sizeof
>>>>>>>> (u64), GFP_KERNEL);
>>>>>>>> -             qp->rq.wrid  = kmalloc(qp->rq.wqe_cnt * sizeof
>>>>>>>> (u64), GFP_KERNEL); -
>>>>>>>> -             if (!qp->sq.wrid || !qp->rq.wrid) {
>>>>>>>> -                     err = -ENOMEM;
>>>>>>>> -                     goto err_wrid;
>>>>>>>> +             if (qp->sq.wqe_cnt) {
>>>>>>>> +                     qp->sq.wrid  =
>>>>> kmalloc(qp->sq.wqe_cnt * sizeof
>>>>>>>> (u64), GFP_KERNEL);
>>>>>>>> +                     if (!qp->sq.wrid) {
>>>>>>>> +                             err = -ENOMEM;
>>>>>>>> +                             goto err_wrid;
>>>>>>>> +                     }
>>>>>>>>               }
>>>>>>>> +             if (qp->rq.wqe_cnt) {
>>>>>>>> +                     qp->rq.wrid  =
>>>>> kmalloc(qp->rq.wqe_cnt * sizeof
>>>>>>>> (u64), GFP_KERNEL);
>>>>>>>> +                     if (!qp->rq.wrid) {
>>>>>>>> +                             err = -ENOMEM;
>>>>>>>> +                             goto err_wrid;
>>>>>>>> +                     }
>>>>>>>> +             }
>>>>>>>>       }
>>>>>>>>
>>>>>>>>       if (!sqpn)
>>>>>>>> @@ -452,8 +458,10 @@
>>>>>>>>
>>>>> mlx4_ib_db_unmap_user(to_mucontext(pd->p_uctx),
>>>>>>>>                                             &qp->db);       }
>>>>>>>> else {
>>>>>>>> -             kfree(qp->sq.wrid);
>>>>>>>> -             kfree(qp->rq.wrid);
>>>>>>>> +             if (qp->sq.wrid)
>>>>>>>> +                     kfree(qp->sq.wrid);
>>>>>>>> +             if (qp->rq.wrid)
>>>>>>>> +                     kfree(qp->rq.wrid);
>>>>>>>>       }
>>>>>>>>
>>>>>>>>  err_mtt:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> It seem there is a controversy exist in mlx4 driver that won't
>>>>>>>>> allow to create QP if it was attached to SRQ.
>>>>>>>>>
>>>>>>>>> At first it doesn't allow to set any recv queue size (which
>>>>>>>>> is fine)
>>>>>>>>>
>>>>>>>>> mlx4\kernel\bus\ib\qp.c@ line 219:
>>>>>>>>>             if (cap->max_recv_wr)
>>>>>>>>>                     return -EINVAL;
>>>>>>>>>
>>>>>>>>>             qp->rq.wqe_cnt = qp->rq.max_gs = 0;
>>>>>>>>>
>>>>>>>>> But then there is assertion for zero size memory request in
>>>>>>>>> 'kmalloc': mlx4\kernel\bus\ib\qp.c @ line 416:
>>>>>>>>>
>>>>>>>>>             qp->rq.wrid  = kmalloc(qp->rq.wqe_cnt * sizeof
>>>>>>>>> (u64), GFP_KERNEL);
>>>>>>>>>
>>>>>>>>> mlx4\kernel\inc\l2w_memory.h @ line 81:
>>>>>>>>>
>>>>>>>>>             static inline void * kmalloc( SIZE_T bsize, gfp_t
>>>>>>>>>                     gfp_mask)             { void *ptr;
>>>>>>>>>                     ASSERT( KeGetCurrentIrql() <=
>>>>>>>>>                     DISPATCH_LEVEL); ASSERT(bsize);
>>>>>>>>> .....
>>>>>>>>>
>>>>>>>>> I believe it was added for a reason.
>>>>>>>>> Please let me know if I missed anything.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Alex.




More information about the ofw mailing list