[ofa-general] Another OFED 1.3 XRC bug with 2.6.9 kernel

Tang, Changqing changquing.tang at hp.com
Sun Feb 24 13:51:14 PST 2008


Jack:
        Mellanox installed RC5 on helios.mellanox.com for me; this is a 2.6.9-42 kernel
system. I still see ibv_modify_xrc_rcv_qp() fail whenever errno is nonzero at the time
of the call.

        If I clear errno to zero before I call ibv_modify_xrc_rcv_qp(), everything is fine.

        Can you take a look?

--CQ


> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Tuesday, February 19, 2008 6:41 AM
> To: general at lists.openfabrics.org
> Cc: Tang, Changqing
> Subject: Re: [ofa-general] Another OFED 1.3 XRC bug with 2.6.9 kernel
>
> On Tuesday 19 February 2008 02:35, Tang, Changqing wrote:
> >
> > I have taken some time to track down this bug.
> >
> > When running OFED 1.3 on the 2.6.9-42.ELsmp kernel,
> > putenv("IBV_FORK_SAFE=1"); causes ibv_get_device_list() to
> > print out a warning and set errno = 22:
> >
> > A:errno=0
> > libibverbs: Warning: fork()-safety requested but init failed
> > B:errno=22
> >
> > errno keeps value 22 and causes ibv_modify_xrc_rcv_qp() to fail.
> >
> > Another way to make ibv_modify_xrc_rcv_qp() fail is to set errno = 22
> > just before calling this function. However, this only happens on the
> > 2.6.9-42.ELsmp kernel; on the 2.6.18-8.e15 kernel, it succeeds.
> >
> > 2.6.9-42.ELsmp is the kernel on the Mellanox testing cluster
> > helios.mellanox.com/ibd001-0032
> >
> > Thanks to the Mellanox guys for having a look
> >
> >
> > --CQ
>
> I fixed a bug just like this in OFED 1.3 on Jan 30. The fix
> is in OFED 1.3 RC4 -- are you using that version?  If not,
> please install RC4 and re-test.
>
> The bug was in kernel space:
>
> ===========
> IB/core: fixed thinko in return values for
> ib_uverbs_xxxx_xrc_rcv_qp() procs.
> Wed, 30 Jan 2008 15:11:08 +0000 (17:11 +0200)
> commit 78273e00083543535edd4c9db830b4ac45eb556a
>
> Incorrectly returned 0 instead of in_len in several procedures.
> =================
>
> This bug caused userspace to return the "errno" value even
> when the kernel operation completed successfully, which is
> what you seem to be seeing.
>
> - Jack
>