[ofa-general] Another OFED 1.3 XRC bug with 2.6.9 kernel
Tang, Changqing
changquing.tang at hp.com
Tue Feb 19 06:17:33 PST 2008
I was told that helios.mellanox.com/ibd001-0032 was installed with the Feb. 11 build,
but I just checked and it is actually running the Jan. 24 build, which predates the
Jan. 30 fix. So I believe the bug is already fixed in newer builds. I will ask the
Mellanox people to update the system.
--CQ
> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Tuesday, February 19, 2008 6:41 AM
> To: general at lists.openfabrics.org
> Cc: Tang, Changqing
> Subject: Re: [ofa-general] Another OFED 1.3 XRC bug with 2.6.9 kernel
>
> On Tuesday 19 February 2008 02:35, Tang, Changqing wrote:
> >
> > I have taken some time to track down this bug.
> >
> > When running OFED 1.3 on the 2.6.9-42.ELsmp kernel,
> > putenv("IBV_FORK_SAFE=1"); causes ibv_get_device_list() to
> > print a warning and set errno = 22:
> >
> > A:errno=0
> > libibverbs: Warning: fork()-safety requested but init failed
> > B:errno=22
> >
> > errno keeps the value 22, which causes ibv_modify_xrc_rcv_qp() to fail.
> >
> > Another way to make ibv_modify_xrc_rcv_qp() fail is to set errno = 22
> > just before calling this function. However, this only happens on the
> > 2.6.9-42.ELsmp kernel; on the 2.6.18-8.el5 kernel, it succeeds.
> >
> > 2.6.9-42.ELsmp is the kernel on the Mellanox testing cluster
> > helios.mellanox.com/ibd001-0032.
> >
> > Thanks to the Mellanox guys for having a look.
> >
> >
> > --CQ
>
> I fixed a bug just like this in OFED 1.3 on Jan 30. The fix
> is in OFED 1.3 RC4 -- are you using that version? If not,
> please install RC4 and re-test.
>
> The bug was in kernel space:
>
> ===========
> commit 78273e00083543535edd4c9db830b4ac45eb556a
> Wed, 30 Jan 2008 15:11:08 +0000 (17:11 +0200)
>
> IB/core: fixed thinko in return values for
> ib_uverbs_xxxx_xrc_rcv_qp() procs.
>
> Incorrectly returned 0 instead of in_len in several procedures.
> ===========
>
> This bug caused userspace to return the "errno" value even
> when the kernel operation completed successfully, which is
> what you seem to be seeing.
>
> - Jack