[ofa-general] ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407

Bernd Schubert bs_lists at aakef.fastmail.fm
Sat Apr 11 13:05:11 PDT 2009


Hello Pawel,

Sorry for my late reply.

On Monday 06 April 2009, Pawel Dziekonski wrote:
> On Thu, 02 Apr 2009 at 08:07:20PM +0200, Bernd Schubert wrote:
> > Hello,
> >
> > I'm fighting (as usual) with some Lustre problems and I think this time
> > it is IB related. In the logs of some systems I see messages like these:
> >
> > ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407
> >
> > Does anyone know what this means? The kernel modules are from
> > OFED-1.3.1.
>
> Hi Bernd,
>
> we are also using 1.3.1 and Lustre, as you have seen recently at our
> site ;-)
>
> I'm getting messages like these only when large computing jobs are
> running over IPoIB. I believe that this is an issue with send/receive
> buffers, because I see dropped packets on the IPoIB interface. Those
> jobs usually work fine ("usually" because the app itself is buggy), so
> I consider those messages rather harmless.

Just out of interest, which applications are using IPoIB?


Cheers,
Bernd


