[ofa-general] ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407
Pawel Dziekonski
dzieko at wcss.pl
Mon Apr 6 03:54:24 PDT 2009
On Thu, 02 Apr 2009 at 08:07:20PM +0200, Bernd Schubert wrote:
> Hello,
>
> I'm fighting (as usual) with some Lustre problems and I think this time it is
> IB related. In the logs of some systems I see messages like these:
>
> ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407
>
> Does anyone know what this means? The kernel modules are from
> OFED-1.3.1.
Hi Bernd,
we are also using OFED 1.3.1 with Lustre, as you saw recently at our
site ;-)
I see messages like these only while large computing jobs are running
over IPoIB. I believe this is an issue with the send/receive buffers,
because I also see dropped packets on the IPoIB interface. Those jobs
usually work fine ("usually" because the application itself is buggy),
so I consider these messages rather harmless.
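
To check whether the drops really coincide with the messages, the interface
drop counters can be sampled before and after a job. This is only a rough
sketch: it assumes the IPoIB interface is called ib0 and relies on the
standard Linux per-interface counters under /sys/class/net. The function
names are made up for illustration.

```python
from pathlib import Path


def read_drop_counters(iface="ib0", sysfs_root="/sys/class/net"):
    """Return the rx/tx dropped-packet counters for one network interface.

    "ib0" is an assumed interface name; substitute the real IPoIB interface.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in ("rx_dropped", "tx_dropped"):
        # Each counter file holds a single decimal number.
        counters[name] = int((stats_dir / name).read_text().strip())
    return counters


def drops_between(before, after):
    """Return how much each counter grew between two samples."""
    return {name: after[name] - before[name] for name in before}
```

Calling read_drop_counters() once before the job starts and once after it
ends, then passing both results to drops_between(), shows how many packets
were dropped during the run. If that number stays at zero while the
"bogus QP" messages still appear, the buffer explanation becomes less likely.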
regards, P
--
Pawel Dziekonski <pawel.dziekonski at wcss.pl>
Wroclaw Centre for Networking & Supercomputing, HPC Department
Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND
phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl