[ofa-general] ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407

Pawel Dziekonski dzieko at wcss.pl
Mon Apr 6 03:54:24 PDT 2009


On Thu, 02 Apr 2009 at 08:07:20PM +0200, Bernd Schubert wrote:
> Hello,
> 
> I'm fighting (as usual) with some Lustre problems and I think this time it is 
> IB related. In the logs of some systems I see messages like these:
> 
> ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407
> 
> Does anyone know what that means? The kernel modules are from 
> OFED-1.3.1.

Hi Bernd,

we are also using OFED 1.3.1 and Lustre, as you saw recently at our
site ;-)

I'm getting messages like these only when large computing jobs are
running over IPoIB. I believe this is an issue with send/receive
buffers, because I see dropped packets on the IPoIB interface. Those
jobs usually work fine ("usually" because the application itself is
buggy), so I consider those messages rather harmless.
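
If you want to check whether the drops line up with the messages on your
side, here is a minimal sketch that reads the kernel's per-interface drop
counters from sysfs before and after a job. The interface name "ib0" is
only an assumption (use whatever your IPoIB interface is called), and the
helper names are made up for illustration:

```python
from pathlib import Path


def read_drop_counters(iface="ib0", sysfs_root="/sys/class/net"):
    """Return rx/tx dropped-packet counters for one network interface.

    "ib0" is an assumed interface name; sysfs_root is a parameter only
    so the function can be pointed at a different directory for testing.
    """
    stats = Path(sysfs_root) / iface / "statistics"
    return {
        name: int((stats / name).read_text())
        for name in ("rx_dropped", "tx_dropped")
    }


def drop_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```

Take one snapshot with read_drop_counters() before the job starts and one
after it finishes, then pass both to drop_delta(). If the counters grow
only while the "bogus QP" lines appear in the log, that would at least be
consistent with the buffer theory; it does not prove it.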

regards, P


-- 
Pawel Dziekonski <pawel.dziekonski at wcss.pl>
Wroclaw Centre for Networking & Supercomputing, HPC Department
Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND
phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl


