[ofa-general] [Bug 418] IPoIB CM causes large message IPv4 multicast to fail (was OFED 1.2 beta blocking bugs)
Michael S. Tsirkin
mst at mellanox.co.il
Mon Mar 12 01:55:15 PDT 2007
> Quoting Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [ofa-general] bug 418: was OFED 1.2 beta blocking bugs
>
> > ib0: failed send event (status=1, wrid=35 vend_err 69)
>
> I believe that this is causing the QP to transition into the error state.
>
> > ib_mthca 0000:08:00.0: modify QP 3->3 returned status 10.
>
> The mthca status of 0x10 indicates a bad QP state. The transition from 3->3 is
> RTS to RTS, but the QP is not in the RTS state, which makes sense given the
> previous error. The other receive side errors in the bug report are a fallout
> from not recovering from the send error.
Errors on a UD QP typically indicate a software problem.
It seems we are posting packets that exceed the MTU size.
But I do not see this problem here in the lab.
How can this problem be reproduced?
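
For reference, the numeric codes in the log lines quoted above can be decoded with a small helper. This is only a sketch: the tables mirror the ordering of enum ib_wc_status and enum ib_qp_state in the kernel's include/rdma/ib_verbs.h as I understand it, only the first few completion statuses are listed, and the function names wc_status_name and qp_state_name are made up for illustration. They are not part of IPoIB or mthca.

```c
#include <stddef.h>

/* Hypothetical helper: name of a work completion status code.
 * Values are assumed to follow enum ib_wc_status in ib_verbs.h;
 * only the first three entries are listed here. */
const char *wc_status_name(int status)
{
    switch (status) {
    case 0:  return "IB_WC_SUCCESS";
    case 1:  return "IB_WC_LOC_LEN_ERR";   /* local length error */
    case 2:  return "IB_WC_LOC_QP_OP_ERR";
    default: return "unlisted";
    }
}

/* Hypothetical helper: name of a QP state number.
 * Values are assumed to follow enum ib_qp_state in ib_verbs.h. */
const char *qp_state_name(int state)
{
    static const char *const names[] = {
        "RESET", "INIT", "RTR", "RTS", "SQD", "SQE", "ERR"
    };
    size_t count = sizeof(names) / sizeof(names[0]);

    if (state < 0 || (size_t)state >= count)
        return "unknown";
    return names[state];
}
```

With these tables, status=1 in the failed send event reads as a local length error, which fits a packet larger than the MTU, and the 3->3 transition reads as RTS to RTS.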
> I don't know if this causes any problems, but at first glance it appears that
> the IPoIB CM code begins listening for connection requests before the code has
> had a chance to join the IPoIB broadcast group. This allows a connection to
> form before the broadcast traffic is ready. Someone more familiar with the code
> than I am will need to determine if this can lead to any undesirable race
> conditions.
I don't see why this is a problem - I don't need to be a member of a broadcast group
to get incoming packets.
--
MST