[ofa-general] Re: the so many IPoIB-UD failures introduced by OFED 1.3

Or Gerlitz ogerlitz at voltaire.com
Tue May 13 23:23:48 PDT 2008


Roland Dreier wrote:
>  > ipoib_cm.c:ipoib_cm_send() does:
>  >         if (++priv->tx_outstanding == ipoib_sendq_size)
>  >                 netif_stop_queue(dev);
>  > 
>  > but ipoib_ib.c:ipoib_send() does:
>  >         if (++priv->tx_outstanding == (ipoib_sendq_size - 1)) {
>  >                 netif_stop_queue(dev);
>
> So this is not in the upstream kernel... I wonder if this is a bug
> introduced in an OFED 1.3 patch?
Lately we have spent a great deal of effort debugging unreviewed ipoib 
patches that were merged into ofed 1.3, bypassing any sane procedure. This 
includes people sending Roland bug reports on code he does not see 
in his tree, and people reporting bugs introduced by code pushed to 
ofed after rc3!

It seems like we chose a very inefficient way to work: first, merge 
code; second, test and see it crash; third, ask the maintainer to 
review it and get him to fix it; fourth, push it to the kernel.

ofed 1.3 is out there, merged into commercial "enterprise" distros, and 
ipoib is the first thing people test, so these people will hit all these 
crashes.

Maybe it's about time for the Linux IB maintainers to get a little angry?!

Or.
