[ofa-general] Re: the so many IPoIB-UD failures introduced by OFED 1.3
Or Gerlitz
ogerlitz at voltaire.com
Tue May 13 23:23:48 PDT 2008
Roland Dreier wrote:
> > ipoib_cm.c:ipoib_cm_send() does:
> > if (++priv->tx_outstanding == ipoib_sendq_size)
> > netif_stop_queue(dev);
> >
> > but ipoib_ib.c:ipoib_send() does:
> > if (++priv->tx_outstanding == (ipoib_sendq_size - 1)) {
> > netif_stop_queue(dev);
>
> So this is not in the upstream kernel... I wonder if this is a bug
> introduced in an OFED 1.3 patch?
Lately we have spent a great deal of effort debugging unreviewed ipoib
patches that were merged into OFED 1.3, bypassing any sane procedure. This
includes people sending Roland bug reports about code that is not in his
tree, and people reporting bugs introduced by code pushed to OFED after
rc3!
It seems we chose a very inefficient way to work: first, merge the code;
second, test it and watch it crash; third, ask the maintainer to review
it and get him to fix it; fourth, push it to the kernel.
OFED 1.3 is out there, merged into commercial "enterprise" distros, and
ipoib is the first thing people test, so these people will hit all of
these crashes.
Maybe it's about time for the Linux IB maintainers to get a little angry?!
Or.