[openfabrics-ewg] OFED 1.1

Michael S. Tsirkin mst at mellanox.co.il
Wed Sep 20 23:11:02 PDT 2006


Quoting r. Betsy Zeller <betsy at pathscale.com>:
> We've made some adds to bug 233, which should help you in tracking it
> down.

Betsy, it seems like you are running the wrong code.
I don't understand what's causing the panic reported in bug 233:
you report a kernel BUG at sdp_bcopy.c line 230, but there's
no BUG_ON at sdp_bcopy.c line 230 when I install -pre2 here.

Here's how it looks:

230:    ssk->rx_wr.sg_list = ssk->ibsge;
        ssk->rx_wr.num_sge = frags + 1;
        rc = ib_post_recv(ssk->qp, &ssk->rx_wr, &bad_wr);
        ++ssk->rx_head;
        if (unlikely(rc)) {
                sdp_dbg(&ssk->isk.sk, "ib_post_recv failed with status %d\n",
                        rc);
                sdp_reset(&ssk->isk.sk);
        }


We did have a BUG_ON there before pre1, which makes it look to me
like you got the wrong SDP somehow.

I have just verified that the code in RC6 pre2 tarball in SVN matches my copy,
by downloading it from here:
https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-rc6-pre2.tgz

opening up
OFED-1.1-rc6-pre2/SOURCES/openib-1.1.tgz

and looking at openib-1.1/drivers/infiniband/ulp/sdp/sdp_bcopy.c
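
If it helps, the same check can be scripted. This is only a rough sketch, not
something that ships with OFED: the helper names (read_nested_member,
find_bug_on, show_context) are made up for illustration, and it assumes the
release tarball has already been downloaded to the local path you pass in.
It reads sdp_bcopy.c out of the inner tarball without unpacking anything to
disk, lists any lines containing BUG_ON, and shows the lines around 230.

```python
import io
import tarfile


def read_nested_member(outer_path, inner_suffix, file_suffix):
    """Return the lines of a file stored in a tarball nested inside another tarball."""
    with tarfile.open(outer_path, "r:*") as outer:
        # First member whose name ends with the inner tarball's path.
        inner_info = next(m for m in outer.getmembers()
                          if m.name.endswith(inner_suffix))
        inner_bytes = outer.extractfile(inner_info).read()
    with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:*") as inner:
        src_info = next(m for m in inner.getmembers()
                        if m.name.endswith(file_suffix))
        text = inner.extractfile(src_info).read().decode("utf-8", "replace")
    return text.splitlines()


def find_bug_on(lines):
    """Return the 1-based numbers of all lines that mention BUG_ON."""
    return [n for n, text in enumerate(lines, 1) if "BUG_ON" in text]


def show_context(lines, lineno, radius=3):
    """Return (number, text) pairs for the lines around a 1-based line number."""
    start = max(1, lineno - radius)
    end = min(len(lines), lineno + radius)
    return [(n, lines[n - 1]) for n in range(start, end + 1)]


# Example use (the local file name is an assumption):
#   lines = read_nested_member("OFED-1.1-rc6-pre2.tgz",
#                              "SOURCES/openib-1.1.tgz",
#                              "ulp/sdp/sdp_bcopy.c")
#   print(find_bug_on(lines))
#   for n, text in show_context(lines, 230):
#       print(n, text)
```

Running the same thing against the sources you actually installed from should
show quickly whether the two copies differ around line 230.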

I'd recommend a reboot after swapping pre-release OFED revisions,
to make sure no old module stays in memory.
Please comment.
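
Short of a reboot, one way to look for a stale module is to compare the
srcversion of what is loaded with what is installed on disk. The sketch
below is just an illustration under some assumptions: the SDP module is
taken to be named ib_sdp, the kernel is assumed to export
/sys/module/<name>/srcversion (it may be absent depending on kernel
configuration), and the function names are my own. A match is not proof
that everything is consistent, so a reboot is still the safer route.

```python
import subprocess
from pathlib import Path


def loaded_srcversion(module, sysfs_root="/sys/module"):
    """Return the srcversion of the loaded module, or None if unavailable."""
    path = Path(sysfs_root) / module / "srcversion"
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, or the kernel does not export srcversion.
        return None


def installed_srcversion(module):
    """Return the srcversion modinfo reports for the on-disk module, or None."""
    try:
        result = subprocess.run(["modinfo", "-F", "srcversion", module],
                                capture_output=True, text=True)
    except OSError:
        # modinfo is not available on this system.
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def is_stale(loaded, installed):
    """True only when both versions are known and they differ."""
    return loaded is not None and installed is not None and loaded != installed


# Example use (the module name is an assumption):
#   if is_stale(loaded_srcversion("ib_sdp"), installed_srcversion("ib_sdp")):
#       print("loaded ib_sdp does not match the installed module")
```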

> I'd be very interested to see this fixed for OFED 1.1.

You'll have to debug this - it does not happen to me on mthca at all,
and I don't see what it could be - looks like a low-level driver issue
at the moment.

-- 
MST



