[openfabrics-ewg] OFED 1.1
Michael S. Tsirkin
mst at mellanox.co.il
Wed Sep 20 23:11:02 PDT 2006
Quoting r. Betsy Zeller <betsy at pathscale.com>:
> We've made some adds to bug 233, which should help you in tracking it
> down.
Betsy, it seems like you are running the wrong code.
I don't understand what's causing the panic reported in bug 233:
you report a kernel BUG at sdp_bcopy.c line 230, but there's
no BUG_ON at sdp_bcopy.c line 230 when I install -pre2 here.
Here's how it looks:
230:    ssk->rx_wr.sg_list = ssk->ibsge;
        ssk->rx_wr.num_sge = frags + 1;
        rc = ib_post_recv(ssk->qp, &ssk->rx_wr, &bad_wr);
        ++ssk->rx_head;
        if (unlikely(rc)) {
                sdp_dbg(&ssk->isk.sk, "ib_post_recv failed with status %d\n",
                        rc);
                sdp_reset(&ssk->isk.sk);
        }
We did have a BUG_ON there before pre1 which makes it look to me
like you got the wrong SDP somehow.
I have just verified that the code in RC6 pre2 tarball in SVN matches my copy,
by downloading it from here:
https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-rc6-pre2.tgz
opening up
OFED-1.1-rc6-pre2/SOURCES/openib-1.1.tgz
and looking at openib-1.1/drivers/infiniband/ulp/sdp/sdp_bcopy.c
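
If you want to repeat that check without unpacking everything by hand, something
like the following Python sketch should do. It assumes the rc6-pre2 tarball has
already been downloaded to a local file; the helper name read_nested_line and the
two constants are made up for illustration and are not part of OFED.

```python
import io
import tarfile

# Member paths as named above; adjust them if your tarball is laid out differently.
OUTER_MEMBER = "OFED-1.1-rc6-pre2/SOURCES/openib-1.1.tgz"
INNER_MEMBER = "openib-1.1/drivers/infiniband/ulp/sdp/sdp_bcopy.c"


def read_nested_line(outer_path, lineno,
                     outer_member=OUTER_MEMBER, inner_member=INNER_MEMBER):
    """Return line `lineno` (1-based) of a file stored in a tarball within a tarball."""
    # Pull the inner tarball out of the release tarball into memory.
    with tarfile.open(outer_path, "r:*") as outer:
        inner_file = outer.extractfile(outer_member)
        if inner_file is None:
            raise FileNotFoundError(outer_member)
        inner_bytes = inner_file.read()

    # Open the inner tarball from memory and read the source file.
    with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:*") as inner:
        src = inner.extractfile(inner_member)
        if src is None:
            raise FileNotFoundError(inner_member)
        lines = src.read().decode("utf-8", errors="replace").splitlines()

    return lines[lineno - 1]
```

Calling read_nested_line("OFED-1.1-rc6-pre2.tgz", 230) and checking the result
for "BUG_ON" tells you whether the tarball you installed from has the old code.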
I'd recommend a reboot after swapping pre-release OFED revisions,
to make sure some old module does not stay in memory.
Please comment.
> I'd be very interested to see this fixed for OFED 1.1.
You'll have to debug this - it does not happen to me on mthca at all,
and I don't see what it could be - looks like a low level driver issue
at the moment.
--
MST