[openib-general] segmentation fault in ibv_modify_srq
Sayantan Sur
surs at cse.ohio-state.edu
Wed Oct 5 19:15:31 PDT 2005
Roland,
* On Oct,7 Roland Dreier<rolandd at cisco.com> wrote :
> OK, I just checked in an initial implementation of both setting the
> SRQ limit with the modify SRQ verb, and also getting SRQ limit reached
> events when they occur. You will need to update your kernel drivers,
> libibverbs and libmthca to get this.
>
> I've done zero testing, so please let me know how it works. You
> should at least get an interesting new failure.
I am getting a segmentation fault after a couple of thousand messages
are sent over the SRQ (using a ping-pong latency test). Here is a
snippet from the resulting core dump.
Let me know what you think about this.
Thanks,
Sayantan.
=============
#0 0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
336 wc->wr_id = srq->wrid[wqe_index];
(gdb) bt
#0 0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
#1 0x00000000004151f5 in MPID_DeviceCheck (blocking=MPID_BLOCKING) at verbs.h:746
#2 0x000000000042101c in MPID_RecvComplete (request=0x7fffff958030, status=0x7fffff958230,
    error_code=0x7fffff958184) at mpid_recv.c:90
#3 0x000000000041791c in MPID_RecvDatatype (comm_ptr=0xf5e9d0, buf=0x536280, count=2,
    dtype_ptr=0xd36f60, src_lrank=0, tag=1, context_id=0, status=0x7fffff958230,
    error_code=0x7fffff958184) at mpid_hrecv.c:89
#4 0x0000000000402586 in PMPI_Recv (buf=0x536280, count=2, datatype=<value optimized out>,
    source=0, tag=1, comm=<value optimized out>, status=0x7fffff958230) at recv.c:87
#5 0x00000000004020a9 in main ()
(gdb) f 0
#0 0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
336 wc->wr_id = srq->wrid[wqe_index];
(gdb) list
331         } else if ((*cur_qp)->ibv_qp.srq) {
332                 srq = to_msrq((*cur_qp)->ibv_qp.srq);
333                 wqe = htonl(cqe->wqe);
334                 wq = NULL;
335                 wqe_index = wqe >> srq->wqe_shift;
336                 wc->wr_id = srq->wrid[wqe_index];
337                 mthca_free_srq_wqe(srq, wqe);
338         } else {
339                 wq = &(*cur_qp)->rq;
340                 wqe_index = ntohl(cqe->wqe) >> wq->wqe_shift;
>
> - R.
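One way to narrow this down, sketched under assumptions: the faulting
statement at cq.c:336 indexes srq->wrid with a value derived from the WQE
offset in the completion, so a range check on that index before the load
would show whether the index is out of bounds or the array pointer itself
is bad. The names below (demo_srq, demo_srq_wr_id, and the max field) are
hypothetical stand-ins for illustration, not the actual libmthca
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the SRQ state; NOT the real struct mthca_srq. */
struct demo_srq {
    uint64_t *wrid;      /* one work request ID per receive WQE */
    int       max;       /* number of entries in wrid[] (assumed field) */
    int       wqe_shift; /* log2 of the WQE stride in bytes */
};

/*
 * Translate the WQE offset reported in a completion into a wrid[] slot.
 * Returns 0 and stores the ID on success, or -1 (after logging the
 * offending values) if the array is missing or the slot is out of range.
 */
static int demo_srq_wr_id(const struct demo_srq *srq, uint32_t wqe,
                          uint64_t *wr_id)
{
    uint32_t wqe_index;

    assert(srq != NULL && wr_id != NULL);

    wqe_index = wqe >> srq->wqe_shift;

    if (srq->wrid == NULL || srq->max <= 0 ||
        wqe_index >= (uint32_t)srq->max) {
        fprintf(stderr, "bad SRQ completion: wqe=0x%x index=%u max=%d\n",
                (unsigned)wqe, (unsigned)wqe_index, srq->max);
        return -1;
    }

    *wr_id = srq->wrid[wqe_index];
    return 0;
}
```

If a check like this fires just before the crash, the logged wqe and index
values should indicate whether the offset taken from the CQE is being
interpreted incorrectly or the wrid array was never set up.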
--
http://www.cse.ohio-state.edu/~surs