[openib-general] segmentation fault in ibv_modify_srq

Sayantan Sur surs at cse.ohio-state.edu
Wed Oct 5 19:15:31 PDT 2005


Roland,

* On Oct,7 Roland Dreier<rolandd at cisco.com> wrote :
> OK, I just checked in an initial implementation of both setting the
> SRQ limit with the modify SRQ verb, and also getting SRQ limit reached
> events when they occur.  You will need to update your kernel drivers,
> libibverbs and libmthca to get this.
> 
> I've done zero testing, so please let me know how it works.  You
> should at least get an interesting new failure.

I am getting a segmentation fault after a couple of thousand messages
are sent over the SRQ (using a ping-pong latency test). Here is a
snippet from the resulting core dump.

Let me know what you think about this.

Thanks,
Sayantan.

=============

#0  0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
336                     wc->wr_id = srq->wrid[wqe_index];
(gdb) bt
#0  0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
#1  0x00000000004151f5 in MPID_DeviceCheck (blocking=MPID_BLOCKING) at verbs.h:746
#2  0x000000000042101c in MPID_RecvComplete (request=0x7fffff958030, status=0x7fffff958230, error_code=0x7fffff958184)
    at mpid_recv.c:90
#3  0x000000000041791c in MPID_RecvDatatype (comm_ptr=0xf5e9d0, buf=0x536280, count=2, dtype_ptr=0xd36f60, src_lrank=0,
    tag=1, context_id=0, status=0x7fffff958230, error_code=0x7fffff958184) at mpid_hrecv.c:89
#4  0x0000000000402586 in PMPI_Recv (buf=0x536280, count=2, datatype=<value optimized out>, source=0, tag=1,
    comm=<value optimized out>, status=0x7fffff958230) at recv.c:87
#5  0x00000000004020a9 in main ()
(gdb) f 0
#0  0x00002aaaab238faa in mthca_poll_cq (ibcq=0xd4b920, ne=1, wc=0x7fffff957f90) at cq.c:336
336                     wc->wr_id = srq->wrid[wqe_index];
(gdb) list
331             } else if ((*cur_qp)->ibv_qp.srq) {
332                     srq = to_msrq((*cur_qp)->ibv_qp.srq);
333                     wqe = htonl(cqe->wqe);
334                     wq = NULL;
335                     wqe_index = wqe >> srq->wqe_shift;
336                     wc->wr_id = srq->wrid[wqe_index];
337                     mthca_free_srq_wqe(srq, wqe);
338             } else {
339                     wq = &(*cur_qp)->rq;
340                     wqe_index = ntohl(cqe->wqe) >> wq->wqe_shift;
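
One way to narrow down a fault like this is to range-check the index
before the wrid lookup at cq.c:336. The following is only a sketch
under stated assumptions: struct demo_srq, its max field, and
demo_lookup_wrid() are hypothetical stand-ins for illustration, not the
actual libmthca definitions, and an out-of-range index is just one
possible cause of the crash.

```c
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for the SRQ fields used at cq.c:331-337. */
struct demo_srq {
    uint64_t *wrid;      /* one work request ID per WQE */
    int       max;       /* number of WQEs (assumed field name) */
    int       wqe_shift; /* log2 of the WQE size in bytes */
};

/*
 * Convert the big-endian WQE offset from a CQE into an index and look
 * up the work request ID. Returns 0 on success, or -1 if the index is
 * outside the wrid array (the lookup is skipped in that case).
 */
static int demo_lookup_wrid(const struct demo_srq *srq, uint32_t wqe_be,
                            uint64_t *wr_id)
{
    uint32_t wqe = ntohl(wqe_be);
    uint32_t idx = wqe >> srq->wqe_shift;

    if (idx >= (uint32_t) srq->max) {
        fprintf(stderr, "bad wqe 0x%x -> index %u (max %d)\n",
                (unsigned) wqe, (unsigned) idx, srq->max);
        return -1;
    }

    *wr_id = srq->wrid[idx];
    return 0;
}
```

With a check like this in place, a CQE carrying an unexpected WQE
offset would be reported instead of dereferencing past the end of the
array.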



> 
>  - R.

-- 
http://www.cse.ohio-state.edu/~surs


