[openib-general] segmentation fault in ibv_modify_srq
Sayantan Sur
surs at cse.ohio-state.edu
Thu Oct 6 06:39:39 PDT 2005
* On Oct,10 Roland Dreier<rolandd at cisco.com> wrote :
> Sayantan> I am getting a segmentation fault after a couple of
> Sayantan> thousand messages are sent over SRQ (using ping-pong
> Sayantan> latency test). Here is a snippet from the core
> Sayantan> generated.
>
> Is it possible that you are posting one more receive to the SRQ than
> the max capacity you requested when creating the SRQ?
>
> What happens with the patch below applied to libmthca?
Upon inspecting my code, I found that it _could_ post more receives
than the SRQ was configured for. I fixed that, and the ping-pong test
now works. The patch you sent is good: it prevents the application from
posting more than the maximum.
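For anyone hitting the same problem, here is a minimal sketch of an application-side guard. It is not taken from the test program discussed in this thread, and `srq_credits`, `srq_credits_take` and `srq_credits_give` are hypothetical names, not part of libibverbs. The idea is to count outstanding receives and refuse to post once the count reaches the max_wr requested when the SRQ was created.

```c
#include <assert.h>

/* Hypothetical helper, not part of libibverbs: tracks how many receives
 * are outstanding on an SRQ so the caller never posts more than the
 * max_wr it requested at creation time. */
struct srq_credits {
    int max_wr;      /* capacity requested when the SRQ was created */
    int outstanding; /* receives posted and not yet completed */
};

/* Returns 1 and reserves a slot if one more receive may be posted,
 * or 0 if the SRQ is already at capacity. */
static int srq_credits_take(struct srq_credits *c)
{
    if (c->outstanding >= c->max_wr)
        return 0;
    c->outstanding++;
    return 1;
}

/* Call once for each receive completion taken from the CQ. */
static void srq_credits_give(struct srq_credits *c)
{
    assert(c->outstanding > 0); /* a completion without a post is a bug */
    c->outstanding--;
}
```

The caller would check srq_credits_take() before each ibv_post_srq_recv() call and call srq_credits_give() for each receive completion it polls.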
I will test out the limit event generation next.
Thanks,
Sayantan.
>
> Thanks,
> Roland
>
>
> --- libmthca/src/srq.c (revision 3664)
> +++ libmthca/src/srq.c (working copy)
> @@ -110,6 +110,13 @@ int mthca_tavor_post_srq_recv(struct ibv
>
> wqe = get_wqe(srq, ind);
> next_ind = *wqe_to_link(wqe);
> +
> + if (next_ind < 0) {
> + err = -1;
> + *bad_wr = wr;
> + break;
> + }
> +
> prev_wqe = srq->last;
> srq->last = wqe;
>
> @@ -197,6 +204,12 @@ int mthca_arbel_post_srq_recv(struct ibv
> wqe = get_wqe(srq, ind);
> next_ind = *wqe_to_link(wqe);
>
> + if (next_ind < 0) {
> + err = -1;
> + *bad_wr = wr;
> + break;
> + }
> +
> ((struct mthca_next_seg *) wqe)->nda_op =
> htonl((next_ind << srq->wqe_shift) | 1);
> ((struct mthca_next_seg *) wqe)->ee_nds = 0;
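
The check added by the patch can be pictured with a toy free list. This is only an illustrative model under simplifying assumptions, not libmthca's actual data structures: `toy_srq`, `toy_srq_init` and `toy_srq_post` are made-up names, entries are chained by index, and -1 marks the end of the chain.

```c
#include <assert.h>

#define TOY_SRQ_SIZE 4

/* Toy model only: free entries are chained by index; -1 ends the chain. */
struct toy_srq {
    int link[TOY_SRQ_SIZE]; /* link[i] = index of the next free entry, or -1 */
    int first_free;         /* index of the entry the next post will use */
};

static void toy_srq_init(struct toy_srq *srq)
{
    for (int i = 0; i < TOY_SRQ_SIZE; i++)
        srq->link[i] = (i + 1 < TOY_SRQ_SIZE) ? i + 1 : -1;
    srq->first_free = 0;
}

/* Returns the index of the entry used, or -1 if only the final entry
 * of the chain is left. */
static int toy_srq_post(struct toy_srq *srq)
{
    int ind = srq->first_free;
    assert(ind >= 0 && ind < TOY_SRQ_SIZE);

    int next_ind = srq->link[ind];
    if (next_ind < 0)
        return -1; /* same guard as the patch: fail instead of advancing to -1 */

    srq->first_free = next_ind;
    return ind;
}
```

In this model, without the guard first_free would become -1 and the following post would index outside the array. With the guard, the extra post simply fails and the last entry stays on the list.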
--
http://www.cse.ohio-state.edu/~surs