[openib-general] segmentation fault in ibv_modify_srq

Sayantan Sur surs at cse.ohio-state.edu
Thu Oct 6 06:39:39 PDT 2005


* On Oct,10 Roland Dreier <rolandd at cisco.com> wrote:
>     Sayantan> I am getting a segmentation fault after a couple of
>     Sayantan> thousand messages are sent over SRQ (using ping-pong
>     Sayantan> latency test). Here is a snippet from the core
>     Sayantan> generated.
> 
> Is it possible that you are posting one more receive to the SRQ than
> the max capacity you requested when creating the SRQ?
> 
> What happens with the patch below applied to libmthca?

Upon inspecting my code, I found that it _is_ possible for it to post
more receives than the capacity configured for the SRQ. I fixed that,
and the ping-pong test now works.
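
For anyone hitting the same problem, here is a minimal sketch of the kind
of application-side bookkeeping that avoids the overflow. It is not the
actual fix in my code, and it does not call libibverbs. The names
srq_budget, srq_budget_reserve and srq_budget_complete are hypothetical,
and max_wr is assumed to be the capacity requested when the SRQ was
created. The idea is to reserve slots before posting receives and to
release them as receive completions are reaped.

```c
#include <assert.h>

/* Hypothetical bookkeeping; not part of libibverbs or libmthca. */
struct srq_budget {
	unsigned max_wr;       /* capacity requested at SRQ creation (assumed) */
	unsigned outstanding;  /* receives posted but not yet completed */
};

/*
 * Reserve up to `want` slots and return how many were granted.
 * The caller should post only that many receives to the SRQ.
 */
static unsigned srq_budget_reserve(struct srq_budget *b, unsigned want)
{
	unsigned room, n;

	assert(b->outstanding <= b->max_wr);
	room = b->max_wr - b->outstanding;
	n = want < room ? want : room;
	b->outstanding += n;
	return n;
}

/* Release `done` slots after reaping that many receive completions. */
static void srq_budget_complete(struct srq_budget *b, unsigned done)
{
	/* Clamp so a miscount can never underflow the counter. */
	b->outstanding -= done < b->outstanding ? done : b->outstanding;
}
```

With a budget of 4, reserving 3 and then 3 more grants 3 and then 1, so
the number of posted receives never exceeds the configured capacity.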

The patch you sent is good: it prevents the application from posting
more receives than the maximum, returning an error instead.

I will test out the limit event generation next.

Thanks,
Sayantan.

> 
> Thanks,
>   Roland
> 
> 
> --- libmthca/src/srq.c	(revision 3664)
> +++ libmthca/src/srq.c	(working copy)
> @@ -110,6 +110,13 @@ int mthca_tavor_post_srq_recv(struct ibv
>  
>  		wqe       = get_wqe(srq, ind);
>  		next_ind  = *wqe_to_link(wqe);
> +
> +		if (next_ind < 0) {
> +			err = -1;
> +			*bad_wr = wr;
> +			break;
> +		}
> +
>  		prev_wqe  = srq->last;
>  		srq->last = wqe;
>  
> @@ -197,6 +204,12 @@ int mthca_arbel_post_srq_recv(struct ibv
>  		wqe       = get_wqe(srq, ind);
>  		next_ind  = *wqe_to_link(wqe);
>  
> +		if (next_ind < 0) {
> +			err = -1;
> +			*bad_wr = wr;
> +			break;
> +		}
> +
>  		((struct mthca_next_seg *) wqe)->nda_op =
>  			htonl((next_ind << srq->wqe_shift) | 1);
>  		((struct mthca_next_seg *) wqe)->ee_nds = 0;

-- 
http://www.cse.ohio-state.edu/~surs


