[openib-general] probable reference count bug in core/mad.c

Ralph Campbell ralphc at pathscale.com
Tue Jan 10 12:33:16 PST 2006


I have been looking at the code for core/mad.c and in timeout_sends(),
the mad_send_wr is removed from the list of pending sends and
then retry_send() is called.  In retry_send(), if the MAD is resent,
mad_send_wr->refcount is incremented and the WR is put back on
the list of pending sends.

This seems wrong to me. Either there should be no increment, or
there should be a decrement when the WR is removed from the list.
Also, I think there may be a dependency on whether
mad_send_wr->timeout is zero or not.
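
To make the concern concrete, here is a minimal toy model of the flow as I
read it. It is not the code from core/mad.c: the struct, the toy_* names,
and the starting refcount of 1 are all assumptions made for illustration.
It also does not model any other path (for example a send completion) that
might drop the extra reference, which is exactly the part that needs
checking in the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a send work request; NOT the real struct from mad.c. */
struct toy_send_wr {
	int refcount;	/* references held on the WR (assumed to start at 1) */
	bool on_list;	/* true while queued on the pending-send list */
	int retries;	/* resend attempts remaining */
};

/* Models the timeout path as described: dequeue, refcount untouched. */
static void toy_timeout_dequeue(struct toy_send_wr *wr)
{
	assert(wr->on_list);
	wr->on_list = false;
}

/* Models the resend path as described: take a reference and requeue. */
static bool toy_retry_send(struct toy_send_wr *wr)
{
	if (wr->retries == 0)
		return false;	/* out of retries, nothing is requeued */
	wr->retries--;
	wr->refcount++;
	wr->on_list = true;
	return true;
}

/* Returns the refcount after at most `cycles` timeout/resend rounds. */
static int toy_refcount_after(int cycles, int retries)
{
	struct toy_send_wr wr = { .refcount = 1, .on_list = true,
				  .retries = retries };

	for (int i = 0; i < cycles; i++) {
		toy_timeout_dequeue(&wr);
		if (!toy_retry_send(&wr))
			break;
	}
	return wr.refcount;
}
```

In this model the count grows by one per resend with no matching decrement
on removal, so toy_refcount_after(3, 5) comes back as 4 rather than 1.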

Someone who knows this code better may want to check this out.

BTW, I also don't particularly like mad_send_wr->retries
being an int instead of unsigned int and the statement
in retry_send():
	if (!mad_send_wr->retries--)

which could end up resending the MAD 2^32-1 times if requeued after
retries reaches zero, since the post-decrement leaves the counter nonzero.
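
A small sketch of the post-decrement behavior, kept to a handful of calls
so the counter never gets near overflow. The helper names are hypothetical
and only mirror the quoted test. Whether the real code can reach
retry_send() again once retries has hit zero is the "if requeued"
assumption above.

```c
#include <assert.h>

/* Hypothetical helper mirroring the quoted test; nonzero means "give up". */
static int toy_out_of_retries(int *retries)
{
	/* Tests the old value, then decrements: 0 yields 1 and leaves -1. */
	return !(*retries)--;
}

/* Counts how many of `calls` attempts would resend, starting at `start`. */
static int toy_resends(int start, int calls)
{
	int retries = start;
	int resent = 0;

	for (int i = 0; i < calls; i++)
		if (!toy_out_of_retries(&retries))
			resent++;
	return resent;
}
```

Starting from retries == 0, the first call gives up but leaves the counter
at -1, so every later call sees a nonzero value and resends:
toy_resends(0, 10) is 9. A test such as "retries <= 0" before the decrement
would stop that, though that is a suggestion and not what the code does
today.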

-- 
Ralph Campbell <ralphc at pathscale.com>



