[openib-general] probable reference count bug in core/mad.c
Ralph Campbell
ralphc at pathscale.com
Tue Jan 10 12:33:16 PST 2006
I have been looking at the code for core/mad.c and in timeout_sends(),
the mad_send_wr is removed from the list of pending sends and
then retry_send() is called. In retry_send(), if the MAD is resent,
mad_send_wr->refcount is incremented and the WR is put back on
the list of pending sends.
This seems wrong to me. Either there should be no increment, or
there should be a decrement when the WR is removed from the list.
Also, I think there may be a dependency on whether
mad_send_wr->timeout is zero or not.
Someone who knows this code better may want to check this out.
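To make the refcount concern above concrete, here is a minimal toy model in C. It is not the code from core/mad.c: every name in it (toy_wr, toy_post, and so on) is hypothetical, and it assumes that membership on the pending-send list is supposed to account for exactly one reference. Whether that assumption holds in the real code is the open question.

```c
/* Toy model only: these names are hypothetical, not the ib_mad API. */
struct toy_wr {
    int refcount;   /* references held on the work request */
    int on_list;    /* 1 while linked on the pending-send list */
};

/* Initial post: ASSUMPTION: list membership accounts for one reference. */
static void toy_post(struct toy_wr *wr)
{
    wr->refcount = 1;
    wr->on_list = 1;
}

/* Timeout path as described in the post: unlink, refcount untouched. */
static void toy_timeout_unlink(struct toy_wr *wr)
{
    wr->on_list = 0;
}

/* Retry path as described in the post: take a reference and relink. */
static void toy_retry_relink(struct toy_wr *wr)
{
    wr->refcount++;
    wr->on_list = 1;
}

/* Run n timeout/retry cycles and return the resulting refcount. */
static int toy_refcount_after_cycles(int n)
{
    struct toy_wr wr;
    int i;

    toy_post(&wr);
    for (i = 0; i < n; i++) {
        toy_timeout_unlink(&wr);
        toy_retry_relink(&wr);
    }
    return wr.refcount;
}
```

Under that assumption, the WR is on the list only once after n cycles but holds 1 + n references, so either the increment or a matching decrement on removal would be missing. If the increment instead pays for something else (for example the completion of the newly posted send), the model does not apply.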
BTW, I also don't particularly like mad_send_wr->retries
being an int instead of unsigned int and the statement
in retry_send():
    if (!mad_send_wr->retries--)
which could end up resending the MAD 2^32-1 times if requeued.
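The behaviour of the post-decrement test above can be sketched in isolation. The struct and function names below are hypothetical stand-ins, not the real retry_send(), and the guarded variant is just one possible alternative, not a proposed patch.

```c
/* Hypothetical stand-in for the WR; only the retry counter is modelled. */
struct toy_send_wr {
    int retries;    /* signed, as in the code being discussed */
};

/*
 * Same test as quoted from retry_send(): returns 0 when the send would be
 * treated as timed out, 1 when it would be resent.
 */
static int toy_should_retry(struct toy_send_wr *wr)
{
    if (!wr->retries--)
        return 0;   /* retries was 0 before the decrement; it is now -1 */
    return 1;       /* any nonzero value, including negative, means resend */
}

/* Hypothetical variant with an unsigned counter that stops at zero. */
struct toy_send_wr_u {
    unsigned int retries;
};

static int toy_should_retry_guarded(struct toy_send_wr_u *wr)
{
    if (wr->retries == 0)
        return 0;   /* exhausted: counter stays at 0 on repeated calls */
    wr->retries--;
    return 1;
}
```

With the original form, a WR that is requeued after its counter reached zero sees -1, -2, and so on, all of which test as "retry". Merely changing the type to unsigned int would not change that, since the counter would wrap to UINT_MAX; the explicit zero check is what stops the resends.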
--
Ralph Campbell <ralphc at pathscale.com>