[openib-general] Re: MAD module removal and MAD cache
Sean Hefty
mshefty at ichips.intel.com
Fri May 13 12:31:15 PDT 2005
Hal Rosenstock wrote:
> Hi Sean,
>
> The MAD send code can place a send request on the send list, the QP's
> send queue, or overflow list.
>
> On port close, I see the receive queue being cleaned up but not these.
>
> Am I missing something? If I'm not, then this may be the MAD cache leak
> which causes a problem when running SM, killing it, and trying to remove
> the modules.

When a MAD is sent, a reference is taken on the mad_agent. Cleanup of these
lists should occur automatically as each send operation completes. We need to
be careful about trying to perform cleanup during mad_agent destruction,
because the MAD may still be posted to the QP. If it is, we need to wait
until we see its completion.
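
To illustrate the reference rule, here is a minimal userspace sketch. It is
not the actual MAD code: struct model_agent and the model_* functions are
hypothetical names, and a plain counter stands in for whatever reference
counting the mad_agent really uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an agent; not the real mad_agent structure. */
struct model_agent {
    int sends_in_flight;   /* one reference per send that has not completed */
};

static void model_agent_init(struct model_agent *agent)
{
    agent->sends_in_flight = 0;
}

/* Sending takes a reference on the agent. */
static void model_send(struct model_agent *agent)
{
    agent->sends_in_flight++;
}

/* The send completion drops the reference taken by model_send(). */
static void model_send_complete(struct model_agent *agent)
{
    assert(agent->sends_in_flight > 0);
    agent->sends_in_flight--;
}

/* Destruction may only finish once every send has completed. */
static bool model_destroy_may_finish(const struct model_agent *agent)
{
    return agent->sends_in_flight == 0;
}
```

In this model, destruction has to wait for as long as a send holds its
reference, which is the case for a MAD that is still posted to the QP.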

When a mad_agent is destroyed, all outstanding MADs on its send_list are
canceled. This completes all MADs that are waiting for a response, but those
on the QP's send_queue or overflow_list still need to wait for a completion.

If the mad_agent destruction runs to completion, then it sounds like these
lists are already empty. To verify, you could add a check that the
mad_agent's send_list is empty before deregister returns.
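
As a rough sketch of that check, the toy model below tracks each send by
state instead of using real lists. Everything in it is an assumption made
for illustration (the enum, struct model_agent, and the function names);
it only mirrors the behavior described above, not the real MAD module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states standing in for the lists a send can sit on. */
enum send_state {
    SEND_DONE,            /* completed, no longer tracked */
    SEND_WAIT_RESPONSE,   /* sent, waiting for a response */
    SEND_ON_QP,           /* posted to the QP's send_queue */
    SEND_OVERFLOW         /* parked on the overflow_list */
};

#define MAX_SENDS 8

struct model_agent {
    enum send_state sends[MAX_SENDS];
    size_t count;
};

/* Cancel on destroy: only sends waiting for a response complete here. */
static size_t cancel_waiting(struct model_agent *agent)
{
    size_t canceled = 0;

    assert(agent->count <= MAX_SENDS);
    for (size_t i = 0; i < agent->count; i++) {
        if (agent->sends[i] == SEND_WAIT_RESPONSE) {
            agent->sends[i] = SEND_DONE;
            canceled++;
        }
    }
    return canceled;
}

/* One hardware completion: finish a posted send, then post one overflow
 * entry in its place. Returns 1 if a send completed, 0 otherwise. */
static int hw_complete_one(struct model_agent *agent)
{
    for (size_t i = 0; i < agent->count; i++) {
        if (agent->sends[i] != SEND_ON_QP)
            continue;
        agent->sends[i] = SEND_DONE;
        for (size_t j = 0; j < agent->count; j++) {
            if (agent->sends[j] == SEND_OVERFLOW) {
                agent->sends[j] = SEND_ON_QP;
                break;
            }
        }
        return 1;
    }
    return 0;
}

/* The suggested check: nothing may be outstanding before deregister returns. */
static int send_list_empty(const struct model_agent *agent)
{
    for (size_t i = 0; i < agent->count; i++) {
        if (agent->sends[i] != SEND_DONE)
            return 0;
    }
    return 1;
}
```

In the model, canceling alone does not empty the list when sends are posted
or queued; it takes one completion per such send before the check passes.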
- Sean