[openib-general] Re: MAD module removal and MAD cache

Sean Hefty mshefty at ichips.intel.com
Fri May 13 12:31:15 PDT 2005


Hal Rosenstock wrote:
> Hi Sean,
> 
> The MAD send code can place a send request on the send list, the QP's
> send queue, or overflow list. 
> 
> On port close, I see the receive queue being cleaned up but not these.
> 
> Am I missing something? If I'm not, then this may be the MAD cache leak
> which causes a problem when running SM, killing it, and trying to remove
> the modules.

When a MAD is sent, a reference is taken on the mad_agent.  Cleanup of these 
lists should occur automatically as a send operation completes.  We need to 
be careful about trying to perform cleanup during mad_agent destruction, 
since the MAD may be posted to the QP.  If it is, we need to wait until we 
see its completion.
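
The reference-counting rule above can be sketched as a small user-space model. This is only an illustration: struct model_agent and the functions below are hypothetical names, not the actual ib_mad code, and the model marks the agent destroyed when the last reference drops instead of blocking in deregister as a kernel implementation would.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's MAD structures. */
struct model_agent {
    int refcount;    /* 1 for the registration, plus 1 per in-flight send */
    bool destroyed;  /* set when the last reference is dropped */
};

static void agent_init(struct model_agent *a)
{
    a->refcount = 1;
    a->destroyed = false;
}

/* Drop one reference; the agent goes away only when none remain. */
static void agent_put(struct model_agent *a)
{
    assert(a->refcount > 0);
    if (--a->refcount == 0)
        a->destroyed = true;
}

/* Sending a MAD takes a reference on the agent. */
static void send_post(struct model_agent *a)
{
    a->refcount++;
}

/* The send completion releases the reference taken at post time. */
static void send_complete(struct model_agent *a)
{
    agent_put(a);
}

/* Deregistration drops the registration reference. Teardown finishes
 * only after every outstanding send has completed. */
static void agent_deregister(struct model_agent *a)
{
    agent_put(a);
}
```

In this model, calling agent_deregister() while a send is still posted leaves the agent alive until send_complete() runs, which is the ordering we need to preserve.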

When a mad_agent is destroyed, all outstanding MADs on its send_list are 
canceled.  This will complete all MADs waiting for a response, but those on 
the QP's send_queue or overflow_list still need to wait for a completion.

If the mad_agent destruction is proceeding, then it sounds like these lists 
are empty.  To verify, you could add a check that the mad_agent's send_list 
is empty before deregister returns.
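
As a rough illustration of the cancel behavior and the suggested check, here is a user-space model. All names (struct model_send, cancel_send_list, send_list_empty, and so on) are hypothetical and stand in for the real lists; in the kernel the check itself would presumably be a list_empty() test on the agent's send_list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a send request currently lives (hypothetical model). */
enum send_loc {
    LOC_SEND_LIST,      /* agent's send_list, waiting for a response */
    LOC_QP_SEND_QUEUE,  /* posted to the QP */
    LOC_OVERFLOW_LIST,  /* waiting for room on the QP */
    LOC_COMPLETED
};

struct model_send {
    enum send_loc loc;
};

/* Count the sends currently at a given location. */
static size_t count_at(const struct model_send *s, size_t n, enum send_loc loc)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (s[i].loc == loc)
            count++;
    return count;
}

/* Cancel on agent destruction: completes only the sends on send_list.
 * Sends on the QP's send queue or overflow list are left untouched. */
static void cancel_send_list(struct model_send *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (s[i].loc == LOC_SEND_LIST)
            s[i].loc = LOC_COMPLETED;
}

/* A completion arrives for a send that was posted or queued. */
static void send_completion(struct model_send *s)
{
    s->loc = LOC_COMPLETED;
}

/* The suggested check: send_list must be empty before deregister returns. */
static bool send_list_empty(const struct model_send *s, size_t n)
{
    return count_at(s, n, LOC_SEND_LIST) == 0;
}

/* Stronger check: nothing is left on any of the three lists. */
static bool all_sends_done(const struct model_send *s, size_t n)
{
    return count_at(s, n, LOC_COMPLETED) == n;
}
```

After cancel_send_list(), send_list_empty() holds, but all_sends_done() stays false until send_completion() has run for each request on the QP's send queue or overflow list.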

- Sean


