[openib-general] Re: MAD module removal and MAD cache

Hal Rosenstock halr at voltaire.com
Fri May 13 13:41:14 PDT 2005


On Fri, 2005-05-13 at 15:31, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > Hi Sean,
> > 
> > The MAD send code can place a send request on the send list, the QP's
> > send queue, or overflow list. 
> > 
> > On port close, I see the receive queue being cleaned up but not these.
> > 
> > Am I missing something? If I'm not, then this may be the MAD cache leak
> > which causes a problem when running SM, killing it, and trying to remove
> > the modules.
> 
> When a MAD is sent, a reference is taken on the mad_agent.  Cleanup of these 
> lists should occur automatically as a send operation completes.  We need to 
> be careful about trying to perform cleanup during mad_agent destruction, 
> since the MAD may be posted to the QP.  If it is, we need to wait until we 
> see its completion.
> 
> When a mad_agent is destroyed, all outstanding MADs on its send_list are 
> canceled.  This will complete all MADs waiting for a response, but those on 
> the QP's send_queue or overflow_list still need to wait for a completion.
> 
> If the mad_agent destruction is proceeding, then it sounds like these lists 
> are empty.  To verify, you could add a check that the mad_agent's send_list 
> is empty before deregister returns.

I forgot that was how it worked. I used to know this.

The completions are not occurring in that case:

May 13 16:17:47 localhost kernel: ib_mad: handle_outgoing_dr_smp: MAD 0xce9a4030 allocated
May 13 16:17:47 localhost kernel: ib_mad: handle_outgoing_dr_smp: MAD 0xce9a41c0 allocated
May 13 16:17:47 localhost kernel: ib_mad: handle_outgoing_dr_smp: MAD 0xce9a4350 allocated
May 13 16:17:47 localhost kernel: ib_mad: handle_outgoing_dr_smp: MAD 0xce9a44e0 allocated
May 13 16:17:47 localhost kernel: ib_mad: handle_outgoing_dr_smp: MAD 0xc2ea5e40 allocated

I think this is because handle_outgoing_dr_smp isn't "playing by the
rules" in ib_post_send_mad, so the MADs it allocates are not being
tracked appropriately :-( My bad...
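
To illustrate the kind of leak I suspect, here is a toy userspace model.
It is only a sketch and not the ib_mad code: every toy_* name is made up,
and the "cache" is just a counter. The tracked path takes an agent
reference and is released on completion; the bypass path allocates from
the same cache but never sees a completion, so the cache stays non-empty.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in ib_mad. */
struct toy_cache {
    int outstanding;          /* objects allocated and not yet freed */
};

struct toy_agent {
    int refcount;             /* one reference per tracked send */
    int sends_pending;        /* stands in for the agent's send list */
};

struct toy_mad {
    struct toy_agent *agent;  /* NULL when the send was never tracked */
};

static struct toy_mad *toy_alloc(struct toy_cache *c)
{
    struct toy_mad *m = calloc(1, sizeof(*m));

    if (m)
        c->outstanding++;
    return m;
}

/* Tracked path: take an agent reference and record the pending send. */
static struct toy_mad *toy_post_send(struct toy_cache *c, struct toy_agent *a)
{
    struct toy_mad *m = toy_alloc(c);

    if (!m)
        return NULL;
    m->agent = a;
    a->refcount++;
    a->sends_pending++;
    return m;
}

/* Completion: drop the pending entry and the reference, free the MAD. */
static void toy_send_done(struct toy_cache *c, struct toy_mad *m)
{
    m->agent->sends_pending--;
    m->agent->refcount--;
    free(m);
    c->outstanding--;
}

/* Bypass path: allocates from the cache but is never tracked. */
static struct toy_mad *toy_bypass_send(struct toy_cache *c)
{
    return toy_alloc(c);
}

/* Returns how many MADs the cache still counts after all completions. */
static int toy_demo(void)
{
    struct toy_cache cache = { 0 };
    struct toy_agent agent = { 0, 0 };
    struct toy_mad *tracked = toy_post_send(&cache, &agent);
    struct toy_mad *stray = toy_bypass_send(&cache);
    int leaked;

    if (!tracked || !stray) {
        free(tracked);
        free(stray);
        return -1;
    }

    toy_send_done(&cache, tracked);

    /* The agent looks idle, so destroying it would proceed. */
    assert(agent.refcount == 0 && agent.sends_pending == 0);

    leaked = cache.outstanding;  /* the stray MAD is still counted */
    free(stray);                 /* keep the demo itself from leaking */
    return leaked;
}
```

In this model toy_demo() reports one outstanding object even though the
agent's pending count is zero, which is the same shape as a deregister
that succeeds while the cache still cannot be destroyed.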

-- Hal



