[openib-general] crash in ib_sa_mcmember_rec_callback while probing out ib_sa

Sean Hefty mshefty at ichips.intel.com
Wed Jun 7 15:21:27 PDT 2006


Roland Dreier wrote:
> Looks like the same crash mst saw related to the multicast module
> being unloaded and then having sa call back into it.  One small clue:
> 
>  > esi: f38a5bec   edi: f38a5bf4   ebp: fffffffc   esp: f599be60
> 
> ebp is -4, which is -EINTR.  So this may be a callback from sa_query's
> send_handler() caused by an IB_WC_WR_FLUSH_ERR status.

This makes sense given the call trace.  When ib_sa is unloading, it unregisters 
its mad_agent, which results in canceling all outstanding MADs.
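
For reference, the mapping Roland describes can be sketched roughly as below.
This is a userspace model, not the sa_query source: the enum, both function
names, and the timeout and default cases are assumptions for illustration.  It
only shows why a flushed send would reach the callback as -EINTR, and why that
value reads as fffffffc in a 32-bit register.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the work-completion statuses a send can end with. */
enum wc_status {
    WC_SUCCESS,
    WC_RESP_TIMEOUT_ERR,
    WC_WR_FLUSH_ERR,   /* send was flushed, e.g. because the agent went away */
    WC_OTHER_ERR
};

/*
 * Reinterpret a raw 32-bit register value as a signed int.  The conversion is
 * implementation-defined in C, but yields the two's-complement value on the
 * platforms of interest here.
 */
int reg_as_int(uint32_t reg)
{
    return (int32_t)reg;
}

/* Assumed translation from completion status to the status passed to a callback. */
int callback_status(enum wc_status status)
{
    switch (status) {
    case WC_SUCCESS:
        return 0;
    case WC_RESP_TIMEOUT_ERR:
        return -ETIMEDOUT;
    case WC_WR_FLUSH_ERR:
        return -EINTR;      /* the case suspected in this crash */
    default:
        return -EIO;
    }
}
```

With this model, reg_as_int(0xfffffffc) gives -4, and on Linux EINTR is 4, so
the register value is consistent with a flushed send.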

What doesn't make sense to me is how ib_multicast could have unloaded while 
there were still outstanding SA queries.  Every query holds a reference on an MC 
group until it completes, and every group holds a reference on its port.  The 
module shouldn't unload until all references on all ports have been released.

I removed some code that was intended to speed up cleanup but is unnecessary. 
We can see if that helps, but I'm skeptical.

- Sean



