[openib-general] Kernel Oops related to IPoIB (multicast module?)

Sean Hefty mshefty at ichips.intel.com
Tue Jun 27 12:36:30 PDT 2006


Jack Morgenstein wrote:
> Evidently, ipoib was still attempting to connect with an SA, when the ipoib
> module was unloaded (modprobe -r). After the ipoib module was unloaded (or at
> least rendered inaccessible), the ib_sa module attempted to invoke 
> "ib_sa_mcmember_rec_callback" (for a callback address that was part of the
> unloaded ipoib module).  Hence, the Oops below.

I still haven't been able to reproduce this, but I _think_ I understand what's 
likely happening.

The SA query interface always invokes a callback, regardless of whether the call 
succeeds.  So if a call to ib_sa_mcmember_rec_set() fails (which happens in this 
case because the SM is down), the user's callback is still invoked.  The 
multicast module is coded assuming that an immediate failure does not result in 
a callback, so the callback is unexpected, which throws off the reference 
counting.
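
To make the suspected mismatch concrete, here is a rough user-space sketch in C. 
It is not the ib_sa or multicast module code: every name in it (struct mcast, 
sa_query_always_calls_back, join_done, join_buggy, join_fixed) is hypothetical, 
and the query completes inline purely to keep the example short. It only models 
the pattern described above, where the callback and the error path each drop the 
same reference.

```c
#include <assert.h>

/* Hypothetical stand-in for a multicast group; not a real kernel structure. */
struct mcast {
    int refcount;   /* references currently held on the group */
};

typedef void (*query_cb)(int status, void *context);

/*
 * Models a query interface that always invokes the callback, even when the
 * request fails immediately (for example because no SM is reachable).
 * Assumption: completion is delivered inline to keep the sketch small.
 */
static int sa_query_always_calls_back(int sm_is_up, query_cb cb, void *context)
{
    if (!sm_is_up) {
        cb(-1, context);    /* callback still runs on immediate failure */
        return -1;
    }
    cb(0, context);
    return 0;
}

/* Completion handler: drops the reference taken for the pending query. */
static void join_done(int status, void *context)
{
    struct mcast *m = context;

    (void)status;
    assert(m->refcount > 0);    /* would catch an underflow */
    m->refcount--;
}

/* Caller that assumes an immediate failure means no callback will run. */
static int join_buggy(struct mcast *m, int sm_is_up)
{
    m->refcount++;                      /* reference for the query */
    if (sa_query_always_calls_back(sm_is_up, join_done, m) < 0)
        m->refcount--;                  /* second put for the same reference */
    return m->refcount;
}

/* Caller that leaves the put to the callback on every path. */
static int join_fixed(struct mcast *m, int sm_is_up)
{
    m->refcount++;
    sa_query_always_calls_back(sm_is_up, join_done, m);
    return m->refcount;
}
```

Starting from a refcount of 1 with the SM down, join_buggy() ends at 0 while 
join_fixed() ends at 1. In the real modules a count that reaches zero too early 
could let the caller be torn down while a callback is still pending, which 
would be consistent with the Oops reported above, though that part is a guess.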

I should have a patch for this shortly, but since I can't reproduce the problem, 
my testing of it will be limited.

- Sean



