[ofa-general] IPoIB kernel Oops -- race condition
Jack Morgenstein
jackm at dev.mellanox.co.il
Sun Jun 28 11:18:22 PDT 2009
On Sunday 28 June 2009 19:09, Moni Shoua wrote:
> maybe synchronizing the race with a completion var (like IPoIB does in struct ipoib_path) will help. I think this will work. I can send a patch if you want unless you see this idea doesn't work for this case.
>
> MoniS
I just looked at the ipoib_path struct. I assume that you mean the "valid" field. I do not think this will work here.
We need to wait when doing "ifconfig ib0 down", before testing the busy flag. The window is in the join code:
*** ipoib_mcast_leave() needs to wait from this point on, before testing FLAG_BUSY ***
    set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
    mcast->mc = ib_sa_join_multicast(&ipoib_sa_client, priv->ca, priv->port,
                                     &rec, comp_mask, GFP_KERNEL,
                                     ipoib_mcast_join_complete, mcast);
    if (IS_ERR(mcast->mc)) {
        clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
        *** Need to wait until this point -- the above set_bit/clear_bit pair needs to be atomic ***
        ....
    } else {
        *** Wait until this point in ipoib_mcast_leave() so that the busy flag is set,
            and mcast->mc is successfully assigned -- i.e., non-null ***
    }
In ipoib_mcast_leave():
    *** NEED TO WAIT HERE BEFORE CONTINUING, so that either BUSY is cleared (mcast->mc is in error),
    *** or the BUSY flag is set and mcast->mc is a valid, non-NULL pointer ***
    if (test_and_clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
        ib_sa_free_multicast(mcast->mc);
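
To make the required ordering concrete, here is a rough userspace sketch of the wait marked above. It is only a model, not driver code: struct mcast_model, struct completion_model and the model_*() helpers are hypothetical names, a pthread mutex and condition variable stand in for a kernel completion, a NULL handle stands in for the IS_ERR() case, and the asynchronous join callback is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a kernel completion (illustrative only). */
struct completion_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical model of the multicast state discussed in this thread. */
struct mcast_model {
	bool busy;                          /* models IPOIB_MCAST_FLAG_BUSY */
	void *mc;                           /* models mcast->mc */
	int freed;                          /* how many times the free path ran */
	struct completion_model join_done;
};

static void model_init(struct mcast_model *m)
{
	m->busy = false;
	m->mc = NULL;
	m->freed = 0;
	pthread_mutex_init(&m->join_done.lock, NULL);
	pthread_cond_init(&m->join_done.cond, NULL);
	m->join_done.done = true;           /* no join in flight yet */
}

static void model_complete(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void model_wait(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Join path: 'handle' is what the join call returned; NULL models an error. */
static void model_join(struct mcast_model *m, void *handle)
{
	pthread_mutex_lock(&m->join_done.lock);
	m->join_done.done = false;          /* window opens: leave must wait */
	pthread_mutex_unlock(&m->join_done.lock);

	m->busy = true;                     /* set_bit(BUSY) */
	m->mc = handle;                     /* mcast->mc = join(...) */
	if (m->mc == NULL)
		m->busy = false;            /* error path: clear_bit(BUSY) */

	model_complete(&m->join_done);      /* BUSY and mc are consistent now */
}

/* Leave path: returns true if it released a handle. */
static bool model_leave(struct mcast_model *m)
{
	model_wait(&m->join_done);          /* the wait asked for above */
	if (!m->busy)
		return false;               /* join failed or never happened */
	m->busy = false;                    /* test_and_clear_bit(BUSY) */
	assert(m->mc != NULL);              /* BUSY set implies a valid handle */
	m->mc = NULL;                       /* stands in for the free call */
	m->freed++;
	return true;
}
```

With model_join() on one thread and model_leave() on another, the leave side can only see the two states listed above: BUSY clear, or BUSY set with a non-NULL handle. The sketch does not cover a join that starts after the leave has already passed its wait; that case would still need separate handling.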
- Jack