[ofa-general] IPoIB kernel Oops -- race condition

Jack Morgenstein jackm at dev.mellanox.co.il
Tue Jun 30 07:26:48 PDT 2009


On Tuesday 30 June 2009 16:27, Yossi Etigin wrote:
> 
> What do you think about renaming the mcast_task to mcast_join_task 
> and multicast_list to mcast_join_list? It will make the purpose and
> the analogy between the two more obvious.

I'll do that.
> >  static void __exit ipoib_cleanup_module(void)
> >  {
> >  	ib_unregister_client(&ipoib_client);
> > +	flush_workqueue(ipoib_workqueue);
> >  	ib_sa_unregister_client(&ipoib_sa_client);
> >  	ipoib_unregister_debugfs();
> >  	destroy_workqueue(ipoib_workqueue);
> 
> This is already done in ipoib_remove_one().
Correct, I will remove the flush.

> > --- is4_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2009-06-30 09:44:48.955963000 +0300
> > +++ is4_kernel/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2009-06-30 09:45:44.990941000 +0300
> > @@ -657,6 +657,31 @@ static int ipoib_mcast_leave(struct net_
> >  	return 0;
> >  }
> >  
> > +void ipoib_mcast_remove_task(struct work_struct *work)
> > +{
> > +	struct ipoib_dev_priv *priv =
> > +		container_of(work, struct ipoib_dev_priv, mcast_remove_task);
> > +	struct net_device *dev = priv->dev;
> > +
> > +	LIST_HEAD(remove_list);
> > +	struct ipoib_mcast *mcast, *tmcast;
> > +	unsigned long flags;
> > +
> > +	ipoib_dbg_mcast(priv, "flushing multicast list\n");
> > +
> > +	spin_lock_irqsave(&priv->lock, flags);
> > +	list_for_each_entry_safe(mcast, tmcast, &priv->mcast_remove_list, list) {
> > +		list_del(&mcast->list);
> > +		list_add_tail(&mcast->list, &remove_list);
> > +	}
> > +	spin_unlock_irqrestore(&priv->lock, flags);
> 
> I think you could use list_splice() here.
I'll look into that.
 
> > @@ -864,11 +884,7 @@ void ipoib_mcast_restart_task(struct wor
> >  	netif_addr_unlock(dev);
> >  	local_irq_restore(flags);
> >  
> > -	/* We have to cancel outside of the spinlock */
> > -	list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
> > -		ipoib_mcast_leave(mcast->dev, mcast);
> > -		ipoib_mcast_free(mcast);
> > -	}
> > +	queue_work(ipoib_workqueue, &priv->mcast_remove_task);
> >  
> >  	if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> >  		ipoib_mcast_start_thread(dev);
> 
> Modifying ipoib_mcast_restart_task() is not really needed since it runs on the WQ
> anyway.
You're right, I was just working on auto-pilot here and automatically made the change.
I will eliminate the change in the restart_task, because there is no reason to complicate
the flow unnecessarily.

> On the other hand, for compatibility (?) with other changes it may be a good 
> thing to do.
> 
> --Yossi
> 



More information about the general mailing list