[ofa-general] [RFC] ipoib: avoid using stale ipoib_neigh* in ipoib_neigh_cleanup()

Or Gerlitz ogerlitz at Voltaire.com
Tue Jun 2 23:46:35 PDT 2009


Roland Dreier wrote:
> I think it would be better to understand what ipoib_neigh_cleanup() is
> racing with.  eg if CM TX completion handling is the problem, then maybe
> the right fix is to take a reference on the neigh when establishing a
> connection and drop the ref when closing the connection.

Or avoid accessing the neighbour in the CM TX completion path altogether. Looking at the code, I don't really see what accessing the neighbour from the CQ handler buys the CM design. Does anyone have an idea? Also, Arthur, again: does the race happen in datagram mode as well?

Or.


	if (wc->status != IB_WC_SUCCESS &&
	    wc->status != IB_WC_WR_FLUSH_ERR) {
		struct ipoib_neigh *neigh;

		ipoib_dbg(priv, "failed cm send event "
			   "(status=%d, wrid=%d vend_err %x)\n",
			   wc->status, wr_id, wc->vendor_err);

		spin_lock_irqsave(&priv->lock, flags);
		neigh = tx->neigh;

		if (neigh) {
			neigh->cm = NULL;
			list_del(&neigh->list);
			if (neigh->ah)
				ipoib_put_ah(neigh->ah);
			ipoib_neigh_free(dev, neigh);

			tx->neigh = NULL;
		}

		if (test_and_clear_bit(IPOIB_FLAG_INITIALIZED, &tx->flags)) {
			list_move(&tx->list, &priv->cm.reap_list);
			queue_work(ipoib_workqueue, &priv->cm.reap_task);
		}

		clear_bit(IPOIB_FLAG_OPER_UP, &tx->flags);

		spin_unlock_irqrestore(&priv->lock, flags);
	}
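
For comparison, the alternative Roland describes in the quoted text (take a reference on the neigh when the connection is established, drop it when the connection is closed) could look roughly like the following. This is only a userspace model to illustrate the lifetime rule, not driver code: struct neigh_model, struct tx_model, and every function name here are hypothetical, the counter is a plain int on the assumption that callers serialize access (as priv->lock would in the driver), and "freeing" is modelled by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a neighbour entry; NOT struct ipoib_neigh. */
struct neigh_model {
	int refcnt;	/* assumed to be protected by an external lock */
	int freed;	/* set when the last reference is dropped */
};

/* Hypothetical stand-in for a CM TX context that points at a neigh. */
struct tx_model {
	struct neigh_model *neigh;
};

static void neigh_get(struct neigh_model *n)
{
	n->refcnt++;
}

static void neigh_put(struct neigh_model *n)
{
	assert(n->refcnt > 0);
	if (--n->refcnt == 0)
		n->freed = 1;	/* a real implementation would free here */
}

/* Connection establishment: the TX context pins the neigh. */
static void tx_connect(struct tx_model *tx, struct neigh_model *n)
{
	neigh_get(n);
	tx->neigh = n;
}

/* Connection teardown: the TX context drops its pin. */
static void tx_close(struct tx_model *tx)
{
	if (tx->neigh) {
		neigh_put(tx->neigh);
		tx->neigh = NULL;
	}
}

/* Models the cleanup path: it drops only the owner's reference. */
static void neigh_cleanup(struct neigh_model *n)
{
	neigh_put(n);
}
```

With this rule, a cleanup that runs while a connection is still open only drops the count from 2 to 1, so tx->neigh stays valid until tx_close() releases the last reference.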


