[ofa-general] crash in ipoib

Sean Hefty mshefty at ichips.intel.com
Thu Jun 14 10:47:08 PDT 2007


> Where does :ib_ipoib:ipoib_cm_handle_rx_wc+378 point to on your system?

It points to list_move below:

	if (!likely(wr_id & IPOIB_CM_RX_UPDATE_MASK)) {
		p = wc->qp->qp_context;
		if (p && time_after_eq(jiffies, p->jiffies +
					IPOIB_CM_RX_UPDATE_TIME)) {
			spin_lock_irqsave(&priv->lock, flags);
			p->jiffies = jiffies;
			/* Move this entry to list head, but do not
			   re-add it
			 * if it has been moved out of list. */
			if (p->state == IPOIB_CM_RX_LIVE)
 >>>				list_move(&p->list,
					  priv->cm.passive_ids);
			spin_unlock_irqrestore(&priv->lock, flags);
		}
	}

There appears to be a race in ipoib_cm_req_handler() setting the 
ipoib_cm_rx state outside of a lock, and before the item it added to a 
list.  I think this could cause list_move() call above to oops.  I think 
ipoib_cm_req_handler() needs changes, but I'm not sure if this is enough 
(patch below has line wrap issues...):

@@ -291,16 +291,16 @@ static int ipoib_cm_req_handler(struct ib_cm_id 
*cm_id, st
         if (ret)
                 goto err_modify;

+       cm_id->context = p;
         ret = ipoib_cm_send_rep(dev, cm_id, p->qp,
				&event->param.req_rcvd, psn);
         if (ret) {
                 ipoib_warn(priv, "failed to send REP: %d\n", ret);
                 goto err_rep;
         }

-       cm_id->context = p;
         p->jiffies = jiffies;
-       p->state = IPOIB_CM_RX_LIVE;
         spin_lock_irq(&priv->lock);
+       p->state = IPOIB_CM_RX_LIVE;
         if (list_empty(&priv->cm.passive_ids))
                 queue_delayed_work(ipoib_workqueue,
                                    &priv->cm.stale_task,
				   IPOIB_CM_RX_DELAY);

- Sean



More information about the general mailing list