[openib-general] [PATCH (updated)] ipoib: dont lock tx on completion

Michael S. Tsirkin mst at mellanox.co.il
Wed Feb 9 10:41:32 PST 2005


Quoting r. Roland Dreier <roland at topspin.com>:
> Subject: Re: [PATCH] ipoib: dont lock tx on completion
> 
>     Michael> By testing again, and waking the net queue if completions
>     Michael> arrived, we can avoid taking the tx lock on send
>     Michael> completion in ip over ib, reducing the lock contention.
> 
> I really like the clever way of retesting after putting the queue to
> sleep.  However, I can't convince myself that some test or other won't
> end up wrong because of stale values of tx_head or tx_tail, and I'm always
> scared of code that requires multiple barriers and so on.  So I'm
> going to defer applying this until I have time to think hard and make
> sure there aren't any subtle races.
> 
>  - R.
> 

Hmm, stared at it some more and I think I see what you mean.
Here's another attempt:

By testing again and waking the net queue if completions have
arrived, we can avoid taking the tx lock on send completion
in IP over IB, reducing lock contention.

An exception is when the net queue is stopped: in that case
we have to look at both the head and tail values, so it's easier
to take the lock than to worry about SMP issues.

Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>

Index: ulp/ipoib/ipoib_ib.c
===================================================================
--- ulp/ipoib/ipoib_ib.c	(revision 1747)
+++ ulp/ipoib/ipoib_ib.c	(working copy)
@@ -225,7 +225,6 @@ static void ipoib_ib_handle_wc(struct ne
 
 	} else {
 		struct ipoib_buf *tx_req;
-		unsigned long flags;
 
 		if (wr_id >= IPOIB_TX_RING_SIZE) {
 			ipoib_warn(priv, "completion event with wrid %d (> %d)\n",
@@ -247,12 +246,19 @@ static void ipoib_ib_handle_wc(struct ne
 
 		dev_kfree_skb_any(tx_req->skb);
 
-		spin_lock_irqsave(&priv->tx_lock, flags);
 		++priv->tx_tail;
-		if (netif_queue_stopped(dev) &&
-		    priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2)
-			netif_wake_queue(dev);
-		spin_unlock_irqrestore(&priv->tx_lock, flags);
+
+		/* Make sure the send routine will see
+		   the tail updated. */
+		wmb();
+
+		if (netif_queue_stopped(dev)) {
+			unsigned long flags;
+			spin_lock_irqsave(&priv->tx_lock, flags);
+			if (priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2)
+				netif_wake_queue(dev);
+			spin_unlock_irqrestore(&priv->tx_lock, flags);
+		}
 
 		if (wc->status != IB_WC_SUCCESS &&
 		    wc->status != IB_WC_WR_FLUSH_ERR)
@@ -341,6 +347,14 @@ void ipoib_send(struct net_device *dev, 
 		if (priv->tx_head - priv->tx_tail == IPOIB_TX_RING_SIZE) {
 			ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n");
 			netif_stop_queue(dev);
+
 +			/* Make sure the completion routine will see
 +			   the queue stopped. */
+			wmb();
+
+			/* Meanwhile, a completion may have updated tx_tail. */
+			if (priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2)
+				netif_wake_queue(dev);
 		}
 	}
 }
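
For anyone who wants to play with the locking pattern outside the
kernel, here is a minimal standalone sketch of it in userspace C.
This is not the IPoIB code: the ring size, the thread setup and the
"stopped" flag standing in for netif_queue_stopped()/netif_wake_queue()
are invented for illustration, and C11 seq_cst atomics take the place
of the wmb() barriers used in the patch.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define RING_SIZE	64		/* stand-in for IPOIB_TX_RING_SIZE */
#define TOTAL		100000		/* packets to "send" in the demo   */

static atomic_uint tx_head;		/* advanced only by the sender     */
static atomic_uint tx_tail;		/* advanced only by the completer  */
static atomic_bool stopped;		/* stand-in for the stopped queue  */
static pthread_mutex_t tx_lock = PTHREAD_MUTEX_INITIALIZER;

/* Completion path: no lock in the common case. */
static void completion(void)
{
	/* ++priv->tx_tail; seq_cst, so the new tail is published
	 * before we look at the "stopped" flag. */
	atomic_fetch_add(&tx_tail, 1);

	if (atomic_load(&stopped)) {
		/* Rare case: the sender stopped the queue.  Take the lock
		 * and re-check head - tail under it before waking. */
		pthread_mutex_lock(&tx_lock);
		if (atomic_load(&tx_head) - atomic_load(&tx_tail) <= RING_SIZE / 2)
			atomic_store(&stopped, false);	/* "netif_wake_queue" */
		pthread_mutex_unlock(&tx_lock);
	}
}

/* Send path: called with tx_lock held, like ipoib_send(). */
static void send_one(void)
{
	atomic_fetch_add(&tx_head, 1);		/* post one descriptor */

	if (atomic_load(&tx_head) - atomic_load(&tx_tail) == RING_SIZE) {
		atomic_store(&stopped, true);	/* "netif_stop_queue" */

		/* The seq_cst ordering of the store above and the loads
		 * below plays the role of the wmb()s in the patch: the
		 * completer either sees "stopped", or we see its new tail. */
		if (atomic_load(&tx_head) - atomic_load(&tx_tail) <= RING_SIZE / 2)
			atomic_store(&stopped, false);	/* completion raced us */
	}
}

static void *sender(void *arg)
{
	(void)arg;
	for (int i = 0; i < TOTAL; i++) {
		while (atomic_load(&stopped))
			;			/* queue stopped, wait for wake */
		pthread_mutex_lock(&tx_lock);
		send_one();
		pthread_mutex_unlock(&tx_lock);
	}
	return NULL;
}

static void *completer(void *arg)
{
	(void)arg;
	for (int i = 0; i < TOTAL; i++) {
		while (atomic_load(&tx_tail) == atomic_load(&tx_head))
			;			/* nothing outstanding yet */
		completion();
	}
	return NULL;
}

int main(void)
{
	pthread_t s, c;

	pthread_create(&s, NULL, sender, NULL);
	pthread_create(&c, NULL, completer, NULL);
	pthread_join(s, NULL);
	pthread_join(c, NULL);

	printf("done: head=%u tail=%u stopped=%d\n",
	       atomic_load(&tx_head), atomic_load(&tx_tail),
	       (int)atomic_load(&stopped));
	return 0;
}

The point is the same as in the patch: the completer touches tx_lock
only in the rare case where the sender has already stopped the queue,
and the stop-then-recheck on the send side guarantees that a wake-up
can't be lost even if a completion slips in between the two steps.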

-- 
MST - Michael S. Tsirkin


