[openib-general] [PATCH] use mmiowb after doorbell ring

Roland Dreier rdreier at cisco.com
Sun Oct 15 08:48:21 PDT 2006


 > Apparently this is because writes to the doorbells from
 > different CPUs are clobbering one another. The following
 > patch adds mmiowb() calls after doorbell rings to ensure
 > the doorbell register updates are ordered.

Makes sense.  I was wondering if there would be any problems like this
after John's message...

 > We discovered a problem when running IPoIB applications on
 > multiple CPUs on an Altix system. Many messages such as:
 > 
 > ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)
 > 
 > appear in syslog, and the driver wedges up.

That is a somewhat weird symptom, though.  I can imagine that
out-of-order doorbells cause extra completions or something like that,
which in turn lets IPoIB overrun the send queue.

Adding the mmiowb()s definitely fixes things?

 > Signed-off-by: <akepner at sgi.com>

Should this be

Signed-off-by: Arthur Kepner <akepner at sgi.com>

actually?  (I just looked through the kernel git log to guess your name)

 > @@ -1730,6 +1732,9 @@ out:
 >   		mthca_write64(doorbell,
 >   			      dev->kar + MTHCA_SEND_DOORBELL,
 >   			      MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
 > +		/* use mmiowb to ensure write to doorbell is ordered 
 > +		 * before releasing spinlock */
 > +		mmiowb();
 >   	}
 > 
 >   	qp->sq.next_ind = ind;

Any reason why this mmiowb() is placed slightly differently from the
others (which are right before the spin_unlock)?
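
To be concrete about the placement I mean, here is a rough user-space
sketch of the pattern (lock, doorbell write, mmiowb(), unlock).  This
is not the mthca code: the stub_* functions, fake_doorbell_reg and
ring_doorbell() are made-up stand-ins for spin_lock(), the MMIO
doorbell write, mmiowb() and spin_unlock(), and all they do is log the
order in which they are called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model only: every stub below is a hypothetical stand-in
 * for a kernel primitive and merely records that it was called. */
enum op { OP_LOCK, OP_DOORBELL_WRITE, OP_MMIOWB, OP_UNLOCK };

#define MAX_OPS 16
static enum op op_log[MAX_OPS];
static size_t op_count;

static void record(enum op o)
{
	assert(op_count < MAX_OPS);
	op_log[op_count++] = o;
}

/* Stand-in for the device doorbell register (assumption, not real MMIO). */
static uint64_t fake_doorbell_reg;

static void stub_spin_lock(void)   { record(OP_LOCK); }
static void stub_mmiowb(void)      { record(OP_MMIOWB); }
static void stub_spin_unlock(void) { record(OP_UNLOCK); }

static void stub_doorbell_write(uint64_t *reg, uint64_t val)
{
	*reg = val;
	record(OP_DOORBELL_WRITE);
}

/* The pattern under discussion: the barrier sits after the doorbell
 * write and before the unlock, so the write is pushed out before
 * another CPU can take the lock and ring the doorbell itself. */
void ring_doorbell(uint64_t val)
{
	stub_spin_lock();
	stub_doorbell_write(&fake_doorbell_reg, val);
	stub_mmiowb();
	stub_spin_unlock();
}
```

Calling ring_doorbell() once leaves four entries in op_log, with the
barrier entry immediately before the unlock entry.  Either placement
in the patch keeps that relative order as long as no other MMIO write
happens between the mmiowb() and the spin_unlock(), which is why I am
asking rather than objecting.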

Thanks,
  Roland



