[openib-general] [PATCH] use mmiowb after doorbell ring
Roland Dreier
rdreier at cisco.com
Sun Oct 15 08:48:21 PDT 2006
> Apparently this is because writes to the doorbells from
> different CPUs are clobbering one another. The following
> patch adds mmiowb() calls after doorbell rings to ensure
> the doorbell register updates are ordered.
Makes sense. I was wondering if there would be any problems like this
after John's message...
> We discovered a problem when running IPoIB applications on
> multiple CPUs on an Altix system. Many messages such as:
>
> ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)
>
> appear in syslog, and the driver wedges up.
This is a somewhat odd symptom, though. I can imagine that
out-of-order doorbells cause extra completions or something like that,
which in turn leads IPoIB to overrun the send queue.
Adding the mmiowb()s definitely fixes things?
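
For reference, the head/tail numbers in the quoted message can be checked
with a small sketch. It assumes the queue occupancy is computed as an
unsigned "head - tail" and compared against max; the struct and function
names here (wq_sketch, wq_overflow_sketch) are hypothetical and simplified,
not the driver's actual code.

```c
/* Hypothetical, simplified stand-in for a driver work queue. */
struct wq_sketch {
	unsigned head;	/* advanced as work requests are posted */
	unsigned tail;	/* advanced as completions are reaped */
	unsigned max;	/* queue capacity */
};

/* Returns nonzero if posting nreq more requests would overflow the queue. */
static int wq_overflow_sketch(const struct wq_sketch *wq, unsigned nreq)
{
	/* Unsigned subtraction: wraps to a huge value if tail > head. */
	unsigned cur = wq->head - wq->tail;

	return cur + nreq >= wq->max;
}
```

With head = 19941644, tail = 19941707 and max = 64, the tail is 63 entries
ahead of the head, so the unsigned difference wraps and the queue looks
full even with nreq = 0. Under this assumption, that would be consistent
with more completions being counted than requests posted.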
> Signed-off-by: <akepner at sgi.com>
Should this be
Signed-off-by: Arthur Kepner <akepner at sgi.com>
actually? (I just looked through the kernel git log to guess your name)
> @@ -1730,6 +1732,9 @@ out:
>  		mthca_write64(doorbell,
>  			      dev->kar + MTHCA_SEND_DOORBELL,
>  			      MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
> +		/* use mmiowb to ensure write to doorbell is ordered
> +		 * before releasing spinlock */
> +		mmiowb();
>  	}
>
>  	qp->sq.next_ind = ind;
Any reason why this mmiowb() is placed slightly differently from the
others (which are right before the spin_unlock)?
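
The ordering constraint under discussion is that the doorbell write must be
followed by mmiowb() somewhere before the lock is dropped. A minimal
userspace sketch of that pattern follows. Everything in it is a stand-in
and an assumption: lock_sketch replaces the kernel spinlock, fake_doorbell
replaces the MMIO register, and mmiowb_sketch() is only a placeholder
barrier built on a GCC/Clang builtin, not the kernel's mmiowb().

```c
/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static volatile int lock_sketch;			/* mock spinlock */
static volatile unsigned long long fake_doorbell;	/* mock MMIO register */

static void lock_acquire_sketch(void)
{
	/* Atomically set the lock word; spin while it was already held. */
	while (__sync_lock_test_and_set(&lock_sketch, 1))
		;
}

static void lock_release_sketch(void)
{
	__sync_lock_release(&lock_sketch);
}

static void mmiowb_sketch(void)
{
	/* Placeholder full barrier. The real mmiowb() orders MMIO writes
	 * made under a spinlock against the following unlock. */
	__sync_synchronize();
}

static void ring_doorbell_sketch(unsigned long long val)
{
	lock_acquire_sketch();
	fake_doorbell = val;	/* the doorbell write */
	mmiowb_sketch();	/* after the write, before the unlock */
	lock_release_sketch();
}
```

In this sketch the barrier sits immediately before the unlock. Placing it
directly after the write, as in the hunk quoted above, satisfies the same
"after the write, before the unlock" constraint as long as no further
doorbell writes follow inside the critical section.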
Thanks,
Roland