[openib-general] [PATCH] use mmiowb after doorbell ring
Roland Dreier
rdreier at cisco.com
Mon Oct 16 09:55:11 PDT 2006
akepner> At least with the workload that we used to reproduce this
akepner> bug, yes. (The workload was simply 2 ttcp processes, each
akepner> placed on a different node of an Altix.) Without the
akepner> mmiowb()s things would hang very reliably and very
akepner> quickly (within a second). With the additional mmiowb()
akepner> calls I never observed a problem after tens of minutes.
OK, cool. Sounds convincing to me. BTW -- are there Altix systems
with PCIe? Have you tested the mthca_arbel_xxx (mem-free PCIe HCA)
changes, or just the mthca_tavor_xxx (PCI-X HCA) parts?
akepner> I wanted to put it in the "if (likely(nreq))" block so
akepner> that we don't do the mmiowb() unless it's really
akepner> necessary. A very minor optimization (but a co-worker
akepner> reports that it does produce a small but measurable
akepner> performance improvement).
I see -- the other mmiowb()s are next to the spin_unlock()s elsewhere
because the other routines might ring doorbells during the loop if
someone passes in a ton of work requests, right? (All the mmiowb()s
look necessary to me, but I'm just curious about the level of testing.)
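To make the two placements concrete, here is a minimal userspace sketch of
the ordering being discussed. It is not the mthca code: every name in it
(mock_spin_lock, mock_doorbell, mock_mmiowb, post_once, post_batched,
trace_is_ordered) is a hypothetical stand-in that only records events, so it
shows where the barrier sits relative to the doorbell and the unlock, not
what the hardware does.

```c
/* Hypothetical stand-ins: these only log events. They are NOT the
 * kernel's spin_lock()/mmiowb() or any real mthca function. */
enum event { EV_LOCK, EV_DOORBELL, EV_MMIOWB, EV_UNLOCK };

#define MAX_EVENTS 64
static enum event trace[MAX_EVENTS];
static int trace_len;

static void record(enum event ev)
{
	if (trace_len < MAX_EVENTS)
		trace[trace_len++] = ev;
}

static void mock_spin_lock(void)   { record(EV_LOCK); }
static void mock_spin_unlock(void) { record(EV_UNLOCK); }
static void mock_doorbell(void)    { record(EV_DOORBELL); }
static void mock_mmiowb(void)      { record(EV_MMIOWB); }

/* Pattern 1: the doorbell is rung at most once, at the end, so the
 * barrier can live inside the same "if (nreq)" block. */
static void post_once(int nreq)
{
	mock_spin_lock();
	/* ... work requests would be built here ... */
	if (nreq) {
		mock_doorbell();
		mock_mmiowb();
	}
	mock_spin_unlock();
}

/* Pattern 2: the doorbell may also be rung inside the loop (here, every
 * `batch` requests, an assumed policy), so the barrier is placed
 * unconditionally next to the unlock. */
static void post_batched(int nreq, int batch)
{
	int i, pending = 0;

	mock_spin_lock();
	for (i = 0; i < nreq; ++i) {
		if (++pending == batch) {
			mock_doorbell();
			pending = 0;
		}
	}
	if (pending)
		mock_doorbell();
	mock_mmiowb();
	mock_spin_unlock();
}

/* Returns 1 if no unlock happens while a doorbell write is still
 * waiting for a barrier, 0 otherwise. */
static int trace_is_ordered(void)
{
	int i, dirty = 0;

	for (i = 0; i < trace_len; ++i) {
		if (trace[i] == EV_DOORBELL)
			dirty = 1;
		else if (trace[i] == EV_MMIOWB)
			dirty = 0;
		else if (trace[i] == EV_UNLOCK && dirty)
			return 0;
	}
	return 1;
}
```

In both patterns every doorbell write is followed by a barrier before the
lock is dropped; the only difference is whether the barrier can be skipped
when no doorbell was rung.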
I'm still a little puzzled by the fact that it affects performance,
because that "if (likely(nreq))" is super-super-likely: under any
normal workload, I would expect it always to be true. It's really
strange that there's any difference between
	mmiowb();
	qp->sq.next_ind = ind;
	qp->sq.head += nreq;

and

	qp->sq.next_ind = ind;
	qp->sq.head += nreq;
	mmiowb();

Anyway, it all looks good. I'll apply your patch and submit it to
-stable for 2.6.18.x.
Thanks,
Roland