[openib-general] [PATCH] use mmiowb after doorbell ring

Roland Dreier rdreier at cisco.com
Mon Oct 16 09:55:11 PDT 2006


    akepner> At least with the workload that we used to reproduce this
    akepner> bug, yes. (The workload was simply 2 ttcp processes, each
    akepner> placed on a different node of an Altix.) Without the
    akepner> mmiowb()s things would hang very reliably and very
    akepner> quickly (within a second).  With the additional mmiowb()
    akepner> calls I never observed a problem after tens of minutes.

OK, cool.  Sounds convincing to me.  BTW -- are there Altix systems
with PCIe?  Have you tested the mthca_arbel_xxx (mem-free PCIe HCA)
changes, or just the mthca_tavor_xxx (PCI-X HCA) parts?

    akepner> I wanted to put it in the "if (likely(nreq))" block so
    akepner> that we don't do the mmiowb() unless it's really
    akepner> necessary. A very minor optimization (but a co-worker
    akepner> reports that it does produce a measurable but small
    akepner> performance improvement).

I see -- the other mmiowb()s are next to the spin_unlock()s elsewhere
because the other routines might ring doorbells during the loop if
someone passes in a ton of work requests, right?  (All the mmiowb()s
look necessary to me; I'm just curious about the level of testing.)
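
To make the two placements concrete, here is a rough userspace sketch
of both patterns.  It is not the mthca code: every name in it (the
sketch_* helpers, SKETCH_MAX_BATCH, the counters) is a hypothetical
stand-in for the real kernel spinlock, doorbell write and mmiowb(),
and the batch limit of 4 is made up.  It only shows where the barrier
sits relative to the doorbell and the unlock.

```c
#include <assert.h>

/* Hypothetical stand-ins so this builds outside the kernel.  In a
 * driver these would be the real spinlock, doorbell write and mmiowb(). */
struct sketch_lock { int held; };

static int mmiowb_calls;     /* how many times the barrier was issued */
static int doorbell_writes;  /* how many doorbell MMIO writes were made */

static void sketch_spin_lock(struct sketch_lock *l)   { assert(!l->held); l->held = 1; }
static void sketch_spin_unlock(struct sketch_lock *l) { assert(l->held);  l->held = 0; }
static void sketch_mmiowb(void)                       { mmiowb_calls++; }
static void sketch_ring_doorbell(void)                { doorbell_writes++; }

#define SKETCH_MAX_BATCH 4   /* assumed limit on requests per doorbell */

struct sketch_qp {
	struct sketch_lock lock;
	unsigned head;
};

/* Pattern 1: the only doorbell is the one after the loop, so the
 * barrier can live inside the "if (nreq)" block and is skipped when
 * nothing was posted. */
static int post_single_doorbell(struct sketch_qp *qp, int nwr)
{
	int nreq;

	assert(nwr >= 0);
	sketch_spin_lock(&qp->lock);
	for (nreq = 0; nreq < nwr; ++nreq)
		; /* a real driver would build one work queue entry here */

	if (nreq) {
		sketch_ring_doorbell();
		sketch_mmiowb();  /* order the MMIO write before the unlock */
	}
	qp->head += nreq;
	sketch_spin_unlock(&qp->lock);
	return nreq;
}

/* Pattern 2: a long request list can ring the doorbell inside the
 * loop, so the barrier sits unconditionally next to the unlock. */
static int post_batched_doorbell(struct sketch_qp *qp, int nwr)
{
	int i, nreq = 0;

	assert(nwr >= 0);
	sketch_spin_lock(&qp->lock);
	for (i = 0; i < nwr; ++i) {
		if (nreq == SKETCH_MAX_BATCH) {
			sketch_ring_doorbell();  /* mid-loop doorbell */
			qp->head += nreq;
			nreq = 0;
		}
		++nreq;
	}
	if (nreq) {
		sketch_ring_doorbell();
		qp->head += nreq;
	}
	sketch_mmiowb();  /* covers every doorbell rung above */
	sketch_spin_unlock(&qp->lock);
	return nwr;
}
```

In the second pattern the barrier is issued even when no doorbell was
rung, which is the cost the "if (likely(nreq))" placement avoids in
the first.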

I'm still a little puzzled by the fact that it affects performance,
because that "if (likely(nreq))" is super-super-likely: under any
normal workload, I would expect it always to be true.  It's really
strange that there's any difference between

	mmiowb();
	qp->sq.next_ind = ind;
	qp->sq.head    += nreq;

and

	qp->sq.next_ind = ind;
	qp->sq.head    += nreq;
	mmiowb();

Anyway, it all looks good.  I'll apply your patch and submit it to
-stable for 2.6.18.x.

Thanks,
  Roland



