[openib-general] [PATCH] use mmiowb after doorbell ring

Roland Dreier rdreier at cisco.com
Tue Oct 17 14:22:04 PDT 2006


 > Very strange. Let's consider amd64: libibverbs has
 > 
 > #elif defined(__x86_64__)
 > 
 > #define mb()    asm volatile("" ::: "memory")
 > 
 > So it's just a compiler barrier there.
 > 
 > While linux has asm-x86_64/system.h
 > 
 > #define rmb()  asm volatile("lfence":::"memory")
 > 
 > So rmb seems to be stronger than mb: it will prevent the CPU from reordering
 > reads while mb won't.

OK, that's a difference between the kernel and libibverbs -- and it
may be a bug.  I have a faint memory of deciding when I wrote the code
that mfence/lfence were only needed for dealing with non-temporal
stores, but looking at asm-x86_64 I see

/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 */
#define mb()    asm volatile("mfence":::"memory")
#define rmb()   asm volatile("lfence":::"memory")

so maybe that reasoning was wrong.  I know that x86 can do loads
speculatively and out of order, so perhaps we are living dangerously.

Another confusing thing is that asm-i386 defines mb() and rmb() just
to be compiler barriers, but I would think that the same ordering
issues apply in 32-bit mode.  But of course not all x86 processors
support lfence/mfence, which leads to some ugly issues of how to
handle this -- runtime detection seems important, but I don't know a
good way to do that.  Probably the best thing would be just to do
"lock; addl $0,0(%%esp)" by default and add a special compile flag or
something to enable mfence.

 - R.



