[openib-general] [PATCH] use mmiowb after doorbell ring
Roland Dreier
rdreier at cisco.com
Tue Oct 17 14:22:04 PDT 2006
> Very strange. Let's consider amd64: libibverbs has
>
> #elif defined(__x86_64__)
>
> #define mb() asm volatile("" ::: "memory")
>
> So it's just a compiler barrier there.
>
> While linux has asm-x86_64/system.h
>
> #define rmb() asm volatile("lfence":::"memory")
>
> So rmb seems to be stronger than mb: it will prevent the CPU from reordering
> reads while mb won't.
OK, that's a difference between the kernel and libibverbs -- and it
may be a bug. I have a faint memory of deciding when I wrote the code
that mfence/lfence were only needed for dealing with non-temporal
stores, but looking at asm-x86_64 I see

/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 */
#define mb() asm volatile("mfence":::"memory")
#define rmb() asm volatile("lfence":::"memory")

so maybe this is wrong. I know that x86 can do loads speculatively
and out of order, so perhaps we are living dangerously.
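
To make the distinction concrete, here is a minimal sketch (not code from libibverbs or the kernel) of the barrier flavors under discussion and of the kind of read sequence where the choice might matter. The names compiler_barrier, cpu_full_fence, cpu_read_fence, struct cqe_sketch and poll_entry are all hypothetical, and whether the lfence is actually needed for ordinary loads on x86 is exactly the open question here.

```c
#include <stdint.h>

/* Hypothetical names; none of these come from libibverbs or the kernel. */

/* Compiler-only barrier: stops the compiler from reordering memory
 * accesses across this point, but emits no instruction for the CPU. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

#if defined(__x86_64__)
/* CPU fences: these emit real fence instructions. */
#define cpu_full_fence() __asm__ __volatile__("mfence" ::: "memory")
#define cpu_read_fence() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Assumption: on other targets fall back to the GCC full-barrier builtin. */
#define cpu_full_fence() __sync_synchronize()
#define cpu_read_fence() __sync_synchronize()
#endif

/* Illustrative entry written by a device: a valid flag plus a payload. */
struct cqe_sketch {
    volatile uint32_t valid;
    volatile uint32_t payload;
};

/* Returns 1 and stores the payload in *out if the entry is valid,
 * otherwise returns 0 and leaves *out untouched. */
static int poll_entry(struct cqe_sketch *e, uint32_t *out)
{
    if (!e->valid)
        return 0;
    /* Keep the payload load from being satisfied before the flag load. */
    cpu_read_fence();
    *out = e->payload;
    return 1;
}
```

With compiler_barrier() in place of cpu_read_fence() the generated code contains no fence instruction at all, which is the libibverbs behaviour quoted above.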
Another confusing thing is that asm-i386 defines mb() and rmb() just
to be compiler barriers, but I would think that the same ordering
issues apply in 32-bit mode. But of course not all x86 processors
support lfence/mfence, which leads to some ugly issues of how to handle
this -- runtime detection seems important but I don't know a good way
to do that. Probably the best thing would be just to do "lock; addl
$0,0(%%esp)" by default and add a special compile flag or something to
enable mfence.
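
For what it's worth, one possible way to do the runtime detection is sketched below. It assumes GCC's <cpuid.h> helper __get_cpuid and relies on mfence/lfence having been introduced with SSE2 (CPUID leaf 1, EDX bit 26). The names fence_kind, detect_fence and full_fence are made up for illustration, and the non-x86 branch is only there so the sketch compiles everywhere.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Hypothetical names, for illustration only. */
enum fence_kind { FENCE_MFENCE, FENCE_LOCKED_ADD, FENCE_BUILTIN };

/* Decide which full-barrier implementation this CPU can execute. */
static enum fence_kind detect_fence(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, EDX bit 26 reports SSE2, which brought in mfence. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && ((edx >> 26) & 1))
        return FENCE_MFENCE;
    return FENCE_LOCKED_ADD;
#else
    return FENCE_BUILTIN;
#endif
}

/* Full barrier that picks mfence or the locked add at run time. */
static void full_fence(void)
{
    /* Cached on first use. Concurrent first calls may both run the
     * detection, but they compute the same value. */
    static int kind = -1;

    if (kind < 0)
        kind = (int)detect_fence();

#if defined(__i386__) || defined(__x86_64__)
    if (kind == FENCE_MFENCE) {
        __asm__ __volatile__("mfence" ::: "memory");
    } else {
#if defined(__x86_64__)
        __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
        __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#endif
    }
#else
    __sync_synchronize();
#endif
}
```

The branch on every call is not free, so a compile-time switch as suggested above may still be the better default; this only shows that the detection itself is small.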
- R.