[openib-general] basic IB doubt
glebn at voltaire.com
Mon Aug 28 01:11:08 PDT 2006
On Mon, Aug 28, 2006 at 12:18:49AM -0600, Jason Gunthorpe wrote:
> On Sun, Aug 27, 2006 at 03:30:56PM -0700, Roland Dreier wrote:
> > glebn> So, before touching the data that was RDMAed into the
> > glebn> buffer application should cache invalidate the buffer, is
> > glebn> this even possible from user space? (Not on x86, but it
> > glebn> isn't needed there.)
>
> > Yes, on any architecture that is not cache-coherent with PCI DMA, some
> > cache invalidation/flushing will be necessary. And this probably
> > won't be possible from userspace if the cache is physically tagged.
> > (Are there any such architectures in real use, ie non-coherent with
> > PCI and physically tagged cache?)
>
> It depends on the arch if it is a problem or not.. Ie PPC Book-E
> has 'dcba' which is available from user space. It operates on virtual
> addresses and is a flush and invalidate combined. So it is safe,
> but less efficient than the pure invalidate that the kernel has access
> to.
>
This is from the PPC instruction book:

The dcba instruction executes as follows:

If the cache block containing the byte addressed by EA is in the data
cache, the contents of all bytes are made undefined but the cache block is
still considered valid. Note that programming errors can occur if the data
in this cache block is subsequently read or used inadvertently.

If the cache block containing the byte addressed by EA is not in
the data cache and the corresponding memory page or block is caching-allowed,
the cache block is allocated (and made valid) in the data cache without
fetching the block from main memory, and the value of all bytes is undefined.
It doesn't look like this instruction does a flush or an invalidate. It
makes a cache line present without accessing the underlying memory. AFAIR
U-Boot uses this plus cache locking to create a C stack before SDRAM is
initialised.
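
For comparison, here is a rough sketch of what a user-space flush over a
buffer might look like with dcbf, which (unlike dcbi) is not privileged
on PowerPC and does write back and invalidate a block. Everything named
here is an assumption for illustration: the 32-byte block size (the real
value is CPU specific), the helper names cache_line_flush and
rdma_buf_flush, and the idea that the buffer is flushed before it is
handed to the HCA and then left untouched until the completion arrives.
On other architectures the per-line op compiles to a plain compiler
barrier.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte data cache blocks; the real size is CPU specific. */
#define CACHE_LINE_SIZE 32u

/* Write back and invalidate one cache block, addressed by virtual address. */
static inline void cache_line_flush(const volatile void *p)
{
#if defined(__powerpc__) || defined(__PPC__)
    /* dcbf is usable from user space; dcbi (pure invalidate) is not. */
    __asm__ __volatile__("dcbf 0,%0" : : "r"(p) : "memory");
#else
    /* Coherent or unknown architecture: compiler barrier only. */
    (void)p;
    __asm__ __volatile__("" : : : "memory");
#endif
}

/*
 * Hypothetical helper: flush every cache block that overlaps
 * [buf, buf + len). Returns the number of blocks visited.
 */
static size_t rdma_buf_flush(const volatile void *buf, size_t len)
{
    uintptr_t start, end, addr;
    size_t blocks = 0;

    if (len == 0)
        return 0;

    /* Round the start address down to a block boundary. */
    start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    end = (uintptr_t)buf + len;

    for (addr = start; addr < end; addr += CACHE_LINE_SIZE) {
        cache_line_flush((const volatile void *)addr);
        blocks++;
    }

#if defined(__powerpc__) || defined(__PPC__)
    /* Wait for the flushes to complete before continuing. */
    __asm__ __volatile__("sync" : : : "memory");
#endif
    return blocks;
}
```

Note that a buffer which is not block aligned costs one extra block at
each ragged end, and those blocks may hold unrelated data, so an
application doing this would want block-aligned, block-sized buffers.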
> So long as cache ops that works on virtual addresses are present
> it should be fine from userspace, but in some cases the necessary
> sequence of cache ops can be quite elaborate and hardware dependent,
> so a syscall, or at least a vdso function would be needed to support
> everything.
And then you'll need to do a syscall for every IB verb.
>
> In my experience most real architectures that have this problem these
> days are embedded targeted lower performance processors. If you are
> in the embedded space and using IB hardware then presumably you care
> about performance and will avoid such things. (Although long ago, this
> wasn't a choice and I actually have built an embedded IB capable
> system with non-coherent PCI.. It is a big pain, I don't recommend it.)
>
Agreed.
--
Gleb.