[openib-general] [RFC/BUG] libibverbs: DMA vs. CQ race

akepner at sgi.com
Wed Dec 13 16:29:52 PST 2006



It appears that there are races between DMA and CQ updates
which can result in incorrect behavior when CQs are allocated
in user-space (via libibverbs).

This problem affects Altix in particular, though it may exist
on other platforms as well. (We haven't really seen this
particular bug yet but, based on previous experience, it's
something that we expect to be manifested on large NUMA
systems.)


Description of the race
-----------------------

On a system that supports "posted DMA", such as Altix, DMA
writes may complete out of order. (This is due to possible
reordering within the NUMA interconnect; it is not PCI
reordering that's being described here.)

For example, if an HCA does a DMA write to host memory and
then updates a corresponding CQE, it's possible for
the CQE update to be visible before the DMA has actually
completed.
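
To make the window concrete, here is a minimal user-space simulation
of the ordering hazard. It is only a sketch: the "toy_" structures and
functions are hypothetical and do not correspond to any real CQE
layout or libibverbs call. The two HCA writes are modeled as two
function calls whose arrival order can be swapped, with a consumer
polling in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified completion entry (real layouts are
 * device specific). */
struct toy_cqe {
    uint32_t valid;     /* nonzero once the CQE write has landed */
    uint32_t byte_len;  /* length of the completed transfer */
};

/* Host memory targeted by the simulated HCA. */
struct toy_host {
    uint8_t buf[8];      /* receive buffer for the DMA write */
    struct toy_cqe cqe;  /* entry in a user-allocated CQ */
};

static const char toy_payload[8] = "PAYLOAD";

/* The two writes the HCA issues, in issue order: data, then CQE. */
static void toy_land_payload(struct toy_host *h)
{
    memcpy(h->buf, toy_payload, sizeof h->buf);
}

static void toy_land_cqe(struct toy_host *h)
{
    h->cqe.byte_len = sizeof h->buf;
    h->cqe.valid = 1;
}

/* Consumer side: returns -1 if no completion is visible yet,
 * 1 if the buffer holds the payload, 0 if the buffer is stale. */
static int toy_poll(const struct toy_host *h)
{
    assert(h != NULL);
    if (!h->cqe.valid)
        return -1;
    return memcmp(h->buf, toy_payload, sizeof h->buf) == 0;
}

/* Deliver the first write, poll, deliver the second write, and
 * poll again only if no completion was seen the first time. */
static int toy_simulate(bool reordered)
{
    struct toy_host h;
    int r;

    memset(&h, 0, sizeof h);
    if (reordered)
        toy_land_cqe(&h);      /* CQE overtakes the data */
    else
        toy_land_payload(&h);
    r = toy_poll(&h);
    if (reordered)
        toy_land_payload(&h);
    else
        toy_land_cqe(&h);
    if (r < 0)
        r = toy_poll(&h);
    return r;
}
```

With in-order delivery, toy_simulate(false) reports good data; with
the CQE overtaking the payload, toy_simulate(true) reports a
completion whose buffer is still stale, which is the race described
above.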

There are a couple of mechanisms to ensure synchronization.
Either: 1) an interrupt, or 2) a write to a "consistently"
(coherently) mapped DMA address will flush in-flight DMA.

When the CQ is allocated by the device driver, mechanism 2)
prevents the race, since "dma_alloc_coherent()" is used
there. But when the CQ allocation is done in user space (via
libibverbs) there's no protection.
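
For contrast, user-space code has no access to the in-kernel DMA
interface, so CQ memory there would typically come from the ordinary
process heap. The following is a minimal sketch under that
assumption; "toy_alloc_cq_buf" is a hypothetical helper, not taken
from libibverbs or any provider library. It uses only standard POSIX
calls, and nothing about the resulting mapping tells the platform to
flush in-flight DMA when the HCA writes to it.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: allocate a zeroed, page-aligned buffer large
 * enough for nent completion entries of entry_size bytes each.
 * Returns NULL on failure; on success stores the rounded-up length
 * in *out_len. The caller releases the buffer with free(). */
static void *toy_alloc_cq_buf(size_t nent, size_t entry_size,
                              size_t *out_len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t pg, len;
    void *buf = NULL;

    assert(out_len != NULL);
    if (page <= 0 || nent == 0 || entry_size == 0)
        return NULL;
    if (nent > SIZE_MAX / entry_size)
        return NULL;                     /* multiplication overflow */

    pg = (size_t)page;
    len = nent * entry_size;
    if (len > SIZE_MAX - (pg - 1))
        return NULL;                     /* rounding overflow */
    len = (len + pg - 1) / pg * pg;      /* round up to whole pages */

    /* Plain anonymous memory: no DMA-coherent attribute is set. */
    if (posix_memalign(&buf, pg, len) != 0)
        return NULL;
    memset(buf, 0, len);
    *out_len = len;
    return buf;
}
```

A caller would pass the entry count and entry size, for example
toy_alloc_cq_buf(256, 32, &len), and would then still have to make
the buffer known to the HCA through the driver; that step is outside
this sketch.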


So what to do?
--------------

Obviously mechanism 1), generating an interrupt, is not
the right solution for performance reasons.

One proposal is to add a kernel API that enables "coherent
memory" allocation (via the in-kernel DMA interface) from
user space. CQs, for example, could then be allocated via
this interface, and the race would be avoided.

Any other ideas?


-- 
Arthur





