[openib-general] How to support IOMMUs for ipath driver
Or Gerlitz
ogerlitz at voltaire.com
Wed Sep 13 02:00:50 PDT 2006
Ralph Campbell wrote:
> Problem:
>
> The IB kernel to IB device driver interface uses dma_map_single()
> and dma_map_sg() to allocate device bus addresses for HW DMA.
> These bus addresses are passed to the IB device driver via ib_post_send()
> and ib_post_recv().
>
> The ib_ipath driver needs kernel virtual addresses in order to be able
> to copy data to/from the posted work requests since it does not
> use HW DMA. It currently relies on the mapping being one-to-one
> and cannot reasonably reverse the mapping when an IOMMU is present.
Oops, please note that through the DMA API one can get a DMA address for
a page which is currently **not** mapped into the kernel virtual address
space (that is, page_address(p) is NULL), so you must add kmap and kunmap
to your fast RX/TX code path.
Examples of scenarios where this happens that I can think of are Direct
I/O and some sorts of pre-fetching done by a file system. Some pages
present in a kernel SG list which needs to be sent/received/RDMA-ed over
IB need not be mapped into the kernel virtual address space.
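To illustrate the kmap/kunmap point, here is a minimal user-space sketch.
It is not kernel code: mock_page, mock_page_address(), mock_kmap(),
mock_kunmap() and sw_copy_to_page() are hypothetical stand-ins I made up
to model struct page, page_address(), kmap() and kunmap(), and it ignores
questions such as which mapping call is legal in which context.

```c
/* User-space model only. Every mock_* name here is hypothetical and
 * merely imitates the shape of the kernel calls discussed above. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096

struct mock_page {
    unsigned char frame[MOCK_PAGE_SIZE]; /* stands in for the physical page */
    void *vaddr;     /* permanent kernel mapping, NULL if there is none */
    int kmap_count;  /* outstanding temporary mappings */
};

/* Model of page_address(): NULL when the page has no kernel mapping. */
static void *mock_page_address(struct mock_page *p)
{
    return p->vaddr;
}

/* Model of kmap(): always hands back a usable address for the page. */
static void *mock_kmap(struct mock_page *p)
{
    p->kmap_count++;
    return p->vaddr ? p->vaddr : (void *)p->frame;
}

/* Model of kunmap(): drops the temporary mapping taken by mock_kmap(). */
static void mock_kunmap(struct mock_page *p)
{
    assert(p->kmap_count > 0);
    p->kmap_count--;
}

/* Software "DMA": copy len bytes into the page at byte offset off.
 * The page is mapped only for the duration of the copy. */
static void sw_copy_to_page(struct mock_page *p, size_t off,
                            const void *src, size_t len)
{
    unsigned char *va;

    assert(off <= MOCK_PAGE_SIZE && len <= MOCK_PAGE_SIZE - off);
    va = (unsigned char *)mock_kmap(p);
    memcpy(va + off, src, len);
    mock_kunmap(p);
}
```

The point of the sketch is only that the copy routine never relies on
mock_page_address() being non-NULL; it maps, copies and unmaps every time.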
As for RDMA, please note that the problem has two faces: first, the
remote device, which either does the RDMA or is the one the local device
does RDMA from/to, and second, the local device.
Since you need to be able to interoperate between devices that support
DMA mappings and ones which do not, how do you suggest managing the
addresses for the following schemes (1 stands for a device supporting DMA
addresses and 0 for a device which does not)?
<1,1>
<1,0>
<0,1>
<0,0>
Please assume for the purpose of discussion that each side knows the
polarity of the remote side.
After writing the section on RDMA, I think I might have gone in the wrong
direction, since ipath emulates RDMA in SW. Can you shed some light on this?
> I also tried proposing adding a flag to the ib_device structure
> and modifying the kernel IB code to check the flag and pass
> either the dma_*() mapped address or a kernel virtual address.
> This works OK for kmalloc() buffers where dma_map_single() is
> being called but doesn't work well for SRP which has lists
> of physical pages and calls dma_map_sg().
> It also means that the kernel IB layer needs to explicitly handle
> two different kinds of addresses.
Just a note, it's not just SRP there... it's any ULP which needs to move
over IB data present in a bunch of pages (e.g. packed in a kernel SG
list), namely iSER, NFSoRDMA, Lustre, an IB native implementation of
send_page(), etc.
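
As a rough picture of what a software-copy driver faces when such a ULP
hands it a list of pages, here is a user-space sketch. Again this is not
kernel code: mock_page, mock_sg_entry, mock_kmap(), mock_kunmap() and
sw_copy_to_sg() are hypothetical names that only imitate a kernel SG list
and the kmap()/kunmap() pair.

```c
/* User-space model only. All names below are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096

struct mock_page {
    unsigned char frame[MOCK_PAGE_SIZE]; /* stands in for the physical page */
    int kmap_count;                      /* outstanding temporary mappings */
};

/* One element of a scatter/gather list: a page plus a byte range in it. */
struct mock_sg_entry {
    struct mock_page *page;
    size_t offset;
    size_t length;
};

static void *mock_kmap(struct mock_page *p)
{
    p->kmap_count++;
    return p->frame;
}

static void mock_kunmap(struct mock_page *p)
{
    assert(p->kmap_count > 0);
    p->kmap_count--;
}

/* Scatter up to len bytes from src across the list, mapping each page
 * only while it is being written. Returns the number of bytes copied. */
static size_t sw_copy_to_sg(struct mock_sg_entry *sg, size_t nents,
                            const void *src, size_t len)
{
    const unsigned char *in = (const unsigned char *)src;
    size_t done = 0;
    size_t i;

    for (i = 0; i < nents && done < len; i++) {
        size_t chunk = sg[i].length;
        unsigned char *va;

        assert(sg[i].offset <= MOCK_PAGE_SIZE &&
               chunk <= MOCK_PAGE_SIZE - sg[i].offset);
        if (chunk > len - done)
            chunk = len - done;

        va = (unsigned char *)mock_kmap(sg[i].page);
        memcpy(va + sg[i].offset, in + done, chunk);
        mock_kunmap(sg[i].page);
        done += chunk;
    }
    return done;
}
```

Each entry needs its own map/copy/unmap step, which is why a single
kernel virtual address per work request does not fit the page-list case.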
Or.