[openib-general] [PATCH 3/7] IB/ipoib - Use the new verbs DMA mapping functions

Michael S. Tsirkin mst at mellanox.co.il
Mon Nov 6 10:04:43 PST 2006


Quoting r. Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [PATCH 3/7] IB/ipoib - Use the new verbs DMA mapping functions
> 
>  > Hmm, since ib_dma_unmap_single calls a function through a pointer,
>  > this seems to introduce overhead on data path operations in ipoib.
>  > For apps like ipoib always working with low memory, I think it is important
>  > to avoid this overhead of extra indirect function calls at least on systems
>  > without IO MMU - where e.g. dma_unmap_single is empty.
>  > This probably means you need some architecture-dependent code,
>  > but should be possible - look at how the DMA API is implemented for an example.
>  > And this applies to all ULPs on systems without high memory.
> 
> How is this possible?
> The IOMMU might be detected at runtime,

E.g. on i386 dma_unmap_single seems to always be empty, and I don't think an
IOMMU can be added at runtime. But I agree x86_64 is more important.

> and you can always have a system with multiple HCAs of different types, so I
> don't see how the conditional can be avoided.  It is unfortunate but in this
> case I think we have to accept the cost of making the code general.

Well, in the general case you are right, of course, and the problem is not solvable.
But consider IPoIB, or any ULP that deals only, or mostly, with low memory.
It already has the virtual address of the data, so why isn't it possible
to pass that down to verbs somehow and, in this case, avoid the extra overhead?

For example, ib_send_wr/ib_recv_wr could have an *optional* "void *data" field.
And we could have a rule that a ULP must *either* pass in the void *data and
do DMA mappings through the usual DMA API, *or* go through the ib_dma mappings
(doing both would also be allowed).

This would
1. avoid the overhead for that ULP: it would pass in real DMA addresses, and
   ipath could simply ignore them and use the data pointer instead.
2. allow optimisations such as inline data for HCAs that support both
   DMA and copy modes.
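
A minimal user-space sketch of the proposal above, to make the rule concrete.
Everything here is hypothetical: the sketch_* structures are simplified
stand-ins, not the real ib_sge/ib_send_wr (which have no "data" member), and
the two post_send functions only illustrate which field each kind of driver
would consume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the verbs structures. */
struct sketch_sge {
    uint64_t addr;   /* DMA address obtained through the usual DMA API */
    uint32_t length; /* buffer length in bytes */
};

struct sketch_send_wr {
    struct sketch_sge sge;
    void *data;      /* optional: virtual address of the same buffer, or NULL */
};

/*
 * Copy-mode driver (ipath-like): ignores the DMA address and copies from
 * the virtual address. Returns the number of bytes copied, or -1 if the
 * ULP did not supply a data pointer or the buffer does not fit.
 */
static long sketch_copy_post_send(const struct sketch_send_wr *wr,
                                  void *hw_buf, size_t hw_len)
{
    assert(wr != NULL && hw_buf != NULL);
    if (wr->data == NULL || wr->sge.length > hw_len)
        return -1;
    memcpy(hw_buf, wr->data, wr->sge.length);
    return (long)wr->sge.length;
}

/*
 * DMA-mode driver: ignores the data pointer and returns the address it
 * would program into the hardware.
 */
static uint64_t sketch_dma_post_send(const struct sketch_send_wr *wr)
{
    assert(wr != NULL);
    return wr->sge.addr;
}
```

With this shape, a ULP that fills in both fields pays no extra indirection on
the mapping path, and each driver picks whichever field suits its hardware.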

> 
> It is sad that ipath is likely the only driver that will ever use
> this.  Maybe something that the speed-freaks would like would be to
> add a hidden config option that turns all the ib_dma_xxx stuff into
> NOP macros unless ipath is being built.  Of course that doesn't help
> all that much because all the distros etc will enable ipath.

I think the most generic HCA is capable of both DMA and direct copy by the driver.
So, how about implementing something like the proposal above so that ipath
is *not* the only driver to use this?

> Anyway, I suspect the penalty is near-zero anyway, since the pointer
> being tested will likely be in cache and the branch predictor will
> learn which way the branch goes.  (Except on a heterogeneous system I
> suppose)

Hmm. Maybe.
Some numbers demonstrating this for e.g. ipoib might be useful.
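
For reference, the dispatch pattern being debated can be sketched in user
space as below. This assumes the wrapper tests a per-device ops pointer, as
the quoted reply describes; all sketch_* names are hypothetical, and the
indirect_calls counter exists only to instrument this sketch. Timing a loop
over each path would be one way to get the numbers asked for.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_dev;

/* Hypothetical ops table, set only by drivers that do their own mapping. */
struct sketch_dma_ops {
    void (*unmap_single)(struct sketch_dev *dev, uint64_t addr, size_t size);
};

struct sketch_dev {
    const struct sketch_dma_ops *dma_ops; /* NULL for ordinary DMA HCAs */
    unsigned long indirect_calls;         /* instrumentation for this sketch */
};

/* Stand-in for the platform unmap: empty, as on a system without an IOMMU. */
static inline void sketch_platform_unmap_single(uint64_t addr, size_t size)
{
    (void)addr;
    (void)size;
}

/* Stand-in for a driver-specific unmap reached through the pointer. */
static void sketch_custom_unmap(struct sketch_dev *dev, uint64_t addr,
                                size_t size)
{
    (void)addr;
    (void)size;
    dev->indirect_calls++;
}

static const struct sketch_dma_ops sketch_custom_ops = {
    .unmap_single = sketch_custom_unmap,
};

/* The wrapper: one pointer test per call, then either path. */
static inline void sketch_ib_dma_unmap_single(struct sketch_dev *dev,
                                              uint64_t addr, size_t size)
{
    if (dev->dma_ops)
        dev->dma_ops->unmap_single(dev, addr, size);
    else
        sketch_platform_unmap_single(addr, size);
}
```

On a device with dma_ops == NULL the only added cost is the pointer test and
branch; the indirect call is taken only for devices that install an ops table.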

-- 
MST



