[openib-general] Reserved L_Key API (was Re: DMA mapping on sparc64)

Michael S. Tsirkin mst at mellanox.co.il
Tue Sep 14 21:34:55 PDT 2004


Hello!
Quoting r. Roland Dreier (roland at topspin.com) "[openib-general] Reserved L_Key API (was Re: DMA mapping on sparc64)":
> Based on Tom's sparc64 testing, I'd like to design an API for
> consumers (MAD layer, IPoIB, etc) who want to do local DMA to
> arbitrary addresses.  Our current hack of registering all of memory by
> assuming that DMA addresses will be between 0 and (high_memory -
> PAGE_OFFSET) is not valid (as sparc64 shows) and probably won't be
> accepted into the kernel.
> 
> For new HCAs that support the base memory management extensions, the
> consumer can just use the reserved L_Key.  It is almost possible to
> simulate this with Tavor: one can create a memory region that does not
> perform any address translation (and just uses the address given in a
> work request as a PCI bus address), but it is not possible to turn off
> PD enforcement.
> 
> This means we need an API that allows a consumer to get a "no
> translation" MR for a given PD.  My proposal would be as follows:
> 
> The low-level driver entry point would just be:
> 
> 	struct ib_mr *(*get_dma_mr)(struct ib_pd *);
> 
> And the client-exposed entry point:
> 
> 	struct ib_mr *ib_get_dma_mr(struct ib_pd *);
> 
> Only the L_Key of this MR would be valid, and it would always have
> local write access (to match the semantics of reserved L_Key).  If the
> HCA supports reserved L_Key, it can just return the same L_Key for
> every consumer.  If need be it can take the PD into account.
> 
> It is required for the consumer to call ib_dereg_mr() on this MR when
> exiting, but this can be a NOP for HCAs that support reserved L_Key.
> 
> I would argue that this entry point should replace reg_phys_mr as a
> mandatory low-level driver function; this will simplify the
> implementation of consumers that use the API.  Devices that can't even
> simulate reserved L_Key like Tavor (and I don't know of any such
> devices -- even on Topspin's embedded platforms I could implement this
> API) could just register a giant address range in a normal physical MR
> (and even use pci_set_dma_mask() to limit the size of the MR to 4 GB
> if they're really limited).
> 
> Comments?  Better naming ideas?
> 
> Thanks,
>   Roland

Don't you basically want to create a physical memory region
covering the whole 64-bit range, and then post full physical addresses
in the WQE?
Can't you do exactly that with the existing API?
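
To make the question concrete, here is a rough, self-contained sketch of the
idea. It is a toy model only: the toy_* names are hypothetical and are not the
real openib structures or entry points. It assumes the HCA checks just two
things for a local access: that the MR's PD matches the QP's PD, and that the
address range falls inside the region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real openib types. */
struct toy_pd { int id; };

struct toy_mr {
	const struct toy_pd *pd;  /* PD the region was registered under */
	uint64_t start;           /* first bus address covered */
	uint64_t last;            /* last bus address covered (inclusive) */
	uint32_t lkey;
};

/* Build a "no translation" region spanning every 64-bit bus address. */
static struct toy_mr toy_get_dma_mr(const struct toy_pd *pd, uint32_t lkey)
{
	struct toy_mr mr = { .pd = pd, .start = 0, .last = UINT64_MAX, .lkey = lkey };

	assert(pd != NULL);
	return mr;
}

/* Assumed HCA-side check for a local access: PD match, then bounds. */
static bool toy_local_access_ok(const struct toy_mr *mr, const struct toy_pd *qp_pd,
				uint64_t bus_addr, uint64_t len)
{
	if (mr->pd != qp_pd)
		return false;	/* PD enforcement cannot be switched off */
	if (len == 0 || bus_addr < mr->start || bus_addr > mr->last)
		return false;
	/* Overflow-safe: the last byte touched is bus_addr + len - 1. */
	return len - 1 <= mr->last - bus_addr;
}
```

The region is stored with an inclusive last address because the size of the
full 64-bit range (2^64) does not fit in a 64-bit length field. In this model
any bus address passes the bounds check, but only for a QP in the same PD as
the MR, which is why one region per PD would still be needed.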

Thanks,
  MST


