[openib-general] mthca FMR correctness (and memory windows)

Dror Goldenberg gdror at mellanox.co.il
Mon Mar 20 14:09:35 PST 2006



> From: Talpey, Thomas [mailto:Thomas.Talpey at netapp.com] 
> Sent: Monday, March 20, 2006 11:50 PM
> 

> Well, for that, the client code implements a "full-frontal 
> mode" - which just does a single ib_get_dma_mr() with 
> REMOTE_READ|REMOTE_WRITE privs. Then it performs all 
> addressing as offsets against the single rkey.

It's not exactly the same. The important difference is about
scatter/gather.
If you use dma_mr, then you have to send a chunk list from the client to
the server. Then, for each one of the chunks, the server has to post an
RDMA read or write WQE. Also, the typical message size on the wire
will be a page (I am assuming large IOs for the purpose of this
discussion).

Alternatively, if you use FMR, you can take the list of pages that make
up the IO, collapse them into a virtually contiguous memory region,
and use just one chunk for the IO.
This:
- Reduces the number of WQEs that need to be posted per IO operation
	* lower CPU utilization
- Reduces the number of messages on the wire and increases their size
	* better HCA performance
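
To make the difference concrete, here is a rough userspace sketch (not
mthca or ULP code) of how the chunk list might be built in the two cases.
The struct, the function names, and the 4 KB page size are hypothetical and
only for illustration. The dma_mr variant also assumes that physically
adjacent pages are merged into one chunk, which is the best case for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* One entry of the chunk list the client sends to the server. */
struct chunk {
    uint64_t addr;   /* DMA address (dma_mr) or virtual address (FMR) */
    uint32_t len;    /* length in bytes */
    uint32_t rkey;
};

/*
 * dma_mr case: every physically contiguous run of pages needs its own
 * chunk, and the server posts one RDMA read/write WQE per chunk.
 * 'out' must have room for npages entries. Returns the chunk count.
 */
size_t build_dma_chunks(const uint64_t *pages, size_t npages,
                        uint32_t rkey, struct chunk *out)
{
    size_t n = 0;

    for (size_t i = 0; i < npages; i++) {
        if (n > 0 && out[n - 1].addr + out[n - 1].len == pages[i]) {
            /* Page follows the previous one: extend the last chunk. */
            out[n - 1].len += SKETCH_PAGE_SIZE;
        } else {
            out[n].addr = pages[i];
            out[n].len = SKETCH_PAGE_SIZE;
            out[n].rkey = rkey;
            n++;
        }
    }
    return n;
}

/*
 * FMR case: the pages are mapped behind one virtually contiguous
 * region starting at 'iova', so a single chunk covers the whole IO
 * no matter how scattered the pages are.
 */
size_t build_fmr_chunk(uint64_t iova, size_t npages,
                       uint32_t rkey, struct chunk *out)
{
    assert(npages > 0);
    out[0].addr = iova;
    out[0].len = (uint32_t)(npages * SKETCH_PAGE_SIZE);
    out[0].rkey = rkey;
    return 1;
}
```

For a 4-page IO where only the first two pages happen to be adjacent, the
dma_mr variant yields 3 chunks (3 WQEs on the server), while the FMR
variant yields a single 16 KB chunk.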


> 
> What I'm looking for is *alternatives* to this - ones which 
> allow RDMA protection but do not suffer rather extreme 
> performance penalties. It's not even the overhead of the 
> memory registration itself, it's the fact that everything 
> single-threads on the TPT loads/invalidates. Throughput just 
> grinds to tens of MB/sec (from hundreds).
> 

I understand. I was suggesting another method... increasing
IB IO size by creating a virtual space in an MR that represents
the list of pages that come from the IO operation.
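
As an illustration of what "a virtual space that represents the list of
pages" means, the sketch below models the lookup in plain C. This is only
a conceptual model with hypothetical names and an assumed 4 KB page size,
not how any particular HCA implements its translation tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* A virtually contiguous region backed by scattered pages. */
struct virt_region {
    uint64_t base;           /* first virtual address of the region */
    const uint64_t *pages;   /* page addresses, in IO order */
    size_t npages;
};

/*
 * Resolve a virtual address inside the region to the address within
 * the backing page: index the page list by page number, then add the
 * offset within that page.
 */
uint64_t region_translate(const struct virt_region *r, uint64_t vaddr)
{
    uint64_t off = vaddr - r->base;

    /* The address must fall inside the region. */
    assert(vaddr >= r->base);
    assert(off < (uint64_t)r->npages * SKETCH_PAGE_SIZE);

    return r->pages[off / SKETCH_PAGE_SIZE] + off % SKETCH_PAGE_SIZE;
}
```

The remote side only ever sees the base address, the total length, and the
rkey; the scattered page list stays hidden behind the region.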


