[ofa-general] Infiniband Problems

David Robb DavidRobb at comsci.co.uk
Thu Jun 21 12:37:41 PDT 2007


Roland Dreier wrote:
>  > 1. Sometimes observe RDMA data transfer stalls of ~ 1.0 second
>
> Could it be an RNR NAK?  You didn't really describe your protocol, but
> if you use send operations and if you do a send without a matching
> receive on the other side, then you might end up stalling the QP for a
> while.
>   
Quite possibly. We are using the IBV_QPT_RC transport type. The code 
simply adds another work request with ibv_post_srq_recv(...) after each 
packet is processed. Am I correct in thinking it should start out with a 
stack of work requests in case another packet arrives before the current 
one has been processed?
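
To make the question concrete, here is a toy model in plain C. It is not 
libibverbs code and touches no hardware; the function name, the arrival 
pattern and the one-packet-per-step processing rate are all assumptions 
made for illustration. It only counts how often a packet would arrive 
while no receive work request is posted, which is the condition Roland 
describes as leading to an RNR NAK.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, not the libibverbs API).
 *
 * depth    - number of receive work requests posted before traffic starts
 * arrivals - arrivals[t] is the number of packets arriving in time step t
 * steps    - number of time steps in arrivals[]
 *
 * In each step all arriving packets are delivered first; afterwards the
 * application processes at most one packet and re-posts one receive,
 * mirroring the "post another work request after each packet" scheme.
 *
 * Returns the number of packets that found no posted receive. A real
 * sender would retry these after the RNR timer; the model just counts them.
 */
static int count_rnr_events(int depth, const int *arrivals, int steps)
{
    assert(depth >= 0 && steps >= 0);

    int posted = depth;  /* receives currently available to the HCA */
    int backlog = 0;     /* packets delivered but not yet processed */
    int rnr = 0;         /* packets that hit an empty receive queue */

    for (int t = 0; t < steps; t++) {
        for (int i = 0; i < arrivals[t]; i++) {
            if (posted > 0) {
                posted--;    /* packet consumes one posted receive */
                backlog++;
            } else {
                rnr++;       /* no receive posted: would be RNR NAKed */
            }
        }
        if (backlog > 0) {
            backlog--;       /* application finishes one packet ... */
            posted++;        /* ... and re-posts a single receive */
        }
    }
    return rnr;
}
```

With an arrival pattern of {2, 0, 2, 0}, a depth of 1 counts two such 
events, while a depth of 2 counts none. Under these assumptions, the 
number of receives posted up front has to cover the largest burst that 
can arrive before the application catches up.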
>  > 2. Creation of a Queue Pair is rejected when I have mapped a region of
>  > memory greater than about 1.35GB.
>
> I don't really understand this problem.  Are you able to map more
> memory, and then ibv_create_qp() fails if you do?  Later you say
>
>  > Ideally, we would like to be able to write anywhere within a 2GB
>  > (or larger) shared memory segment. However, when I attempt to do this,
>  > the call to fails with REJ.
>
> You didn't say which call fails with REJ, and I'm not even sure I
> understand what it means to "fail with REJ".
>   
Sorry, I meant to look up in my source code which call was failing but 
forgot to paste it into the question. Yes, I can map 2GB of memory, but 
the call to ibv_create_qp() fails with REJ.
> On x86-64, the limit on how much memory you can register should be
> much higher, closer to 32 GB by default.
>   
That's reassuring. Is there any performance penalty for mapping a 
larger region rather than a smaller one?

>  - R.
>   
Many thanks for the speedy response.

David Robb


