[ofa-general] Infiniband Problems
David Robb
DavidRobb at comsci.co.uk
Thu Jun 21 12:37:41 PDT 2007
Roland Dreier wrote:
> > 1. Sometimes observe RDMA data transfer stalls of ~ 1.0 second
>
> Could it be an RNR NAK? You didn't really describe your protocol, but
> if you use send operations and if you do a send without a matching
> receive on the other side, then you might end up stalling the QP for a
> while.
>
Quite possibly. We are using an IBV_QPT_RC transport type. The code
simply adds another work request with ibv_post_srq_recv(...) after each
packet is processed. Am I correct in thinking it should start out with a
stack of work requests in case another packet arrives before the current
one has been processed?
> > 2. Creation of a Queue Pair is rejected when I have mapped a region of
> > memory greater than about 1.35GB.
>
> I don't really understand this problem. Are you able to map more
> memory, and then ibv_create_qp() fails if you do? Later you say
>
> > Ideally, we would like to be able to write anywhere within a 2GB
> > (or larger) shared memory segment. However, when I attempt to do this,
> > the call to fails with REJ.
>
> You didn't say which call fails with REJ, and I'm not even sure I
> understand what it means to "fail with REJ".
>
Sorry, I meant to look up in my source code which call was failing but
forgot to paste it into the question. Yes, I can map 2GB of memory but
the call to ibv_create_qp() fails with REJ.
> On x86-64, the limit on how much memory you can register should be
> much higher, closer to 32 GB by default.
>
That's reassuring. Are there any performance penalties for mapping a
larger region rather than a smaller one?
> - R.
>
Many thanks for the speedy response.
David Robb