[ofa-general] librdmacm and libmthca question

Dotan Barak dotanb at dev.mellanox.co.il
Wed Oct 10 23:25:26 PDT 2007


Hi.

I can try to answer the questions that relate to the core/verbs.

Doug Ledford wrote:
> OK, I ran into an issue with librdmacm and I was curious what the
> answers to these issues are.
>
> First, the rdma_connect/rdma_accept functions both require a connection
> param struct.  That struct tells librdmacm what you want in terms of
> responder_resources and initiator_depth.  Reading the man page, that's
> the number of outstanding RDMA reads and RDMA atomic operations.  In
> usage, I found that the QP max_recv_wr and max_send_wr are totally
> unrelated to this (I at first thought they could be the same).  In fact,
> on mthca hardware I found the hard limit to be either 4 or 5 (4 worked,
> 6 didn't, didn't try 5, assumed 4).  So even with a send queue depth of
> 128, I couldn't get above a 4 depth on initiator_depth.  I think it
> might be of value to document somewhere that the initiator depth and
> responder resources are not directly related to the actual work queue
> depth, and that without some sort of intervention, are not that high.
>
> However, I spent a *lot* of time tracking this down because the failure
> doesn't occur until rdma_accept time.  Passing an impossibly high value
> in initiator_depth or responder_resources doesn't fail on rdma_connect.
> This leads one to believe that the values are OK, even though they fail
> when you use the same values in rdma_accept.  A note to this effect in
> the man pages would help.
>
> Second, now that I know that mthca hardware fails with initiator depth
> or responder resources > 4, it raises several unanswered questions:
>
> 1) Can this limit be adjusted by module parameters, and if so, which
> ones?
>   
This value is an attribute of the device: there is an upper limit on how 
many outstanding RDMA Reads/Atomics it supports.
The mthca low-level driver is loaded with a default value of 4
(which is less than the device capability), but there is a module 
parameter called rdb_per_qp which can be changed to support a higher value.
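You can also query the actual limits at runtime with ibv_query_device() 
and clamp the connection parameters before calling rdma_connect() or 
rdma_accept(), which avoids the late failure in rdma_accept() described 
above. A minimal sketch (the helper name and wanted_depth parameter are 
mine, just for illustration):

#include <stdint.h>
#include <infiniband/verbs.h>
#include <rdma/rdma_cma.h>

/* Clamp the requested depth to what the device actually supports.
 * With the default mthca module parameters, both limits report 4. */
static int clamp_conn_param(struct rdma_cm_id *id,
                            struct rdma_conn_param *param,
                            uint8_t wanted_depth)
{
    struct ibv_device_attr attr;

    if (ibv_query_device(id->verbs, &attr))
        return -1;

    /* max_qp_init_rd_atom: RDMA Reads/Atomics a QP may have outstanding
     * as the initiator; max_qp_rd_atom: as the responder. */
    param->initiator_depth = wanted_depth < attr.max_qp_init_rd_atom
                                 ? wanted_depth : attr.max_qp_init_rd_atom;
    param->responder_resources = wanted_depth < attr.max_qp_rd_atom
                                     ? wanted_depth : attr.max_qp_rd_atom;
    return 0;
}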

> 2) Does this limit represent the limit on outstanding RDMA READ/Atomic
> operations in a) progress, b) queue, or c) registration?
>   
This value limits the number of RDMA Reads/Atomics that can be processed 
in parallel on this QP.
For example: if you post 100 RDMA Reads and the QP was configured to 
support only 4, then at most 4 RDMA Reads are processed in parallel at 
any time; when one finishes, the next one begins, until all 100 have 
been processed. So the answer is a), in progress.
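
To illustrate (a sketch; the SGE array and remote buffer layout are made 
up), posting more Reads than the negotiated depth is perfectly legal:

#include <string.h>
#include <infiniband/verbs.h>

static int post_reads(struct ibv_qp *qp, struct ibv_sge *sges, int n,
                      uint64_t remote_addr, uint32_t rkey)
{
    for (int i = 0; i < n; i++) {    /* n may exceed the rd_atomic depth */
        struct ibv_send_wr wr, *bad_wr;

        memset(&wr, 0, sizeof(wr));
        wr.wr_id               = i;
        wr.sg_list             = &sges[i];
        wr.num_sge             = 1;
        wr.opcode              = IBV_WR_RDMA_READ;
        wr.send_flags          = IBV_SEND_SIGNALED;
        wr.wr.rdma.remote_addr = remote_addr + (uint64_t)i * sges[i].length;
        wr.wr.rdma.rkey        = rkey;

        /* This only fails if the send queue itself (max_send_wr) is
         * full, not because of initiator_depth: the HCA just executes
         * at most initiator_depth of these Reads in parallel. */
        if (ibv_post_send(qp, &wr, &bad_wr))
            return -1;
    }
    return 0;
}
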
> 3) The answer to #2 implies the answer to this, but I would like a
> specific response.  If I attempt to register more IBV_ACCESS_REMOTE_READ
> memory regions than responder resources, what happens?  If I attempt to
> queue more IBV_WR_RDMA_READ work requests than initiator_depth, what
> happens?  If there are more IBV_WR_RDMA_READ requests in queue than
> initiator_depth and it hits the initiator_depth + 1 request while still
> processing the preceding requests, what happens?
>   
There isn't any connection between the number of Memory Regions that you 
have (no matter which permissions you registered them with) and the 
value that you gave to the QP for handling RDMA Reads/Atomics. (An MR 
can be shared by several QPs.)
As for posting more IBV_WR_RDMA_READ work requests than initiator_depth: 
as explained above, that is fine; the extra requests simply wait in the 
send queue until one of the outstanding Reads completes.

I hope that I helped you with this info.

Dotan


