[ofa-general] retry exceeded problem with rdma_read

Sat Jan 12 23:04:47 PST 2008

Rajouri Jammu wrote:
> I have the following set both on rdma_connect as well as at rdma_accept.
>
>         conn_param.responder_resources = 4;
>         conn_param.initiator_depth = 4;
>
> Should initiator_depth be lower for better behavior?
>
Higher values for those attributes mean that more outstanding RDMA 
Read/Atomic WRs can be handled at the same time.

It doesn't matter which value you put in initiator_depth (for example: 
1 or 4), as long as it is in sync with the responder_resources value of 
the other side.
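
To make the rule concrete, here is a minimal sketch in plain C of the 
consistency check. It assumes that initiator_depth plays the role of 
max_rd_atomic and responder_resources plays the role of 
max_dest_rd_atomic, as in the inequalities quoted below. The struct and 
function names are hypothetical helpers for illustration only; they are 
not part of librdmacm or libibverbs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type (an assumption for illustration, not a
 * librdmacm/libibverbs structure). It holds the two per-side limits. */
struct rd_atomic_limits {
    uint8_t initiator_depth;     /* RDMA Reads/Atomics this side may have in flight */
    uint8_t responder_resources; /* incoming RDMA Reads/Atomics this side can serve */
};

/* Returns true when neither side can issue more outstanding RDMA
 * Reads/Atomics than its peer is prepared to respond to. */
static bool rd_atomic_limits_compatible(const struct rd_atomic_limits *a,
                                        const struct rd_atomic_limits *b)
{
    return a->initiator_depth <= b->responder_resources &&
           b->initiator_depth <= a->responder_resources;
}
```

With 4/4 on both sides, as in your setup, the check passes. It would 
fail if one side kept initiator_depth = 4 while the peer ended up with 
responder_resources = 1. On a live QP, the two values can be read with 
ibv_query_qp() using the IBV_QP_MAX_QP_RD_ATOMIC and 
IBV_QP_MAX_DEST_RD_ATOMIC attribute masks.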

Dotan

> On Jan 9, 2008 10:16 PM, Dotan Barak <dotanb at dev.mellanox.co.il> wrote:
>
>     Rajouri Jammu wrote:
>     > Occasionally, I'm getting a retry exceeded error on the qp (error 12)
>     > when doing rdma_reads.
>     >
>     > Under what conditions would this kind of problem happen?
>     >
>     > I have retry_count = 5 and am using rdma_cm for all the
>     > connection setup.
>     >
>     > OFED version is 1.2.5
>     Does it happen between different HCAs?
>
>     If this happens while working with the QPs (not on the first message),
>     then check the following:
>
>     This may happen if the QP attribute values of max_rd_atomic and
>     max_dest_rd_atomic are not consistent between the two sides.
>
>     The values should be (for sides A and B):
>     A.max_rd_atomic      <= B.max_dest_rd_atomic
>     A.max_dest_rd_atomic >= B.max_rd_atomic
>
>     (which means that the number of outstanding RDMA Reads/Atomics issued
>     as initiator shouldn't be larger than the number the destination supports)
>
>     You can check this by querying the QP in use and verifying those values.
>
>
>
>     If it happens at the beginning of the connection, there may be another
>     problem and I need more info ....
>
>     Dotan
>
>