[ofa-general] retry exceeded problem with rdma_read
Dotan Barak
dotanb at dev.mellanox.co.il
Sat Jan 12 23:04:47 PST 2008
Rajouri Jammu wrote:
> I have the following set both on rdma_connect as well as at rdma_accept.
>
> conn_param.responder_resources = 4;
> conn_param.initiator_depth = 4;
>
> Should initiator_depth be lower for better behavior?
>
Higher values for those attributes mean that more outstanding RDMA
Read/Atomic WRs will be handled.
It doesn't matter which value you put in initiator_depth (for example:
1, 4) as long as it is in sync with the responder_resources value.
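
As a rough sketch of what "in sync" means here (this is not code from
librdmacm): the struct and function names below are hypothetical, and the
struct only mirrors the two relevant fields of struct rdma_conn_param.
The check assumes the rule is that neither side issues more outstanding
RDMA Reads/Atomics than its peer is prepared to accept.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two relevant fields of
 * struct rdma_conn_param; not a real librdmacm type. */
struct rd_params {
    uint8_t responder_resources; /* incoming RDMA Reads/Atomics this side accepts */
    uint8_t initiator_depth;     /* outstanding RDMA Reads/Atomics this side issues */
};

/* Returns true when each side's initiator_depth does not exceed
 * the other side's responder_resources. */
static bool rd_params_in_sync(const struct rd_params *a,
                              const struct rd_params *b)
{
    return a->initiator_depth <= b->responder_resources &&
           b->initiator_depth <= a->responder_resources;
}
```

With 4/4 on both sides, as in your setup, this check passes. It fails
if, say, one side keeps initiator_depth = 4 while the other side lowers
responder_resources to 1.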
Dotan
> On Jan 9, 2008 10:16 PM, Dotan Barak < dotanb at dev.mellanox.co.il
> <mailto:dotanb at dev.mellanox.co.il>> wrote:
>
> Rajouri Jammu wrote:
> > Occasionally, I'm getting a retry exceeded error on the qp (error 12)
> > when doing rdma_reads.
> >
> > Under what conditions would this kind of problem happen?
> >
> > I have the retry_count = 5 and am using rdma_cm for all the
> > connection setup.
> >
> > OFED version is 1.2.5
> Does it happen between different HCAs?
>
> If this happens while working with the QPs (not in the first message)
> then check the following:
>
> If the QP attribute values of max_rd_atomic and max_dest_rd_atomic
> do not match between the two sides, this may happen.
>
> The values should be (for sides A and B):
> A.max_rd_atomic <= B.max_dest_rd_atomic
> A.max_dest_rd_atomic >= B.max_rd_atomic
>
> (which means that the number of RDMA Reads/Atomics issued as initiator
> shouldn't be larger than the value supported by the destination)
>
> You can check it by querying the QP in use and verifying those values.
>
>
>
> If it happens at the beginning of the connection, there may be another
> problem and I need more info ....
>
> Dotan
>
>