[ewg] [Fwd: Re: [ofa-general] IPOIB/CM increase retry counts]
Eli Cohen
eli at dev.mellanox.co.il
Tue Feb 12 23:22:52 PST 2008
Or Gerlitz wrote:
> Did anyone at Cisco, Qlogic, Mellanox, or Voltaire notice the phenomenon
> reported by Shirley in their testing?
>
> Or.
>
> On Tue, 2008-02-12 at 09:00 -0800, Sean Hefty wrote:
>>> Having said all that, I don't think we want to have --any RNR retries--; as
>>> for retries, I am open to hearing what others think.
>> I'm really not all that familiar with ipoib protocol, but if it's being
>> implemented over an RC connection, then adding an RNR retry seems to make sense
>> to me. I believe using UC is better, but if it's over RC, I don't know that we
>> want to take the hit of tearing down and re-establishing the connection just
>> because we have a fast sender. (This is just an opinion based on no fact
>> whatsoever.)
>>
>> - Sean
>
> Did anyone ever run IPoIB-CM (multiple sockets and multiple connections)
> between ipath and mthca or connectX and mthca? I guess there might be a
> similar issue there, mismatched send rates.
>
> thanks
> Shirley
I have seen cases where a fast sender exhausted the TX ring, and I solved this
by increasing the size of the TX queue. I will try connecting ConnectX with
Sinai to see whether such issues occur there.
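
To illustrate the TX ring point, here is a minimal toy model (not IPoIB
code). It assumes a sender that posts a fixed number of packets per tick and
a receiver side that completes fewer per tick; the function name and all
parameters are hypothetical. It only shows that a larger ring can absorb a
finite burst; under a sustained rate mismatch any finite ring eventually
fills.

```python
def simulate_tx_ring(ring_size, burst, post_per_tick, complete_per_tick):
    """Toy model: count posts refused because the TX ring was full.

    All names and numbers are illustrative assumptions, not driver values.
    """
    if min(ring_size, post_per_tick, complete_per_tick) <= 0 or burst < 0:
        raise ValueError("sizes and rates must be positive")

    in_flight = 0      # packets posted but not yet completed
    remaining = burst  # packets the sender still wants to post
    refused = 0        # post attempts that found the ring full

    while remaining > 0 or in_flight > 0:
        # The sender tries to post as much as it can this tick.
        want = min(post_per_tick, remaining)
        free = ring_size - in_flight
        posted = min(want, free)
        refused += want - posted
        in_flight += posted
        remaining -= posted

        # The slower side completes some of the outstanding packets.
        in_flight -= min(complete_per_tick, in_flight)

    return refused


if __name__ == "__main__":
    for size in (64, 128, 256):
        print(size, simulate_tx_ring(size, burst=256,
                                     post_per_tick=8, complete_per_tick=4))
```

With these assumed rates the ring grows by 4 outstanding packets per tick, so
a 256-packet burst needs room for a little over 128 entries before the sender
stops being refused.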