[ewg] [Fwd: Re: [ofa-general] IPOIB/CM increase retry counts]

Tue Feb 12 23:22:52 PST 2008

Or Gerlitz wrote:
> Did anyone at Cisco, QLogic, Mellanox, or Voltaire notice the phenomenon 
> reported by Shirley in their testing?
> 
> Or.
> 
> On Tue, 2008-02-12 at 09:00 -0800, Sean Hefty wrote:
>>> Saying all that, I don't think we want to have --any RNR retries--. As
>>> for retries, I am open to hearing what others think.
>> I'm really not all that familiar with the IPoIB protocol, but if it's being
>> implemented over an RC connection, then adding an RNR retry seems to make sense
>> to me.  I believe using UC is better, but if it's over RC, I don't know that we
>> want to take the hit of tearing down and re-establishing the connection just
>> because we have a fast sender.  (This is just an opinion based on no fact
>> whatsoever.)
>>
>> - Sean
> 
> Did anyone ever run IPoIB-CM (multiple sockets and multiple connections)
> between ipath and mthca, or ConnectX and mthca? I guess there might be a
> similar issue there: mismatched send rates.
> 
> thanks
> Shirley

I have seen cases where a fast sender filled the entire TX ring, and I solved 
this by increasing the size of the TX queue. I will try connecting ConnectX to 
Sinai to see whether the same issue shows up there.
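
As a rough illustration of why a larger TX queue can help with a fast sender,
here is a toy model in Python. It is not IPoIB code: the function name, the
ring sizes, the burst pattern, and the drain rate are all made up for the
sketch. It counts how many packets find the ring full when a bursty sender
posts faster than the hardware completes sends.

```python
def count_full_ring_events(ring_size, bursts, drain_per_tick):
    """Toy model: return how many packets found the TX ring full.

    ring_size      -- number of slots in the (hypothetical) TX ring
    bursts         -- packets the sender tries to post on each tick
    drain_per_tick -- sends the hardware completes on each tick
    """
    in_flight = 0
    rejected = 0
    for burst in bursts:
        # The sender can only post into the free slots of the ring.
        free = ring_size - in_flight
        accepted = min(burst, free)
        rejected += burst - accepted
        in_flight += accepted
        # The hardware completes up to drain_per_tick outstanding sends.
        in_flight -= min(in_flight, drain_per_tick)
    return rejected


# A bursty sender: 96 packets at once, then two idle ticks, twice over.
bursts = [96, 0, 0, 96, 0, 0]
small = count_full_ring_events(64, bursts, 32)
large = count_full_ring_events(128, bursts, 32)
print(small, large)
```

In this model the larger ring absorbs the bursts that overflow the smaller
one. A bigger ring only buys headroom, though: if the sender stays faster than
the drain rate for long enough, any finite ring fills up eventually.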