[ofa-general] IPOIB/CM increase retry counts

Or Gerlitz ogerlitz at voltaire.com
Tue Feb 12 23:20:48 PST 2008


Sean Hefty wrote:
> I'm really not all that familiar with ipoib protocol, but if it's being
> implemented over an RC connection, then adding an RNR retry seems to make sense
> to me.  I believe using UC is better, but if it's over RC, I don't know that we
> want to take the hit of tearing down and re-establishing the connection just
> because we have a fast sender.  (This is just an opinion based on no fact
> whatsoever.)

Hi Sean,

As I see it, the issue here is that, from the viewpoint of the upper
layers (TCP, UDP, etc.), the IP service is expected to be unreliable.
Layers that do need reliability, such as TCP, add it in their own
protocol, so adding it at the IP layer or below (e.g. in IPoIB or the
HW it uses) is in a way redundant, since the upper layer is not aware
of it.

For example, when a NIC computes the TCP checksum, the TCP stack
doesn't, whereas here both layers would take care of reliability. Also,
applications written over unreliable transports such as UDP, e.g. video
streaming, might see a negative impact on their performance.

With all that, I am not religiously against adding the retries...
however, I prefer to understand the original problem, which seems to be
an issue related to HCA interoperability, before putting the solution
in the code. We both agree that UC is the way to go, and in that case
the real problem would pop up again, but higher layers would have to
take care of it.

As for your fast sender comment, does this mean that you see the HCA as
an entity that does queuing?

Or.




