[openib-general] Re: TSO and IPoIB performance degradation
Michael S. Tsirkin
mst at mellanox.co.il
Mon Mar 20 01:06:29 PST 2006
Quoting r. David S. Miller <davem at davemloft.net>:
> > well, there are stacks which do "stretch acks" (after a fashion) that
> > make sure when they see packet loss to "do the right thing" wrt sending
> > enough acks to allow cwnds to open again in a timely fashion.
>
> Once a loss happens, it's too late to stop doing the stretch ACKs, the
> damage is done already. It is going to take you at least one
> extra RTT to recover from the loss compared to if you were not doing
> stretch ACKs.
>
> You have to keep giving consistent well spaced ACKs back to the
> receiver in order to recover from loss optimally.
Is it the case then that this requirement is less essential on
networks such as IP over InfiniBand, which are very low latency
and essentially lossless (with explicit congestion notifications
in hardware)?
> The ACK every 2 full sized frames behavior of TCP is absolutely
> essential.
Interestingly, I was pointed towards the following RFC draft
http://www.ietf.org/internet-drafts/draft-ietf-tcpm-rfc2581bis-00.txt
The requirement that an ACK "SHOULD" be generated for at least every
second full-sized segment is listed in [RFC1122] in one place as a
SHOULD and another as a MUST. Here we unambiguously state it is a
SHOULD. We also emphasize that this is a SHOULD, meaning that an
implementor should indeed only deviate from this requirement after
careful consideration of the implications.
And as Matt Leininger's research appears to show, stretch ACKs
are good for performance in the case of IP over InfiniBand.
Given all this, would it make sense to add a per-netdevice (or per-neighbour)
flag to re-enable the trick for these net devices (as was done before
314324121f9b94b2ca657a494cf2b9cb0e4a28cc)?
The IP over InfiniBand driver would then simply set this flag.
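To make the idea concrete, here is a rough, self-contained sketch of the
decision such a flag would gate. It is not kernel code: the names
hyp_netdev, HYP_DEV_STRETCH_ACK, hyp_ack_now and hyp_count_acks are all
made up for illustration, and the "ACK once the pending batch is drained"
rule is only an assumed stand-in for the earlier behaviour.

```c
#include <stdbool.h>

/* Hypothetical per-device bit; NOT an existing kernel flag. */
#define HYP_DEV_STRETCH_ACK 0x1u

/* Hypothetical stand-in for a network device structure. */
struct hyp_netdev {
	unsigned int flags;
};

/*
 * Decide whether the receiver should send an ACK now.
 *   unacked:       full-sized segments received since the last ACK
 *   queue_drained: true once the pending receive batch has been consumed
 */
static bool hyp_ack_now(const struct hyp_netdev *dev,
			unsigned int unacked, bool queue_drained)
{
	/* Flagged devices (assumed low latency, near lossless): one ACK per batch. */
	if (dev->flags & HYP_DEV_STRETCH_ACK)
		return queue_drained && unacked > 0;

	/* Default: ACK at least every second full-sized segment. */
	return unacked >= 2;
}

/*
 * Count the ACKs generated for nsegs segments that arrive in batches of
 * 'batch' segments (batch must be nonzero). A trailing odd segment in the
 * default mode is left to the delayed-ACK timer and is not counted here.
 */
static unsigned int hyp_count_acks(const struct hyp_netdev *dev,
				   unsigned int nsegs, unsigned int batch)
{
	unsigned int acks = 0, unacked = 0, i;

	for (i = 1; i <= nsegs; i++) {
		bool drained = (i % batch == 0) || (i == nsegs);

		unacked++;
		if (hyp_ack_now(dev, unacked, drained)) {
			acks++;
			unacked = 0;
		}
	}
	return acks;
}
```

In this toy model, 64 segments arriving in batches of 16 produce 32 ACKs
on an unflagged device and 4 ACKs on a flagged one; whether that reduction
is safe is exactly the question above.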
David, would you accept such a patch? It would be nice to get 2.6.17
back to within at least 10% of 2.6.11.
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies