[ofa-general] [PATCHv2] IB/ipoib: S/G and HW checksum support

Michael S. Tsirkin mst at dev.mellanox.co.il
Tue Sep 4 23:19:13 PDT 2007


 
> > We can always go there, and it would be easy to
> > enable this per-destination for datagram mode, too, by setting a
> > bit in HW address, and thus enabling inter-operability with IETF
> > compliant ipoib (at a slower rate, since we'll have to do an extra
> > pass over data to calculate the checksum in software).
> 
> Right, combine this with enforcing correct checksum on all TX
> multicast and this looks much better.

OK, I'll think about it some more, but the option is
definitely there.

> I don't think adding a conditional call to skb_checksum_help in the
> ipoib tx path will not make performance any worse

No, it's skb_checksum_help that will be expensive, unfortunately.

> for those packets
> than it is today - dev_queue_xmit today calls skb_checksum_help on
> behalf of ipoib for every packet.

I don't think it does, normally: the packets it gets now usually
have CHECKSUM_COMPLETE.

> Also, my other thought was about the RX path, it should work more like
> 
>    if (header->flags & cpu_to_be16(IPOIB_HEADER_F_HWCSUM))
>        ip_summed = CHECKSUM_PARTIAL     // Sender says the csum is bad
>    else
>      if (enabled_hw_csum_support)
>         ip_summed = CHECKSUM_UNNECESSARY // Sender says the csum should be good

Hmm. Where does this last line come from? It looks wrong ...

>      else
>         ip_summed = CHECKSUM_NONE;   // Force checking
> 
> (Of course, if the underlying hardware supports checksum offload then
>  the hardware's calculation should just unconditionally be used on the
>  rx path)

Using hardware checksum offload is a separate issue.
For now I'm focusing on working on top of verbs.

> Tx is more like:
> 
> header->flags = 0;
> if (ip_summed == CHECKSUM_PARTIAL)
>    if (destination_is_compatible)
>       header->flags = cpu_to_be16(IPOIB_HEADER_F_HWCSUM);
>    else
>       skb_checksum_help(skb);
> 
> (And again, if the HW supports offload, then don't bother with
>  F_HWCSUM)

It's not that simple: F_HWCSUM is also a hint for RX side,
so it might be a win if the *remote* does not have RX checksum
offloading.

At least for now, I find it much easier to reason about the
hwcsum feature if it behaves more or less identically for
all hadrware.

Once it is upstream, and once Eli's connectx
checksum offloading patches are, we'll try to think about
doing smart tricks like using offloading with some packets
while using F_HWCSUM for others.

> One thing I'm missing here is why care about UD csum performance?
> ConnectX fixes it, and using RC on older cards with something like
> this patch will give best possible speed. Why use an older card and UD
> without csumming and then care about speed? Why not make it RC only
> where it can be done safely and compatibly? RC already gives a good
> uptick over UD, so if you are concerned by speed, you are already
> using it - right?

multicast is where we are forced to datagram mode.

But yes, maybe I should ignore multicast speed for now, and say
that it will get fixed by hardware offloading in the future.

Thanks for the comments.

-- 
MST



More information about the general mailing list