[ofw] changes to ib_wc committed to svn

Fab Tillier ftillier at windows.microsoft.com
Thu Jul 24 09:11:36 PDT 2008


Hi Tzachi,

> -----Original Message-----
> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> Sent: Thursday, July 24, 2008 12:58 AM
>
> Hi Fab,
>
>> You've also lost the ability to eliminate receive checksum
>> validation.  IB has 32-bit CRC which would cause any
>> corrupted packets to get dropped. If you assume that packets
>> are being put on the wire correctly, then if they are
>> received at all they haven't been corrupted.
>
> This approach has a few problems.
> First, Ethernet also has CRC in it and still IP check sum is used on
> Ethernet. This means that the users of IP believe that there still
> should be an end to end check.
> Second, your approach works only inside a small cluster that is all
> windows. If the cluster has Linux machines in it they will through the
> packets.

No, the checksum is still calculated and set properly by the sender.  I agree that if you enable send checksum offload and don't perform the calculation then the packets can get discarded at the receiver.  However, if you let the OS do send checksums, you can pass packets to the stack at the receiver as if checksum had been done.  This should interop just fine with Linux since the checksums in the packet should be correct.

> If there is a gateway to Ethernet, that gateway should be
> configured to do the checksum once packets are moved to the Ethernet. I
> don't think there is such a gateway.

This only applies if you don't calculate checksums when putting a packet on the wire.  Receive checksum calculation is independent.

> In any case, I believe that we should have 3 possibilities:
> 1) Never do checksum in hw. Windows will always do checksum.
> 2) Never do checksum by windows. The hw will do checksum if it can or
> send packets with bad checksum if the card doesn't support that.

This one I think can be refined - you can enable receive checksum 'offload' without packet drop.  Enabling send checksum 'offload' without hardware support will put packets on the wire with incorrect checksums though, and I think that should be discouraged.

> 3) [default] Do checksum in HW if the hw supports that and do it by
> software if HW doesn't support that.
>
> This will give anyone the chance to choose the right option for him.

Agreed, this should be configurable.

>> What happens if these options are set and the underlying HCA
>> doesn't support checksum offload?  Does the WR just get sent
>> or does it fail?
>
> In the existing code, this flags are not checked by other drivers so
> things should be fine. This might have to be changed once the 3 modes
> suggested above will be available.

It seems to me that checksum offload should be a QP attribute, not a work request option.  That way clients can do a check to see if the CA attributes indicate the HW supports checksum offload, and then create their QP accordingly.

Ignoring send flags is a bit odd - right now none of the flags get ignored do they?  If someone requests a send with immediate, it's expected that the send will happen with immediate.  Sending the data without immediate data would affect the wire protocol.

>> All in all, I think using the vendor specific field for
>> checksum notifications seems lacking proper design, and more
>> importantly, community input. A few issues from a quick glance:
>> - it should be clear that it matches the
>> NDIS_TCP_IP_CHECKSUM_PACKET_INFO definition.
>
> Indeed it matches this field exactly. This was done to avoid doing un
> needed work on the data path.

A comment to that extent would be good.

>> - either checksum offload is vendor specific or it isn't.  If
>> it is, then:
>>         1. the IB_SEND_OPT_TX_XXX_CSUM values should be in the
>>         IB_SEND_OPT_VEND_MASK range 2. The ib_wc_t structure stays
>>         unchanged 3. IPoIB checks which HCA it runs on and has
>>         HCA-specific code to handle checksum offload. 4. ib_ca_attr_t
>>         is unchanged
>> If it isn't then the vendor_specific field should get
>> overloaded with non-vendor specific information and csum_ok
>> should be put into the recv.ud part of the ib_wc_t structure.
>> - MTHCA needs to be updated so that it properly clears csum_ok.
>  What is your suggestion, do you want to put it as vendor specific or
> not? I suggest to have it non vendor specific.

I think it makes sense as non-vendor specific.  So think I'd like to see:
- checksum offload capabilities are reported in CA attributes.
- checksum offload as QP creation attribute (do you support this for both UD and RC QPs?  That is, can IPoIB-CM take advantage of this too?).  If offload is requested but HW doesn't support, QP creation should fail.  Clients should be expected to check the capabilities of the HW.
- checksum results are reported in ib_wc_t::recv.ud.csum_ok
- ib_wc_t::vendor_specific remains as a uint64_t.

-Fab



More information about the ofw mailing list