[ofa-general] Re: IPoIB forwarding

Bryan Lawver lawver1 at llnl.gov
Fri Apr 27 15:43:54 PDT 2007


I already had so much debugging turned on that traffic was slow to begin 
with, so the remedy was not the "slowing of the traffic" but the 
"non-coalescing".  The NIC is a Myricom NIC, and these options are easy 
to set.
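
Since it is still unclear which of the offloads was responsible, one way 
to narrow it down is to switch them off one at a time and retest after 
each.  A minimal sketch, assuming the interface is named eth2 (a 
placeholder, not taken from this thread) and that the driver accepts the 
same rx/tx/sg/tso keywords quoted below.  The function only prints the 
commands so they can be reviewed and then run by hand as root:

```shell
#!/bin/sh
# Print one "ethtool -K" command per offload feature so each can be
# disabled and tested separately. Nothing is executed here.
# "eth2" is a placeholder interface name (an assumption).
offload_cmds() {
    iface="$1"
    # rx  = receive checksum offload
    # tx  = transmit checksum offload
    # sg  = scatter-gather
    # tso = TCP segmentation offload
    for feature in rx tx sg tso; do
        echo "ethtool -K $iface $feature off"
    done
}

offload_cmds eth2
```

Running "ethtool -k eth2" (lower-case k) between steps lists the current 
offload settings, which helps confirm what each command actually changed.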


At 03:32 PM 4/27/2007, Rick Jones wrote:
>Bryan Lawver wrote:
>>I hit the IP NIC over the head with a hammer and turned off all offload 
>>features; I no longer get the super jumbo packet and I have symmetric 
>>performance.  This NIC supports "ethtool -K ethX tso/tx/rx/sg on/off". 
>>I am not sure at this time which one I needed to whack, but turning 
>>them all off solved the problem.
>
>Yeah, that does seem like a rather broad remedy, but I guess if it 
>works... :) And I suppose most of those offloads don't matter for a NIC 
>being used in a router.
>
>Only problem is we don't know if it worked because it slowed down the 10G 
>side or because it disabled LRO as a side effect. If I were to guess, 
>of those things listed, I'd guess that disabling receive checksum offload 
>(cko) would have that as a side effect.
>
>Just what sort of 10G NIC was this anyway?  With that knowledge we could 
>probably narrow things down to a more specific modprobe setting, or maybe 
>even an ethtool command, for some suitable revision of ethtool.
>
>rick jones
>
>>Thanks for listening and reinforcing my search process.
>>bryan
>>At 01:32 PM 4/27/2007, Rick Jones wrote:
>>
>>>Bryan Lawver wrote:
>>>
>>>>You're right about the ipoib module not combining packets (I believed 
>>>>you, but I checked nevertheless).  The ipoib_start_xmit routine is 
>>>>definitely handed a "double packet", which means that the IP NIC driver 
>>>>or the kernel is combining two packets into a single super jumbo 
>>>>packet.  This issue is independent of the IP MTU setting: I have set 
>>>>all interfaces to 9000, yet ipoib accepts this 17964-byte packet and 
>>>>forwards it to the next IB node and on to the TCP stack, where it is 
>>>>never acknowledged.  This may not have come up in prior testing because 
>>>>I am using some of the fastest IP NICs, which have no trouble keeping up 
>>>>with or exceeding the bandwidth of the IB side.
>>>>This issue arises exactly every 8 packets... (ring buffer overrun??)
>>>>I will be at Sonoma for the next few days, as will many on this list.
>>>
>>>
>>>
>>>Some NICs (esp 10G) support large receive offload - they coalesce TCP 
>>>segments from the wire/fiber into larger ones they pass up the stack.
>>>Perhaps that is happening here?
>>>
>>>I'm going to go out a bit on a limb, cross the streams, and include 
>>>netdev, because I suspect that if a system is acting as an IP router, 
>>>one doesn't want large receive offload enabled.  That may need some 
>>>discussion in netdev - it may then require some changes to default 
>>>settings or some documentation enhancements.  That or I'll learn that 
>>>the stack is already dealing with the issue...
>>>
>>>rick jones
>>>
>>>>bryan
>>>>
>>>>At 11:06 AM 4/26/2007, Michael S. Tsirkin wrote:
>>>>
>>>>> > Quoting Bryan Lawver <lawver1 at llnl.gov>:
>>>>> > Subject: Re: IPoIB forwarding
>>>>> >
>>>>> > Here's a tcpdump of the same sequence.  The TCP MSS is 8960 and it
>>>>> > appears that two payloads are queued at ipoib, which combines them
>>>>> > into a single 17920 payload with presumably correct IP header (40)
>>>>> > and IB header (4).  The application or TCP stack does not acknowledge
>>>>> > this double packet, i.e. it does not ACK until each of the 8960
>>>>> > packets is resent individually.  Being an IB newbie, I am guessing
>>>>> > this combining is allowable but may violate TCP protocol.
>>>>>
>>>>>IPoIB does nothing like this - it's just a network device so
>>>>>it sends all packets out as is.
>>>>>
>>>>>--
>>>>>MST
>>>>
>>>>
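
The sizes quoted in the thread are consistent with two full-MSS segments 
being merged behind a single set of headers.  A quick check of the 
arithmetic, assuming the "IP header (40)" means a 20-byte IPv4 header plus 
a 20-byte TCP header without options (an interpretation, not stated in the 
thread):

```shell
#!/bin/sh
mss=8960         # TCP MSS seen in the tcpdump
hdr_ip_tcp=40    # assumed: 20-byte IPv4 + 20-byte TCP header, no options
hdr_ipoib=4      # IPoIB encapsulation header

# One normal segment, and two payloads merged behind one set of headers
single=$(( mss + hdr_ip_tcp + hdr_ipoib ))
double=$(( 2 * mss + hdr_ip_tcp + hdr_ipoib ))

echo "single=$single double=$double"   # prints: single=9004 double=17964
```

The merged size matches the 17964-byte packet handed to ipoib_start_xmit, 
and a single segment without the 4-byte IPoIB header comes to 9000, the 
MTU configured on the interfaces.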