[ofa-general] IPoIB arps disappearing

Eli Cohen eli at dev.mellanox.co.il
Mon Jul 14 04:11:00 PDT 2008


On Thu, Jul 10, 2008 at 04:30:11AM -0400, Michael Di Domenico wrote:
> I'm having a bit of a weird problem that I cannot figure out.  If anyone
> from the community can help, it would be appreciated.
> Here's the packet flow
> 
> cn(ib0)->io(ib0)->io(eth5)->pan(*)
> 
> cn = compute node
> io = io node
> pan = panasas storage network
> 
> We have 12 shelves of panasas network storage on a separate network, which
> is being fronted by bridge servers which are routing IPoIB traffic to 10G
> ethernet traffic.  We're using Mellanox Connect-X Ethernet/IB adapters
> everywhere.  We're running OFED 1.3.1 and the latest firmwares for IB/Eth
> everywhere.
> 
> Here's the problem.  I can mount the storage on the compute nodes, but if I
> try to send anything more than 50MB of data via dd, I seem to lose the ARP
> entries for the compute nodes on the IO servers.  This seems to happen
> whether I use the filesystem or a netperf run from the compute node to the
> panasas storage.
> 
> I can run netperf between the compute node and io node and get full IPoIB
> line rate with no issues
> I can run netperf between the io node and the panasas storage and get full
> 10G ethernet line rate with no issues
> 
> When looking at the TCP traces, I can clearly see that a big chunk of data
> is sent between the end-points and then it stalls.  Immediately after the
> stall is an ARP request and then another chunk of data, and this scenario
> repeats over and over.
> 
> Any thoughts or questions?
> 

For the benefit of all: the issue was resolved by loading the mtnic
module with lro=0. As general guidance, LRO should be disabled on
hosts that are configured to route packets. The reason is that LRO
aggregates IP packets belonging to the same TCP stream into a single
IP packet made up of a list of SKBs, and in doing so it does not
recalculate the TCP checksum. Since routing is done at the IP layer,
the checksum on outgoing packets is not recalculated either, and the
final destination receives packets with a bad TCP checksum.
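
To illustrate the checksum problem described above, here is a minimal
Python sketch. It is a simplified model, not the mtnic driver or the
kernel LRO code: the addresses, ports, payloads, and helper names are
all hypothetical, and the "aggregation" is just a byte concatenation.
It only shows that a TCP checksum computed for one segment no longer
verifies once a second segment's payload is appended without
recalculating it.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Ones' complement sum of 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src: bytes, dst: bytes, tcp_len: int) -> bytes:
    """IPv4 pseudo-header covered by the TCP checksum (protocol 6)."""
    return src + dst + struct.pack("!BBH", 0, 6, tcp_len)

def tcp_segment(src: bytes, dst: bytes, seq: int, payload: bytes) -> bytes:
    """Build a minimal 20-byte TCP header plus payload with a valid checksum."""
    # Hypothetical ports and flags; the checksum field starts out as zero.
    header = struct.pack("!HHIIBBHHH",
                         40000, 2049, seq, 0, 5 << 4, 0x10, 65535, 0, 0)
    csum = internet_checksum(
        pseudo_header(src, dst, len(header) + len(payload)) + header + payload)
    # The checksum field sits at byte offsets 16..17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:] + payload

def verify(src: bytes, dst: bytes, segment: bytes) -> bool:
    """A receiver accepts the segment only if the checksum folds to zero."""
    return internet_checksum(pseudo_header(src, dst, len(segment)) + segment) == 0

# Hypothetical endpoints: a compute node and a storage address.
SRC = bytes([10, 0, 0, 1])
DST = bytes([192, 168, 0, 1])

payload_a = b"A" * 8
payload_b = b"B" * 8

seg_a = tcp_segment(SRC, DST, 1000, payload_a)
seg_b = tcp_segment(SRC, DST, 1008, payload_b)

# Model of aggregation without recalculation: keep the first segment's
# header (and its checksum) and append the second segment's payload.
merged_stale = seg_a + seg_b[20:]

# What a forwarding host would need to send instead: the checksum
# recomputed over the combined payload.
merged_fixed = tcp_segment(SRC, DST, 1000, payload_a + payload_b)

print(verify(SRC, DST, seg_a))         # True
print(verify(SRC, DST, merged_stale))  # False
print(verify(SRC, DST, merged_fixed))  # True
```

In this model the stale checksum fails verification at the receiver,
which matches the symptom of the final destination seeing packets
with a bad TCP checksum.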


