[ofw] ping on WinOF

Wed Jul 1 23:51:36 PDT 2009

Thanks for your info -  Rupert Dance.

However, it seems you are talking about Linux, while we were talking
about Windows, or am I missing something?

Thanks
Tzachi

________________________________

	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Rupert Dance
	Sent: Wednesday, July 01, 2009 7:27 PM
	To: 'David Brean'; ofw at lists.openfabrics.org
	Subject: RE: [ofw] ping on WinOF

	We have seen issues with IPoIB in datagram mode, particularly
when you use a large packet size (8192 and greater).  This was reported
in the OFA Bugzilla as Bug #1287
<https://bugs.openfabrics.org/show_bug.cgi?id=1287>. Yosef Etigin
looked into this and suggested a workaround that did affect the first
packet drop. Here is his comment:

	It is a network stack limitation and not related ipoib in
particular.

	There's a limit (default = 3) on number of pending skb's before
a neighbour is resolved. You can increase it with sysctl
net.ipv4.neigh.ib0.unres_qlen.

	Obviously, same thing happens with Ethernet interface.

	When testing at UNH-IOL for the Logo program, this is what we
did:

	After working with Sasha Khapyorsky on this issue we have a
working fix. To further explain the situation, the large packet sizes we
are using are overflowing the buffers, so there is no room to append the
ARP request on to the beginning of the cmd. This results in a dropped
packet because the system doesn't know how to get to the destination due
to an empty ARP table. The fix is to increase the queue length via:

	sysctl net.ipv4.neigh.ib0.unres_qlen=17 # default is the value 3
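
A minimal sketch of the same setting for other interface names, for
Linux hosts only (it does not apply to the WinOF/Windows case raised in
the original question). The helper names unres_qlen_key and
unres_qlen_cmd are hypothetical and exist only for this illustration;
ib0 and 17 are the values quoted above. The script prints the command
rather than running it, so it changes nothing on the host:

```shell
#!/bin/sh
# Hypothetical helper: build the sysctl key for a given interface name.
unres_qlen_key() {
    printf 'net.ipv4.neigh.%s.unres_qlen\n' "$1"
}

# Hypothetical helper: print (not run) the command that would raise the
# limit on queued packets awaiting neighbour resolution.
unres_qlen_cmd() {
    printf 'sysctl -w %s=%s\n' "$(unres_qlen_key "$1")" "$2"
}

# Show the current value if this host has the interface; otherwise say so.
key=$(unres_qlen_key ib0)
sysctl -n "$key" 2>/dev/null || echo "$key is not available on this host"

# Print the command from the message above for review before running it.
unres_qlen_cmd ib0 17
```

Running the printed command as root applies the value until the next
reboot; adding the same key=value line to /etc/sysctl.conf is the usual
way to make it persistent.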

	Thanks

	Rupert Dance

	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of David Brean
	Sent: Wednesday, July 01, 2009 11:39 AM
	To: ofw at lists.openfabrics.org
	Subject: [ofw] ping on WinOF

	Hello,

	An internal customer is using WinOF 2.0.X and has reported to me
the following behavior related to IPoIB and ping:

	Do you have any ideas on why a Windows 2008 client with an HCA
may initially time out when pinging other clients on the fabric?

	Initially ping fails but then starts working.

	Example: ping is invoked three times in succession.

	C:\GRITS>ping -a 192.168.100.235

	Pinging 192.168.100.235 with 32 bytes of data:
	Request timed out.
	Request timed out.
	Request timed out.
	Request timed out.

	Ping statistics for 192.168.100.235:
	   Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

	C:\GRITS>ping -a 192.168.100.235

	Pinging 192.168.100.235 with 32 bytes of data:
	Request timed out.
	Request timed out.
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255

	Ping statistics for 192.168.100.235:
	   Packets: Sent = 4, Received = 2, Lost = 2 (50% loss),
	Approximate round trip times in milli-seconds:
	   Minimum = 0ms, Maximum = 0ms, Average = 0ms

	C:\GRITS>ping -a 192.168.100.235

	Pinging 192.168.100.235 with 32 bytes of data:
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255
	Reply from 192.168.100.235: bytes=32 time<1ms TTL=255

	Ping statistics for 192.168.100.235:
	   Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
	Approximate round trip times in milli-seconds:
	   Minimum = 0ms, Maximum = 0ms, Average = 0ms

	Then we are good for some time before this starts again if the
network is idle on the fabric.

	Has this sort of behavior been observed before?  The Linux and
Solaris nodes sharing the same IP subnet appear to be behaving normally.
The Windows server is in the "out-of-the-box" configuration, with a
Voltaire switch configured with only the default partition (0xFFFF).

	-David
