[ofw] RE: Problem on multicast flow

Tzachi Dar tzachid at mellanox.co.il
Tue Nov 4 04:48:12 PST 2008


Hi,
 
On Server 2003 I noticed that your proposed method didn't work for
me, on either SP1 or SP2.
This is probably a known bug in the OS. Using the method recommended by
MS did the trick (http://support.microsoft.com/default.aspx/kb/815752)
 
For 2008, things worked better. (There are still other opinions on how
to enable IGMPv2, for example:
http://www.windowsreference.com/windows-vista/control-multicast-support-and-set-igmp-version-in-windows-server-2008-vista/
which I didn't check myself.)
 
Please note that there is no need to start RRAS, only to install it.
 
Stan, please add something like this in release notes (probably nothing
that can be done in this release):
 
In this version, multicast on IPoIB will only work if the machine is
configured to use IGMP V2 (and not V3, which is the default).
To configure your machine to use IGMP V2, please do the following:
 
On 2003/XP:
netsh routing ip igmp install

netsh routing ip igmp add interface "interface name of IPoIB adapter" igmpprototype=igmprtrv2

If IGMP V3 is still used, please follow the instructions at
(http://support.microsoft.com/default.aspx/kb/815752)

 

On 2008:

servermanagercmd.exe -install NPAS-RRAS-Services
netsh routing ip igmp install

netsh routing ip igmp add interface "interface name of IPoIB adapter" igmpprototype=igmprtrv2

 

As for the future releases:

It seems that the functionality of this release is broken. That is,
multicast used to work by default and now it doesn't.

Can you please add support for the two most common IGMP V3 messages
("change to exclude mode" and "change to include mode")?
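
As a rough sketch of what handling those two messages could involve: RFC 3376 numbers the IGMPv3 group record types 1 to 6, and with an empty source list "change to exclude mode" amounts to a join while "change to include mode" amounts to a leave. The function name and the MCAST_* result codes below are hypothetical and are not taken from the IPoIB driver.

```c
#include <stdint.h>

/* IGMPv3 group record types, as numbered in RFC 3376. */
enum igmpv3_record_type {
    IGMPV3_MODE_IS_INCLUDE        = 1,
    IGMPV3_MODE_IS_EXCLUDE        = 2,
    IGMPV3_CHANGE_TO_INCLUDE_MODE = 3,
    IGMPV3_CHANGE_TO_EXCLUDE_MODE = 4,
    IGMPV3_ALLOW_NEW_SOURCES      = 5,
    IGMPV3_BLOCK_OLD_SOURCES      = 6
};

/* Hypothetical result codes, for illustration only. */
enum mcast_action { MCAST_IGNORE = 0, MCAST_JOIN = 1, MCAST_LEAVE = 2 };

/*
 * Classify a single group record from an IGMPv3 membership report.
 * Only the two record types named above are handled, and only when
 * the source list is empty (no source filtering).
 */
enum mcast_action classify_igmpv3_record(uint8_t record_type,
                                         uint16_t num_sources)
{
    if (num_sources != 0)
        return MCAST_IGNORE;            /* source filtering not handled */

    switch (record_type) {
    case IGMPV3_CHANGE_TO_EXCLUDE_MODE: /* EXCLUDE {} : join the group */
        return MCAST_JOIN;
    case IGMPV3_CHANGE_TO_INCLUDE_MODE: /* INCLUDE {} : leave the group */
        return MCAST_LEAVE;
    default:
        return MCAST_IGNORE;
    }
}
```

A driver would still have to walk the records inside the report and take the group address from each one; that part is omitted here.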

 

Thanks

Tzachi

 
 



________________________________

	From: Anatoly Greenblatt [mailto:anatolyg at voltaire.com] 
	Sent: Tuesday, November 04, 2008 12:01 PM
	To: Slava Strebkov; Tzachi Dar; ofw at lists.openfabrics.org
	Cc: Yiftah Shahar
	Subject: RE: [ofw] RE: Problem on multicast flow
	
	

	Hi,

	 

	On Windows 2008 run "servermanagercmd.exe -install NPAS-RRAS-Services".
	This installs the Routing and Remote Access packages.

	We have clients that are already using multicast in production
	environments, as you see. Changing this not only breaks functionality but
	also requires additional development in the SM or the switch (wherever
	multicast groups are managed).

	 

	Regards,

	Anatoly.

	 

	
________________________________


	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Slava Strebkov
	Sent: Tuesday, November 04, 2008 11:39
	To: Tzachi Dar; ofw at lists.openfabrics.org
	Subject: [ofw] RE: Problem on multicast flow

	 

	Hi,

	1)

	netsh>routing ip igmp

	install

	add interface "interface name of IPoIB adapter"
igmpprototype=igmprtrv2

	exit

	 

	On 2008 you may need to enable "Routing and Remote Access" from
	Server Manager => Network Policy and Access => Add Roles before using
	netsh.

	 

	2) We saw interoperability with Linux hosts as well.

	 

	Slava

	
________________________________


	From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
	Sent: Tuesday, November 04, 2008 11:32 AM
	To: Slava Strebkov; ofw at lists.openfabrics.org
	Subject: RE: Problem on multicast flow

	 

	1) How do you force your servers to use IGMP v2?

	 

	2) Will your method interop with Linux as well? I'm looking in
	the spec to find a match between IPv4 multicast addresses and IB
	multicast GIDs, but with no success.

	In any case, even if there is no enforcement in the spec, I'm
	afraid that if they are using different MGIDs then things won't work.

	 

	Thanks

	Tzachi

		 

		
________________________________


		From: Slava Strebkov [mailto:slavas at voltaire.com] 
		Sent: Tuesday, November 04, 2008 11:12 AM
		To: Tzachi Dar; ofw at lists.openfabrics.org
		Subject: RE: Problem on multicast flow

		Hi,

		The attached test was compiled and run on server 2003
x64 and server 2008 x86.

		I used IPoIB without your patch, as is.

		I got at the end 

		Pass percentage: 100.000000 on both sides, which means no
		problem (am I right?).

		 

		We changed 

		  mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
		instead of

		mcast_req.member_rec.mgid.raw[12] = 0;

		to avoid different IP addresses being mapped onto the same MAC.

		 

		Using IGMPv3 is not recommended, since the IB join is always
		made on 224.0.0.22 and not on the actual mcast group (e.g. 239.0.0.2).

		In our tests we force the servers to use igmp v2.

		Please check same test with igmp v2 on both sides.

		 

		 

		Slava

		 

		 

		 

		
________________________________


		From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
		Sent: Monday, November 03, 2008 7:20 PM
		To: ofw at lists.openfabrics.org; Slava Strebkov
		Subject: FW: Problem on multicast flow

		 

		Resending without the executables attached (due to
antivirus enforcement)

		 

		Anyone who wants the executables, please call me
		directly.

		 

		Thanks

		Tzachi

		 

		
________________________________


		From: Tzachi Dar 
		Sent: Monday, November 03, 2008 7:16 PM
		To: ofw at lists.openfabrics.org; 'slavas at voltaire.com'
		Subject: Problem on multicast flow

		Hi Voltaire and anyone who can help !

		 

		Over the last day we have been working on a problem with a
		simple multicast test that doesn't work.

		Test is attached at the end of the mail.

		 

		This test used to work in the past but no longer does.

		 

		Looking at the current state of things, it seems that the
		changes made in checkin 1450 are the root of the problem.

		 

		It seems that the mechanism that maps MAC addresses and
		IP addresses into IB multicast groups was broken.

		This happened when ipoib_port_join_mcast was changed
		from:

		 

		  mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
		  mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
		  mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
		  mcast_req.member_rec.mgid.raw[15] = mac.addr[5];

		to 

		  mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
		  mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
		  mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
		  mcast_req.member_rec.mgid.raw[15] = mac.addr[5];

		It seems that now mac.addr[1] is not always 0 as it used
		to be. Instead, this byte is taken from the IP address.
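
To make the difference between the two variants concrete, here is a minimal sketch of how the last four MGID bytes are filled from the MAC address in each case. The helper name fill_mgid_tail and its use_mac_byte1 flag are hypothetical; only the four assignments mirror the driver lines quoted above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fill bytes 12..15 of a 16-byte MGID from a
 * 6-byte MAC address.
 *   use_mac_byte1 == 0 : older behaviour, raw[12] is always 0
 *   use_mac_byte1 != 0 : newer behaviour, raw[12] is mac[1]
 */
void fill_mgid_tail(uint8_t mgid_raw[16], const uint8_t mac[6],
                    int use_mac_byte1)
{
    mgid_raw[12] = use_mac_byte1 ? mac[1] : 0;
    mgid_raw[13] = mac[3];
    mgid_raw[14] = mac[4];
    mgid_raw[15] = mac[5];
}
```

As long as mac[1] is 0 the two variants produce the same MGID. They diverge as soon as mac[1] carries bits taken from the IP address, so two hosts using different variants would end up in different IB multicast groups.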

		 

		More than that, it seems that in the function
		ipoib_refresh_mcast the lines

		   if ( ( p_mac_array[i].addr[0] == 1 &&
p_mac_array[i].addr[1] == 0 && p_mac_array[i].addr[2] == 0x5e &&
		       p_mac_array[i].addr[3] == 0 &&
p_mac_array[i].addr[4] == 0 && p_mac_array[i].addr[5] == 1 ) ||
		      !( p_mac_array[i].addr[0] == 1 &&
p_mac_array[i].addr[1] == 0 && p_mac_array[i].addr[2] == 0x5e )

		 

		that were added actually mean that for normal multicast
		addresses (starting with 01-00-5e) no multicast group will be created.
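
The effect of that condition can be checked in isolation. The sketch below restates it as a standalone predicate over a plain 6-byte array; the function names are hypothetical and the driver's own mac_addr_t type is not used.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the MAC starts with the IPv4 multicast prefix 01-00-5e. */
bool has_ipv4_mcast_prefix(const uint8_t a[6])
{
    return a[0] == 0x01 && a[1] == 0x00 && a[2] == 0x5e;
}

/*
 * Restatement of the quoted condition: the join happens only for
 * 01-00-5e-00-00-01, or for addresses that do NOT start with 01-00-5e.
 */
bool refresh_would_join(const uint8_t a[6])
{
    bool is_all_hosts = has_ipv4_mcast_prefix(a) &&
                        a[3] == 0x00 && a[4] == 0x00 && a[5] == 0x01;
    return is_all_hosts || !has_ipv4_mcast_prefix(a);
}
```

For 01-00-5e-00-00-02, the address that 239.0.0.2 maps to under the standard Ethernet mapping, the predicate is false, so ipoib_port_join_mcast is never reached from this path.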

		 

		The attached patch fixes my specific test, but might
		cause problems in other scenarios. It is not a fix, but rather an
		attempt to show the problem more clearly.

		 

		A few more interesting points:

		 

		1) IP multicast addresses are wider than MAC addresses.
		We need to decide what encoding we want to use. See
		http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/intwork/inaf_mul_wrfn.mspx?mfr=true
		for example.

		Please note that some IP multicast addresses may have to
		share the same MAC address.
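
For reference, the standard Ethernet encoding (RFC 1112) keeps only the low 23 bits of the IPv4 group address behind the fixed prefix 01-00-5e, which is why several IP groups share one MAC. A minimal sketch follows; the function name is hypothetical and the address is passed in host byte order.

```c
#include <stdint.h>

/*
 * Map an IPv4 multicast address (host byte order, e.g. 0xEF000002 for
 * 239.0.0.2) to its Ethernet multicast MAC: 01-00-5e followed by the
 * low 23 bits of the IP address.
 */
void ipv4_mcast_to_mac(uint32_t ip, uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (uint8_t)((ip >> 16) & 0x7f);  /* top bit is dropped */
    mac[4] = (uint8_t)((ip >> 8) & 0xff);
    mac[5] = (uint8_t)(ip & 0xff);
}
```

Under this encoding 239.0.0.2 and 224.128.0.2 both map to 01-00-5e-00-00-02, since the 5 bits that distinguish them are discarded. Whichever encoding is chosen for the MGID has to account for those bits on both sides.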

		 

		2) On the same machine, IGMP v2 is used when running on
		Broadcom cards, while IGMP v3 is used on IPoIB cards.

		 

		To run the test:

		receiver: mcastrcv.exe 11.4.12.85 19007 239.0.0.2 25 406 99

		sender: mcastsnd.exe 11.4.12.86 19007 239.0.0.2 25 406 100

		Please replace the IPs 11.4.12.85/6 with the local IPoIB
		addresses.

		 

		 

		Index: ipoib_adapter.c
	
===================================================================
		--- ipoib_adapter.c (revision 3408)
		+++ ipoib_adapter.c (working copy)
		@@ -817,6 +817,18 @@
		  uint8_t    i, j;
		  ipoib_port_t  *p_port = NULL;
		 
		+    for (i=0; i< num_macs; i++) {
		+        DbgPrint("entry %d, mac = %d-%d-%d-%d-%d-%d\n",
i, 
		+            p_mac_array[i].addr[0],
		+            p_mac_array[i].addr[1],
		+            p_mac_array[i].addr[2],
		+            p_mac_array[i].addr[3],
		+            p_mac_array[i].addr[4],
		+            p_mac_array[i].addr[5]
		+        );
		+    }
		+
		+
		  IPOIB_ENTER( IPOIB_DBG_MCAST );
		  cl_obj_lock( &p_adapter->obj );
		  if( p_adapter->state == IB_PNP_PORT_ACTIVE )
		@@ -859,11 +871,15 @@
		 
		    if( j != p_adapter->mcast_array_size )
		     continue;
		+/*
		    if ( ( p_mac_array[i].addr[0] == 1 &&
p_mac_array[i].addr[1] == 0 && p_mac_array[i].addr[2] == 0x5e &&
		        p_mac_array[i].addr[3] == 0 &&
p_mac_array[i].addr[4] == 0 && p_mac_array[i].addr[5] == 1 ) ||
		       !( p_mac_array[i].addr[0] == 1 &&
p_mac_array[i].addr[1] == 0 && p_mac_array[i].addr[2] == 0x5e )
		-    )
		+    )*/
		+    
		    {
		+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
		+   
		     ipoib_port_join_mcast( p_port, p_mac_array[i],
IB_MC_REC_STATE_FULL_MEMBER );
		    }
		   }
		@@ -877,6 +893,8 @@
		  if( p_port )
		   ipoib_port_deref( p_port, ref_refresh_mcast );
		 
		+DbgPrint("ipoib_refresh_mcast exiting\n");
		+
		  IPOIB_EXIT( IPOIB_DBG_MCAST );
		 }
		 
		@@ -1109,6 +1127,7 @@
		   /* Join all programmed multicast groups. */
		   for( i = 0; i < p_adapter->mcast_array_size; i++ )
		   {
		+   IPOIB_PRINT(
TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
		    ipoib_port_join_mcast(
		     p_adapter->p_port, p_adapter->mcast_array[i]
,IB_MC_REC_STATE_FULL_MEMBER);
		   }
		Index: ipoib_driver.c
	
===================================================================
		--- ipoib_driver.c (revision 3408)
		+++ ipoib_driver.c (working copy)
		@@ -1731,24 +1731,25 @@
		 
		  /* Required Ethernet operational characteristics */
		  case OID_802_3_MULTICAST_LIST:
		+        DbgPrint("OID_802_3_MULTICAST_LIST called\n");
		   IPOIB_PRINT(TRACE_LEVEL_INFORMATION, IPOIB_DBG_OID,
		    ("Port %d received set for
OID_802_3_MULTICAST_LIST\n", port_num) );
		   if( info_buf_len > MAX_MCAST * sizeof(mac_addr_t) )
		   {
		-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
		+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
		     ("Port %d OID_802_3_MULTICAST_LIST - Multicast list
full.\n", port_num) );
		    status = NDIS_STATUS_MULTICAST_FULL;
		    *p_bytes_needed = MAX_MCAST * sizeof(mac_addr_t);
		   }
		   else if( info_buf_len % sizeof(mac_addr_t) )
		   {
		-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
		+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
		     ("Port %d OID_802_3_MULTICAST_LIST - Invalid input
buffer.\n", port_num) );
		    status = NDIS_STATUS_INVALID_DATA;
		   }
		   else if( !info_buf && info_buf_len )
		   {
		-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
		+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
		     ("Port %d OID_802_3_MULTICAST_LIST - Invalid input
buffer.\n", port_num) );
		    status = NDIS_STATUS_INVALID_DATA;
		   }
		Index: ipoib_port.c
	
===================================================================
		--- ipoib_port.c (revision 3411)
		+++ ipoib_port.c (working copy)
		@@ -3243,7 +3243,7 @@
		 
		  IPOIB_ENTER( IPOIB_DBG_SEND );
		 
		- IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
		+ IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
		     ("buf_len = %d,iph_options_size =
%d\n",(int)buf_len,(int)iph_options_size ) );
		 
		  if( !buf_len )
		@@ -3265,6 +3265,7 @@
		      ("Failed to query IGMPv2 header buffer.\n") );
		     return NDIS_STATUS_FAILURE;
		    }
		+   CL_ASSERT(iph_options_size >= buf_len);
		    iph_options_size-=buf_len;
		   }
		         
		@@ -3312,8 +3313,10 @@
		    Change type of mcast endpt to SEND_RECV endpt. So
mcast garbage collector 
		    will not delete this mcast endpt.
		   */
		-  IPOIB_PRINT( TRACE_LEVEL_INFORMATION,
IPOIB_DBG_MCAST,
		-   ("Catched IGMP_V2_MEMBERSHIP_REPORT message\n") );
		+  IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
		+   ("Catched IGMP_V2_MEMBERSHIP_REPORT message
fake_addr = %d-%d-%d-%d-%d-%d\n",
		+   fake_mcast_mac.addr[0], fake_mcast_mac.addr[1],
fake_mcast_mac.addr[2],
		+   fake_mcast_mac.addr[3], fake_mcast_mac.addr[4],
fake_mcast_mac.addr[5]) );
		   endpt_status = __endpt_mgr_ref( p_port,
fake_mcast_mac, &p_endpt );
		   if ( p_endpt )
		   {
		@@ -3347,7 +3350,7 @@
		   break;
		 
		  default:
		-  IPOIB_PRINT( TRACE_LEVEL_INFORMATION,
IPOIB_DBG_MCAST,
		+  IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
		         ("Send Unknown IGMP message: 0x%x \n",
p_igmp_v2_hdr->type ) );
		   break;
		  }
		@@ -3815,6 +3818,7 @@
		  if( status == NDIS_STATUS_NO_ROUTE_TO_DESTINATION &&
		   ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
		  {
		+  IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
		   if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst, 
		    IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
		   {
		@@ -4248,6 +4252,7 @@
		 
		    if( ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
		    {
		+    IPOIB_PRINT(
TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
		     if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst,
		      IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
		     {
		@@ -5894,6 +5899,12 @@
		 
		  IPOIB_ENTER( IPOIB_DBG_MCAST );
		 
		+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
		+  ("ipoib_port_join_mcast called MAC %d-%d-%d-%d-%d-%d
\n", 
		+  mac.addr[0], mac.addr[1], mac.addr[2], 
		+  mac.addr[3], mac.addr[4], mac.addr[5] ) );
		+
		+
		  switch( __endpt_mgr_ref( p_port, mac, &p_endpt ) )
		  {
		  case NDIS_STATUS_NO_ROUTE_TO_DESTINATION:
		@@ -5929,7 +5940,8 @@
		    * 24 lower bits of that network-byte-ordered value
(assuming MSb
		    * is zero) and 4 lsb bits of the first byte of IP
address.
		    */
		-  mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
		+CL_ASSERT(mac.addr[1] == 0 || mac.addr[1] == 128);
		+  mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
		   mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
		   mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
		   mcast_req.member_rec.mgid.raw[15] = mac.addr[5];
