[ofw] RE: Problem on multicast flow
Anatoly Greenblatt
anatolyg at voltaire.com
Tue Nov 4 02:00:35 PST 2008
Hi,
On Windows Server 2008, run "servermanagercmd.exe -install NPAS-RRAS-Services".
This installs the Routing and Remote Access packages.
We have clients that are already using multicast as it is now in
production environments. Changing this not only breaks functionality but
also requires additional development in the SM or the switch (wherever
multicast groups are managed).
Regards,
Anatoly.
________________________________
From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Slava Strebkov
Sent: Tuesday, November 04, 2008 11:39
To: Tzachi Dar; ofw at lists.openfabrics.org
Subject: [ofw] RE: Problem on multicast flow
Hi,
1)
netsh>routing ip igmp
install
add interface "interface name of IPoIB adapter" igmpprototype=igmprtrv2
exit
On Server 2008 you may need to enable "Routing and Remote Access" from
Server Manager => Network Policy and Access => Add Roles before using
netsh.
2) We saw interoperability with Linux hosts as well.
Slava
________________________________
From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
Sent: Tuesday, November 04, 2008 11:32 AM
To: Slava Strebkov; ofw at lists.openfabrics.org
Subject: RE: Problem on multicast flow
1) How do you force your servers to use IGMP v2?
2) Will your method interop with Linux as well? I'm looking in the spec
to find a match between IPv4 multicast addresses and IB multicast GIDs,
but with no success.
In any case, even if there is no enforcement in the spec, I'm afraid
that if they are using different MGIDs then things won't work.
Thanks
Tzachi
________________________________
From: Slava Strebkov [mailto:slavas at voltaire.com]
Sent: Tuesday, November 04, 2008 11:12 AM
To: Tzachi Dar; ofw at lists.openfabrics.org
Subject: RE: Problem on multicast flow
Hi,
The attached test was compiled and run on Server 2003 x64 and
Server 2008 x86.
I used IPoIB without your patch, as is.
At the end I got
Pass percentage: 100.000000 on both sides, which means no problem
(am I right?).
We changed the code to
mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
instead of
mcast_req.member_rec.mgid.raw[12] = 0;
to avoid different IP addresses being mapped onto the same MAC.
Using IGMPv3 is not recommended, since the IB join is always made on
224.0.0.22 rather than on the actual mcast group (e.g. 239.0.0.2).
In our tests we force the servers to use IGMP v2.
Please check the same test with IGMP v2 on both sides.
Slava
________________________________
From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
Sent: Monday, November 03, 2008 7:20 PM
To: ofw at lists.openfabrics.org; Slava Strebkov
Subject: FW: Problem on multicast flow
Resending without the executables attached (due to antivirus
enforcement).
Anyone who wants the executables, please call me directly.
Thanks
Tzachi
________________________________
From: Tzachi Dar
Sent: Monday, November 03, 2008 7:16 PM
To: ofw at lists.openfabrics.org; 'slavas at voltaire.com'
Subject: Problem on multicast flow
Hi Voltaire and anyone who can help!
Over the last day we have been working on a problem with a simple
multicast test that doesn't work.
The test is attached at the end of the mail.
This test used to work in the past, but it no longer does.
Looking at the current state of things, it seems that the changes
made in checkin 1450
are the root of the problem.
It seems that the mechanism that maps MAC addresses and IP
addresses into IB multicast was broken.
This happened when ipoib_port_join_mcast changed from:
mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
mcast_req.member_rec.mgid.raw[15] = mac.addr[5];
to
mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
mcast_req.member_rec.mgid.raw[15] = mac.addr[5];
It seems that now mac.addr[1] is not always 0 as it used to be.
Instead, this data is being taken from the IP address.
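To make the difference between the two versions concrete, here is a minimal standalone sketch (not driver code). The mac_addr_t type below is only a stand-in for the driver's type, and fill_mgid_tail and its use_addr1 flag are hypothetical names made up for this illustration. It assumes that only MGID bytes 12..15 differ between the two versions, as in the lines quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the driver's MAC type (assumption for this sketch). */
typedef struct { uint8_t addr[6]; } mac_addr_t;

/*
 * Fill MGID bytes 12..15 from a multicast MAC.
 * use_addr1 == 0: old behaviour, byte 12 is forced to 0.
 * use_addr1 != 0: new behaviour, byte 12 is copied from mac.addr[1].
 */
static void fill_mgid_tail(uint8_t tail[4], mac_addr_t mac, int use_addr1)
{
    assert(tail != NULL);
    tail[0] = use_addr1 ? mac.addr[1] : 0;
    tail[1] = mac.addr[3];
    tail[2] = mac.addr[4];
    tail[3] = mac.addr[5];
}
```

With a MAC whose addr[1] is 0 (e.g. 01-00-5e-00-00-02) both variants give the same four bytes. With a hypothetical MAC such as 01-0f-5e-00-00-02 they differ in byte 12, so two hosts running different variants would end up with different MGIDs for the same MAC.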
More than that, it seems that in the function
ipoib_refresh_mcast the lines
if ( ( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1]
== 0 && p_mac_array[i].addr[2] == 0x5e &&
p_mac_array[i].addr[3] == 0 && p_mac_array[i].addr[4] ==
0 && p_mac_array[i].addr[5] == 1 ) ||
!( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1]
== 0 && p_mac_array[i].addr[2] == 0x5e )
that were added actually mean that for normal multicast
addresses (starting with 01-00-5e, other than 01-00-5e-00-00-01) no
multicast group will be created.
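The effect of that condition can be checked outside the driver with a small sketch. The mac_addr_t type here is a stand-in for the driver's type, and the helper names (has_ipv4_mcast_prefix, is_all_hosts, filter_allows_join) are hypothetical, chosen only to split the quoted expression into readable parts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the driver's MAC type (assumption for this sketch). */
typedef struct { uint8_t addr[6]; } mac_addr_t;

/* True for MACs that start with 01-00-5e. */
static bool has_ipv4_mcast_prefix(const mac_addr_t *m)
{
    assert(m != NULL);
    return m->addr[0] == 1 && m->addr[1] == 0 && m->addr[2] == 0x5e;
}

/* True only for 01-00-5e-00-00-01. */
static bool is_all_hosts(const mac_addr_t *m)
{
    return has_ipv4_mcast_prefix(m) &&
           m->addr[3] == 0 && m->addr[4] == 0 && m->addr[5] == 1;
}

/* Same boolean structure as the condition quoted above. */
static bool filter_allows_join(const mac_addr_t *m)
{
    return is_all_hosts(m) || !has_ipv4_mcast_prefix(m);
}
```

For 01-00-5e-00-00-02 (the MAC a group like 239.0.0.2 would normally get) filter_allows_join returns false, so the join is skipped. It returns true only for 01-00-5e-00-00-01 and for MACs that do not start with 01-00-5e.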
The attached patch fixes my specific test, but might cause
problems in other scenarios. It is not a fix, but rather an attempt to
show the problem more clearly.
A few more interesting points:
1) IP multicast addresses are wider than MAC addresses. We need
to decide what encoding we want to use. See
http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/intwork/inaf_mul_wrfn.mspx?mfr=true
for example.
Please note that some IP multicast addresses may have to share the
same MAC address.
2) On the same machine, IGMP v2 is used when running on Broadcom
cards, while IGMP v3 is used on IPoIB cards.
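As an illustration of point 1, here is a minimal sketch of the standard Ethernet mapping (the low 23 bits of the IPv4 group address are placed under the 01-00-5e prefix). This is the generic Ethernet rule, not necessarily what the IPoIB driver does. The mac_addr_t type is a stand-in, and ipv4_mcast_to_mac and mac_equal are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the driver's MAC type (assumption for this sketch). */
typedef struct { uint8_t addr[6]; } mac_addr_t;

/*
 * Map IPv4 multicast address a.b.c.d to an Ethernet multicast MAC.
 * Only the low 23 bits of the address survive: the first octet and the
 * top bit of the second octet are dropped.
 */
static mac_addr_t ipv4_mcast_to_mac(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    mac_addr_t m = {{0x01, 0x00, 0x5e, (uint8_t)(b & 0x7f), c, d}};
    (void)a; /* the first octet does not appear in the MAC at all */
    return m;
}

/* Byte-wise comparison of two MACs. */
static bool mac_equal(const mac_addr_t *x, const mac_addr_t *y)
{
    assert(x != NULL && y != NULL);
    return memcmp(x->addr, y->addr, sizeof x->addr) == 0;
}
```

Under this mapping 239.0.0.2 and 224.128.0.2 both become 01-00-5e-00-00-02, which is the sharing mentioned in point 1. Any encoding that wants to tell them apart has to carry the extra bits somewhere else.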
To run the test:
receiver: mcastrcv.exe 11.4.12.85 19007 239.0.0.2 25 406 99
sender: mcastsnd.exe 11.4.12.86 19007 239.0.0.2 25 406 100
Please replace the IPs 11.4.12.85/6 with the local IPoIB
addresses.
Index: ipoib_adapter.c
===================================================================
--- ipoib_adapter.c (revision 3408)
+++ ipoib_adapter.c (working copy)
@@ -817,6 +817,18 @@
uint8_t i, j;
ipoib_port_t *p_port = NULL;
+ for (i=0; i< num_macs; i++) {
+ DbgPrint("entry %d, mac = %d-%d-%d-%d-%d-%d\n", i,
+ p_mac_array[i].addr[0],
+ p_mac_array[i].addr[1],
+ p_mac_array[i].addr[2],
+ p_mac_array[i].addr[3],
+ p_mac_array[i].addr[4],
+ p_mac_array[i].addr[5]
+ );
+ }
+
+
IPOIB_ENTER( IPOIB_DBG_MCAST );
cl_obj_lock( &p_adapter->obj );
if( p_adapter->state == IB_PNP_PORT_ACTIVE )
@@ -859,11 +871,15 @@
if( j != p_adapter->mcast_array_size )
continue;
+/*
if ( ( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1]
== 0 && p_mac_array[i].addr[2] == 0x5e &&
p_mac_array[i].addr[3] == 0 && p_mac_array[i].addr[4] ==
0 && p_mac_array[i].addr[5] == 1 ) ||
!( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1]
== 0 && p_mac_array[i].addr[2] == 0x5e )
- )
+ )*/
+
{
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
+
ipoib_port_join_mcast( p_port, p_mac_array[i],
IB_MC_REC_STATE_FULL_MEMBER );
}
}
@@ -877,6 +893,8 @@
if( p_port )
ipoib_port_deref( p_port, ref_refresh_mcast );
+DbgPrint("ipoib_refresh_mcast exiting\n");
+
IPOIB_EXIT( IPOIB_DBG_MCAST );
}
@@ -1109,6 +1127,7 @@
/* Join all programmed multicast groups. */
for( i = 0; i < p_adapter->mcast_array_size; i++ )
{
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
ipoib_port_join_mcast(
p_adapter->p_port, p_adapter->mcast_array[i]
,IB_MC_REC_STATE_FULL_MEMBER);
}
Index: ipoib_driver.c
===================================================================
--- ipoib_driver.c (revision 3408)
+++ ipoib_driver.c (working copy)
@@ -1731,24 +1731,25 @@
/* Required Ethernet operational characteristics */
case OID_802_3_MULTICAST_LIST:
+ DbgPrint("OID_802_3_MULTICAST_LIST called\n");
IPOIB_PRINT(TRACE_LEVEL_INFORMATION, IPOIB_DBG_OID,
("Port %d received set for OID_802_3_MULTICAST_LIST\n",
port_num) );
if( info_buf_len > MAX_MCAST * sizeof(mac_addr_t) )
{
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
("Port %d OID_802_3_MULTICAST_LIST - Multicast list
full.\n", port_num) );
status = NDIS_STATUS_MULTICAST_FULL;
*p_bytes_needed = MAX_MCAST * sizeof(mac_addr_t);
}
else if( info_buf_len % sizeof(mac_addr_t) )
{
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
("Port %d OID_802_3_MULTICAST_LIST - Invalid input
buffer.\n", port_num) );
status = NDIS_STATUS_INVALID_DATA;
}
else if( !info_buf && info_buf_len )
{
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
("Port %d OID_802_3_MULTICAST_LIST - Invalid input
buffer.\n", port_num) );
status = NDIS_STATUS_INVALID_DATA;
}
Index: ipoib_port.c
===================================================================
--- ipoib_port.c (revision 3411)
+++ ipoib_port.c (working copy)
@@ -3243,7 +3243,7 @@
IPOIB_ENTER( IPOIB_DBG_SEND );
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
("buf_len = %d,iph_options_size =
%d\n",(int)buf_len,(int)iph_options_size ) );
if( !buf_len )
@@ -3265,6 +3265,7 @@
("Failed to query IGMPv2 header buffer.\n") );
return NDIS_STATUS_FAILURE;
}
+ CL_ASSERT(iph_options_size >= buf_len);
iph_options_size-=buf_len;
}
@@ -3312,8 +3313,10 @@
Change type of mcast endpt to SEND_RECV endpt. So mcast
garbage collector
will not delete this mcast endpt.
*/
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
- ("Catched IGMP_V2_MEMBERSHIP_REPORT message\n") );
+ IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
+ ("Catched IGMP_V2_MEMBERSHIP_REPORT message fake_addr =
%d-%d-%d-%d-%d-%d\n",
+ fake_mcast_mac.addr[0], fake_mcast_mac.addr[1],
fake_mcast_mac.addr[2],
+ fake_mcast_mac.addr[3], fake_mcast_mac.addr[4],
fake_mcast_mac.addr[5]) );
endpt_status = __endpt_mgr_ref( p_port, fake_mcast_mac,
&p_endpt );
if ( p_endpt )
{
@@ -3347,7 +3350,7 @@
break;
default:
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
("Send Unknown IGMP message: 0x%x \n",
p_igmp_v2_hdr->type ) );
break;
}
@@ -3815,6 +3818,7 @@
if( status == NDIS_STATUS_NO_ROUTE_TO_DESTINATION &&
ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
{
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst,
IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
{
@@ -4248,6 +4252,7 @@
if( ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
{
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst,
IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
{
@@ -5894,6 +5899,12 @@
IPOIB_ENTER( IPOIB_DBG_MCAST );
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
+ ("ipoib_port_join_mcast called MAC %d-%d-%d-%d-%d-%d \n",
+ mac.addr[0], mac.addr[1], mac.addr[2],
+ mac.addr[3], mac.addr[4], mac.addr[5] ) );
+
+
switch( __endpt_mgr_ref( p_port, mac, &p_endpt ) )
{
case NDIS_STATUS_NO_ROUTE_TO_DESTINATION:
@@ -5929,7 +5940,8 @@
* 24 lower bits of that network-byte-ordered value (assuming
MSb
* is zero) and 4 lsb bits of the first byte of IP address.
*/
- mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
+CL_ASSERT(mac.addr[1] == 0 || mac.addr[1] == 128);
+ mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
mcast_req.member_rec.mgid.raw[15] = mac.addr[5];