[ofw] FW: Problem on multicast flow

Tzachi Dar tzachid at mellanox.co.il
Mon Nov 3 09:20:14 PST 2008


Resending without the executables attached (due to antivirus
enforcement)
 
Anyone who wants the executables, please contact me directly.
 
Thanks
Tzachi


________________________________

From: Tzachi Dar 
Sent: Monday, November 03, 2008 7:16 PM
To: ofw at lists.openfabrics.org; 'slavas at voltaire.com'
Subject: Problem on multicast flow


Hi Voltaire and anyone who can help,
 
For the past day we have been working on a problem with a simple
multicast test that doesn't work.
The test is attached at the end of this mail.
 
This test used to work in the past, but it no longer does.
 
Looking at the current state of things, it seems that the changes made
in checkin 1450 are the root of the problem.
 
It seems that the mechanism that maps MAC addresses and IP addresses
into IB multicast groups was broken.
This happened when ipoib_port_join_mcast changed from:
 
  mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
  mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
  mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
  mcast_req.member_rec.mgid.raw[15] = mac.addr[5];

to 
  mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
  mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
  mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
  mcast_req.member_rec.mgid.raw[15] = mac.addr[5];

It seems that mac.addr[1] is now not always 0, as it used to be.
Instead, this byte is being taken from the IP address.
 
More than that, it seems that in the function ipoib_refresh_mcast the
lines 
   if ( ( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1] == 0 &&
p_mac_array[i].addr[2] == 0x5e &&
       p_mac_array[i].addr[3] == 0 && p_mac_array[i].addr[4] == 0 &&
p_mac_array[i].addr[5] == 1 ) ||
      !( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1] == 0 &&
p_mac_array[i].addr[2] == 0x5e )
 
that were added actually mean that for normal multicast addresses
(starting with 01-00-5e, other than 01-00-5e-00-00-01) no multicast
group will be created.
 
The attached patch fixes my specific test, but might cause problems in
other scenarios. It is not a fix, but rather an attempt to show the
problem more clearly.
 
A few more interesting points:
 
1) IP multicast addresses are wider than MAC addresses. We need to
decide what encoding we want to use. See
http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/intwork/inaf_mul_wrfn.mspx?mfr=true
for example.
Please note that some IP multicast addresses may have to share the
same MAC address.
 
2) On the same machine, IGMP v2 is used when running on Broadcom cards,
whereas IGMP v3 is used on IPoIB cards.
 
To run the test:
receiver: mcastrcv.exe 11.4.12.85 19007 239.0.0.2 25 406 99
 
sender: mcastsnd.exe 11.4.12.86 19007 239.0.0.2 25 406 100
Please replace the IPs 11.4.12.85 and 11.4.12.86 with the local IPoIB
addresses.
 
 
Index: ipoib_adapter.c
===================================================================
--- ipoib_adapter.c (revision 3408)
+++ ipoib_adapter.c (working copy)
@@ -817,6 +817,18 @@
  uint8_t    i, j;
  ipoib_port_t  *p_port = NULL;
 
+    for (i=0; i< num_macs; i++) {
+        DbgPrint("entry %d, mac = %d-%d-%d-%d-%d-%d\n", i, 
+            p_mac_array[i].addr[0],
+            p_mac_array[i].addr[1],
+            p_mac_array[i].addr[2],
+            p_mac_array[i].addr[3],
+            p_mac_array[i].addr[4],
+            p_mac_array[i].addr[5]
+        );
+    }
+
+
  IPOIB_ENTER( IPOIB_DBG_MCAST );
  cl_obj_lock( &p_adapter->obj );
  if( p_adapter->state == IB_PNP_PORT_ACTIVE )
@@ -859,11 +871,15 @@
 
    if( j != p_adapter->mcast_array_size )
     continue;
+/*
    if ( ( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1] == 0 &&
p_mac_array[i].addr[2] == 0x5e &&
        p_mac_array[i].addr[3] == 0 && p_mac_array[i].addr[4] == 0 &&
p_mac_array[i].addr[5] == 1 ) ||
       !( p_mac_array[i].addr[0] == 1 && p_mac_array[i].addr[1] == 0 &&
p_mac_array[i].addr[2] == 0x5e )
-    )
+    )*/
+    
    {
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
+   
     ipoib_port_join_mcast( p_port, p_mac_array[i],
IB_MC_REC_STATE_FULL_MEMBER );
    }
   }
@@ -877,6 +893,8 @@
  if( p_port )
   ipoib_port_deref( p_port, ref_refresh_mcast );
 
+DbgPrint("ipoib_refresh_mcast exiting\n");
+
  IPOIB_EXIT( IPOIB_DBG_MCAST );
 }
 
@@ -1109,6 +1127,7 @@
   /* Join all programmed multicast groups. */
   for( i = 0; i < p_adapter->mcast_array_size; i++ )
   {
+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
    ipoib_port_join_mcast(
     p_adapter->p_port, p_adapter->mcast_array[i]
,IB_MC_REC_STATE_FULL_MEMBER);
   }
Index: ipoib_driver.c
===================================================================
--- ipoib_driver.c (revision 3408)
+++ ipoib_driver.c (working copy)
@@ -1731,24 +1731,25 @@
 
  /* Required Ethernet operational characteristics */
  case OID_802_3_MULTICAST_LIST:
+        DbgPrint("OID_802_3_MULTICAST_LIST called\n");
   IPOIB_PRINT(TRACE_LEVEL_INFORMATION, IPOIB_DBG_OID,
    ("Port %d received set for OID_802_3_MULTICAST_LIST\n", port_num) );
   if( info_buf_len > MAX_MCAST * sizeof(mac_addr_t) )
   {
-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
     ("Port %d OID_802_3_MULTICAST_LIST - Multicast list full.\n",
port_num) );
    status = NDIS_STATUS_MULTICAST_FULL;
    *p_bytes_needed = MAX_MCAST * sizeof(mac_addr_t);
   }
   else if( info_buf_len % sizeof(mac_addr_t) )
   {
-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
     ("Port %d OID_802_3_MULTICAST_LIST - Invalid input buffer.\n",
port_num) );
    status = NDIS_STATUS_INVALID_DATA;
   }
   else if( !info_buf && info_buf_len )
   {
-   IPOIB_PRINT( TRACE_LEVEL_INFORMATION,IPOIB_DBG_OID,
+   IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
     ("Port %d OID_802_3_MULTICAST_LIST - Invalid input buffer.\n",
port_num) );
    status = NDIS_STATUS_INVALID_DATA;
   }
Index: ipoib_port.c
===================================================================
--- ipoib_port.c (revision 3411)
+++ ipoib_port.c (working copy)
@@ -3243,7 +3243,7 @@
 
  IPOIB_ENTER( IPOIB_DBG_SEND );
 
- IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
+ IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
     ("buf_len = %d,iph_options_size =
%d\n",(int)buf_len,(int)iph_options_size ) );
 
  if( !buf_len )
@@ -3265,6 +3265,7 @@
      ("Failed to query IGMPv2 header buffer.\n") );
     return NDIS_STATUS_FAILURE;
    }
+   CL_ASSERT(iph_options_size >= buf_len);
    iph_options_size-=buf_len;
   }
         
@@ -3312,8 +3313,10 @@
    Change type of mcast endpt to SEND_RECV endpt. So mcast garbage
collector 
    will not delete this mcast endpt.
   */
-  IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
-   ("Catched IGMP_V2_MEMBERSHIP_REPORT message\n") );
+  IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
+   ("Catched IGMP_V2_MEMBERSHIP_REPORT message fake_addr =
%d-%d-%d-%d-%d-%d\n",
+   fake_mcast_mac.addr[0], fake_mcast_mac.addr[1],
fake_mcast_mac.addr[2],
+   fake_mcast_mac.addr[3], fake_mcast_mac.addr[4],
fake_mcast_mac.addr[5]) );
   endpt_status = __endpt_mgr_ref( p_port, fake_mcast_mac, &p_endpt );
   if ( p_endpt )
   {
@@ -3347,7 +3350,7 @@
   break;
 
  default:
-  IPOIB_PRINT( TRACE_LEVEL_INFORMATION, IPOIB_DBG_MCAST,
+  IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_MCAST,
         ("Send Unknown IGMP message: 0x%x \n", p_igmp_v2_hdr->type ) );
   break;
  }
@@ -3815,6 +3818,7 @@
  if( status == NDIS_STATUS_NO_ROUTE_TO_DESTINATION &&
   ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
  {
+  IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
   if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst, 
    IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
   {
@@ -4248,6 +4252,7 @@
 
    if( ETH_IS_MULTICAST( p_eth_hdr->dst.addr ) )
    {
+    IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,("\n"));
     if( ipoib_port_join_mcast( p_port, p_eth_hdr->dst,
      IB_MC_REC_STATE_FULL_MEMBER) == IB_SUCCESS )
     {
@@ -5894,6 +5899,12 @@
 
  IPOIB_ENTER( IPOIB_DBG_MCAST );
 
+ IPOIB_PRINT( TRACE_LEVEL_ERROR,IPOIB_DBG_OID,
+  ("ipoib_port_join_mcast called MAC %d-%d-%d-%d-%d-%d \n", 
+  mac.addr[0], mac.addr[1], mac.addr[2], 
+  mac.addr[3], mac.addr[4], mac.addr[5] ) );
+
+
  switch( __endpt_mgr_ref( p_port, mac, &p_endpt ) )
  {
  case NDIS_STATUS_NO_ROUTE_TO_DESTINATION:
@@ -5929,7 +5940,8 @@
    * 24 lower bits of that network-byte-ordered value (assuming MSb
    * is zero) and 4 lsb bits of the first byte of IP address.
    */
-  mcast_req.member_rec.mgid.raw[12] = mac.addr[1];
+CL_ASSERT(mac.addr[1] == 0 || mac.addr[1] == 128);
+  mcast_req.member_rec.mgid.raw[12] = 0;//mac.addr[1];
   mcast_req.member_rec.mgid.raw[13] = mac.addr[3];
   mcast_req.member_rec.mgid.raw[14] = mac.addr[4];
   mcast_req.member_rec.mgid.raw[15] = mac.addr[5];

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcastsnd.c
Type: application/octet-stream
Size: 6918 bytes
Desc: mcastsnd.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20081103/112d4a06/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcastrcv.c
Type: application/octet-stream
Size: 6766 bytes
Desc: mcastrcv.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20081103/112d4a06/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipoib_multicast-fix.patch
Type: application/octet-stream
Size: 6178 bytes
Desc: ipoib_multicast-fix.patch
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20081103/112d4a06/attachment-0002.obj>

