[Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]

Nitin Hande Nitin.Hande at Sun.COM
Tue Mar 15 13:15:57 PST 2005


Hal,

On Fri, 2005-03-04 at 12:53, Hal Rosenstock wrote:
> Hi again Nitin,
> 
> Finally got a chance to work on this. I have a workaround for you for
> now. Real patch later... Let me know if this does the trick for you. It
> did for me.
> 
> -- Hal
> 
> Index: osm_sa_mcmember_record.c
> ===================================================================
> --- osm_sa_mcmember_record.c	(revision 1953)
> +++ osm_sa_mcmember_record.c	(working copy)
> @@ -1522,9 +1522,11 @@
>    if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
>        (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
>  
> +#if 0
>    /* if defined MUST match exactly !*/
>    if ((IB_MCR_COMPMASK_MTU_SEL & comp_mask) &&
>        ((p_rcvd_rec->mtu >> 6) != (p_mgrp->mcmember_rec.mtu >> 6))) goto Exit;
> +#endif
>  
>    if ((IB_MCR_COMPMASK_MTU & comp_mask) &&
>        ((p_rcvd_rec->mtu & 0x3F) != (p_mgrp->mcmember_rec.mtu & 0x3F))) goto Exit;
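For anyone following along: the two checks in this hunk look at different
parts of the same one-byte MTU field. The top two bits are the selector and
the low six bits are the MTU code. Below is a minimal sketch of that split.
It is not OpenSM code, the helper names are made up for illustration, and the
code-to-bytes mapping (1 = 256 up to 5 = 4096) is the IBA encoding assumed
here.

```c
#include <stdint.h>

/* Hypothetical helpers (not OpenSM functions): split the one-byte MTU field. */

/* Top two bits: the selector. */
static unsigned mtu_selector(uint8_t mtu_field)
{
    return mtu_field >> 6;
}

/* Low six bits: the MTU code. */
static unsigned mtu_code(uint8_t mtu_field)
{
    return mtu_field & 0x3F;
}

/* Assumed IBA encoding: code 1..5 maps to 256..4096 bytes; 0 if out of range. */
static unsigned mtu_code_to_bytes(unsigned code)
{
    return (code >= 1 && code <= 5) ? (128u << code) : 0;
}
```

With the 0x84 byte from the GetTable request in the forwarded trace,
mtu_selector() gives 2, mtu_code() gives 4, and mtu_code_to_bytes(4) gives
2048. The workaround disables only the selector comparison and keeps the
comparison of the MTU code.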
This is cool. I now have Solaris IPoIB working happily with OpenSM.
It plumbs, pings, and snoops on the 0xffff pkey. Here is some output:

[root@dongon ~]# cat /etc/path_to_inst | grep ibd
"/pci@8,600000/pci@1/pci15b3,5a44@0/ibport@1,ffff,ipib" 0 "ibd"
"/pci@8,600000/pci@1/pci15b3,5a44@0/ibport@2,ffff,ipib" 1 "ibd"
[root@dongon ~]# ifconfig ibd0
ibd0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 2044 index 3
        inet 192.168.100.111 netmask ffffff00 broadcast 192.168.100.255
        ipib 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1 
[root@dongon ~]# ping 192.168.100.112
192.168.100.112 is alive
[root@dongon ~]# snoop -d ibd1
192.168.100.112 -> *            ARP C Who is 192.168.100.111, 192.168.100.111 ?
192.168.100.111 -> 192.168.100.112 ARP R 192.168.100.111, 192.168.100.111 is 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1
192.168.100.111 -> 192.168.100.112 ICMP Echo request (ID: 641 Sequence number: 0)
192.168.100.112 -> 192.168.100.111 ICMP Echo reply (ID: 641 Sequence number: 0)

This is fantastic. Thanks, Hal!

BTW, I have not tested it with a multi-packet (RMPP) GetTable response.
 
On the other hand, on my Linux node, if I create a child interface on
partition 0x8001 and give it an IP address (while ib0 is still using the
0xffff pkey), I get the following error. You may want to investigate it:

[root@flopteron2 ~]# echo 0x8001 > /sys/class/net/ib0/create_child
[root@flopteron2 ~]# ifconfig ib0.8001 10.10.1.1
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
ib0.8001: multicast join failed for ff12:401b:8001:0:0:0:ffff:ffff, status -22
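
For reference, status -22 is -EINVAL on Linux, and the group in the message
has the shape of the IPoIB IPv4 broadcast group for P_Key 0x8001 with
link-local scope (2). Below is a minimal sketch that rebuilds the same string
from the scope and P_Key, assuming the ff1<scope>:401b:<pkey>:0:0:0:ffff:ffff
layout. The function name is hypothetical and this is not the ipoib driver's
code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the IPoIB IPv4 broadcast MGID the way the
 * kernel message prints it. Assumed layout:
 *   ff1<scope> : 401b : <pkey> : 0 : 0 : 0 : ffff : ffff
 * Returns the number of characters written (as snprintf does).
 */
static int ipoib_bcast_mgid_str(char *buf, size_t len,
                                unsigned scope, uint16_t pkey)
{
    return snprintf(buf, len, "ff1%x:401b:%x:0:0:0:ffff:ffff",
                    scope & 0xF, (unsigned)pkey);
}
```

Calling it with scope 2 and P_Key 0x8001 reproduces the group from the error
message, so the join being rejected is the one for the 0x8001 partition's
broadcast group.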

Thanks
Nitin



> 
> 
> -----Forwarded Message-----
> 
> From: Hal Rosenstock <halr at voltaire.com>
> To: Nitin Hande <Nitin.Hande at Sun.COM>
> Cc: openib <openib-general at openib.org>, Tom Duffy <Tom.Duffy at Sun.COM>
> Subject: Re: [openib-general] Solaris IPoIB MTU with OpenSM
> Date: 24 Feb 2005 08:42:23 -0500
> 
> Hi Nitin,
> 
> On Wed, 2005-02-23 at 17:19, Nitin Hande wrote:
> > Hal, 
> > 
> > [comments below]
> > On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote:
> > > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote:
> > > > So I tried the latest patches and preliminarily things seem to be
> > > > working fine. 
> > > 
> > > Yipee.
> > [snip..]
> > > 
> > > > 
> > > > So after this test above, I try to run snoop on the solaris interface
> > > > and get the following error message from the layer below IPoIB:
> > > > 
> > > > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
> > > > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY ffff Could not get list of
> > > > IBA multicast groups
> > > > 
> > > > My preliminary assumption is that OpenSM is not returning the list of
> > > > multicast groups that the ibd interface has joined. I will look at the
> > > > MADs tomorrow and try to ascertain that.
> > > 
> > > How does S10 request this ? Remember that if it is a GetTable and
> > > doesn't fit in a single MAD, it will be broken now. If that is the case,
> > > we will live with this until we have real RMPP.
> > Below is an example of a single GetTable request and response between
> > Solaris and OpenSM. OpenSM is not reporting the MC groups in the case of a
> > single request/response. I have also provided the MAD output for a single
> > GetTable request/response between the Solaris IPoIB driver and IBSRM below
> > this example.
> > 
> > Here is the MAD trace between Solaris and OpenSM:
> > Outgoing MAD:
> >         BaseVersion: 0x1
> >         MgmtClass: 0x3 - SubnAdm
> >         ClassVersion: 0x2
> >         R_Method: 0x12 - SubnAdmGetTable()
> >         Status: 0x0 - NO_ERROR
> >         ClassSpecific: 0x0
> >         TransactionID: 0x97651d1000000ec
> >         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> >      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec  .........vQ.....
> > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..............
> > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  ................
> > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00  ................
> > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > Incoming MAD:
> >         BaseVersion: 0x1
> >         MgmtClass: 0x3 - SubnAdm
> >         ClassVersion: 0x2
> >         R_Method: 0x92 -
> >         Status: 0x0 - NO_ERROR
> >         ClassSpecific: 0x0
> >         TransactionID: 0x97651d1000000ec
> >         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> >      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec  .........vQ.....
> > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8........w.....
> > 20: 00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00  ................
> > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  ................
> > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
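
The first 24 bytes of each dump above are the common MAD header. Below is a
minimal sketch of pulling the traced fields out of those bytes, assuming the
standard layout (base version, class, class version, method, status,
class-specific, transaction ID, attribute ID, reserved, attribute modifier)
with big-endian multi-byte fields. The struct and function names are
illustrative only, not real OpenSM or Solaris types.

```c
#include <stdint.h>

/* Illustrative struct, not a real OpenSM or Solaris type. */
struct mad_hdr {
    uint8_t  base_version;
    uint8_t  mgmt_class;
    uint8_t  class_version;
    uint8_t  method;        /* bit 0x80 set means "response" */
    uint16_t status;
    uint64_t tid;
    uint16_t attr_id;
};

/* Read big-endian (network order) integers byte by byte. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint64_t be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Parse the start of a MAD; buf must hold at least 24 bytes. */
static struct mad_hdr parse_mad_hdr(const uint8_t *buf)
{
    struct mad_hdr h;
    h.base_version  = buf[0];
    h.mgmt_class    = buf[1];
    h.class_version = buf[2];
    h.method        = buf[3];
    h.status        = be16(buf + 4);
    h.tid           = be64(buf + 8);
    h.attr_id       = be16(buf + 16);
    return h;
}
```

Run on the incoming MAD above, this gives method 0x92, which is the request's
0x12 (SubnAdmGetTable) with the response bit 0x80 set, plus the same
TransactionID and AttributeID as the request.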
> 
> It is likely failing the component checking in
> osm_sa_mcmember_record.c::__osm_sa_mcm_by_comp_mask_cb due to an endian
> issue. Either you can debug this code or I will early next week.
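
To illustrate the kind of endian problem suspected here: the component mask
is a 64-bit field that arrives big-endian (bytes 0x30 to 0x37 of the request
dump). Below is a minimal sketch, not the OpenSM code, contrasting a
byte-wise big-endian load with a raw copy into a host integer. The function
names are made up.

```c
#include <stdint.h>
#include <string.h>

/* Portable load: assemble the value from wire (big-endian) byte order. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Raw copy: the result depends on the host's byte order. */
static uint64_t load_raw64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For the bytes 00 00 00 00 00 00 80 b4, load_be64() returns 0x80b4 on any
host. load_raw64() returns 0x80b4 only on a big-endian host; on a
little-endian host it returns 0xb480000000000000, so testing it against
host-order mask constants without a byte swap would check the wrong bits.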
> 
> The component mask in the request is 0x80b4 so the only components
> checked are QKey (0xb1b), MTU (exactly 2048 (4)), PKey (0xffff), and
> scope (2).
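
The decoding of 0x80b4 can be reproduced mechanically. Below is a minimal
sketch that maps each set bit to a component name, assuming the bit positions
follow the MCMemberRecord field order (MGID is bit 0, PortGID is bit 1, and
so on). The table and function names are illustrative and are not the OpenSM
macro names.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Component names by bit position, following the MCMemberRecord field
 * order (assumed from the IBA spec; these are not the OpenSM macro names).
 */
static const char *const mcr_comp_names[] = {
    "MGID", "PortGID", "QKey", "MLID", "MTUSelector", "MTU", "TClass",
    "PKey", "RateSelector", "Rate", "PacketLifeTimeSelector",
    "PacketLifeTime", "SL", "FlowLabel", "HopLimit", "Scope",
    "JoinState", "ProxyJoin",
};

/*
 * Store the names of the components selected by comp_mask into out[]
 * (at most max entries) and return how many were stored.
 */
static size_t mcr_decode_comp_mask(uint64_t comp_mask,
                                   const char *out[], size_t max)
{
    size_t n = 0;
    size_t nbits = sizeof mcr_comp_names / sizeof mcr_comp_names[0];

    for (size_t bit = 0; bit < nbits && n < max; bit++) {
        if (comp_mask & ((uint64_t)1 << bit))
            out[n++] = mcr_comp_names[bit];
    }
    return n;
}
```

For 0x80b4 this yields QKey, MTUSelector, MTU, PKey, and Scope, which is the
set listed above (the "exactly" in "exactly 2048" is the MTUSelector
component).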
> 
> If I don't hear anything by next week, I will work on this then.
> 
> Thanks.
> 
> -- Hal
> 
> > Here is the transaction between IBSRM and Solaris IPoIB driver. 
> > 
> > Outgoing MAD:
> >         BaseVersion: 0x1
> >         MgmtClass: 0x3 - SubnAdm
> >         ClassVersion: 0x2
> >         R_Method: 0x12 - SubnAdmGetTable()
> >         Status: 0x0 - NO_ERROR
> >         ClassSpecific: 0x0
> >         TransactionID: 0x8fecc610000009a
> >         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> >      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 12 00 00 00 00 08 fe cc 61 00 00 00 9a  ...........a....
> > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..............
> > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  ................
> > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 50: 00 00 00 00 00 00 00 00 81 23 45 68 00 00 84 00  .........#Eh....
> > 60: 80 01 00 00 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > Incoming MAD:
> >         BaseVersion: 0x1
> >         MgmtClass: 0x3 - SubnAdm
> >         ClassVersion: 0x2
> >         R_Method: 0x92 -
> >         Status: 0x0 - NO_ERROR
> >         ClassSpecific: 0x0
> >         TransactionID: 0x8fecc610000009a
> >         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> >      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 92 00 00 00 00 08 fe cc 61 00 00 00 9a  ...........a....
> > 10: 00 38 00 00 00 00 00 00 01 01 73 00 00 00 00 01  .8........s.....
> > 20: 00 00 01 40 00 00 00 00 00 00 00 00 00 07 00 00  ...@............
> > 30: 00 00 00 00 00 00 00 00 ff 12 40 1b 80 01 00 00  ..........@.....
> > 40: 00 00 00 00 00 00 00 09 00 00 00 00 00 00 00 00  ................
> > 50: 00 00 00 00 00 00 00 00 81 23 45 68 c0 04 84 00  .........#Eh....
> > 60: 80 01 83 8d 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> > 70: ff 12 40 1b 80 01 00 00 00 00 00 00 00 00 00 01  ..@.............
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 90: 81 23 45 68 c0 03 84 00 80 01 83 8d 00 00 00 00  .#Eh............
> > a0: 20 00 00 00 00 00 00 00 ff 12 40 1b 80 01 00 00   .........@.....
> > b0: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  ................
> > c0: 00 00 00 00 00 00 00 00 81 23 45 68 c0 00 84 00  .........#Eh....
> > d0: 80 01 83 8d 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> > e0: ff 12 60 1b 80 01 00 00 00 00 00 01 ff 76 5b 01  ..`..........v[.
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 
> > Thanks
> > Nitin
> 
> 