[openib-general] Solaris IPoIB MTU with OpenSM

Hal Rosenstock halr at voltaire.com
Wed Feb 16 13:26:14 PST 2005


On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> Hal,
> 
> On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote:
> > On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > > I have a hunch for what's happening here, but before I jump to any
> > > conclusions, I am seeing some other issue between Solaris IPoIB driver
> > > and OpenSM. After joining the Broadcast group, the PathRecord Response
> > > coming from OpenSM signals an error with Invalid GUID. 
> > 
> > Is the MTU from the PathRecord used ? Is that the theory ? So these are
> > one and the same issue. Thanks.
> No, I was thinking more of an endian issue between IBD and the layer
> beneath it during the MCMemberRecord response. The mtu is not dependent
> on PathRecord Response. Thanks to Tom, we have figured out a way of
> consistently reproducing this on our systems here. The way to reproduce
> is (basically start everything fresh):
> 1. rmmod {ib_mthca, umad and ipoib}, stop opensm
> 2. unplumb ibd driver and modunload ibd on Solaris,
> 3. modprobe and restart opensm
> 4. plumb ibd interface.
> You should see ibd setting the mtu size to 252. Some of the above steps
> may be unnecessary. From the trace, it looks like OpenSM is reporting 256
> bytes of MTU to ipoib for MCMemberRecord response.
> 
> Here is the trace of 256 sized MTU:
> 
> Outgoing MAD:
>         BaseVersion: 0x1
>         MgmtClass: 0x3 - SubnAdm
>         ClassVersion: 0x2
>         R_Method: 0x12 - SubnAdmGetTable()
>         Status: 0x0 - NO_ERROR
>         ClassSpecific: 0x0
>         TransactionID: 0x97651d100000096
>         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..............
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  ................
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 
> Incoming MAD:
>         BaseVersion: 0x1
>         MgmtClass: 0x3 - SubnAdm
>         ClassVersion: 0x2
>         R_Method: 0x92 -
>         Status: 0x0 - NO_ERROR
>         ClassSpecific: 0x0
>         TransactionID: 0x97651d100000096
>         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8........w.....
> 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L............
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  ................
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00  ................
> 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 
> And on other occasions where OpenSM reports the 2048 sized MTU:
> 
> Outgoing MAD:
>         BaseVersion: 0x1
>         MgmtClass: 0x3 - SubnAdm
>         ClassVersion: 0x2
>         R_Method: 0x12 - SubnAdmGetTable()
>         Status: 0x0 - NO_ERROR
>         ClassSpecific: 0x0
>         TransactionID: 0x97651d10000009a
>         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..............
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  ................
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 
> Incoming MAD:
>         BaseVersion: 0x1
>         MgmtClass: 0x3 - SubnAdm
>         ClassVersion: 0x2
>         R_Method: 0x92 -
>         Status: 0x0 - NO_ERROR
>         ClassSpecific: 0x0
>         TransactionID: 0x97651d10000009a
>         AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 9a  .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8........w.....
> 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L............
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  ................
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 04 00  ................
> 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00  ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [output formatted manually]

These both have the "exactly" selector issue, which I fixed, but I think
you haven't picked that up from the tree. Is that a problem for Solaris, or
does it ignore this in the response?
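
For anyone reading the dumps by hand, here is a rough sketch of how the
selector bits can be pulled out of the two relevant rows of the first
incoming MAD. It assumes the usual SA MAD layout (MCMemberRecord data
starting at offset 0x38, so the MTU octet is at 0x5e and the Rate and
PacketLifeTime octets at 0x62 and 0x63) and the usual selector encoding
(0 = greater than, 1 = less than, 2 = exactly, 3 = largest available).
The offsets and the helper names are my own reading, not something taken
from the OpenSM or ibd sources.

```python
# Rows 0x50 and 0x60 of the first "Incoming MAD" dump, copied from the trace.
ROW_50 = bytes.fromhex("00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00")
ROW_60 = bytes.fromhex("ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00")

# Assumed selector encoding for the 2 high-order bits of each octet.
SELECTOR_NAMES = {
    0: "greater than",
    1: "less than",
    2: "exactly",
    3: "largest available",
}


def split_selector(octet):
    """Split an octet into (2-bit selector, 6-bit value)."""
    return octet >> 6, octet & 0x3F


def decode_rows(row_50, row_60):
    """Decode the selector/value pairs, assuming the offsets described above."""
    return {
        "MTU": split_selector(row_50[0x0E]),             # MAD offset 0x5e
        "Rate": split_selector(row_60[0x02]),            # MAD offset 0x62
        "PacketLifeTime": split_selector(row_60[0x03]),  # MAD offset 0x63
    }


if __name__ == "__main__":
    for name, (sel, val) in decode_rows(ROW_50, ROW_60).items():
        print(f"{name}: selector={sel} ({SELECTOR_NAMES[sel]}), value={val}")
```

Under those assumptions all three selectors in the response decode as 0
rather than 2 (exactly), and the MTU value decodes as 1.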

I have a theory for how the different MTUs (1 (256) and 4 (2048)) occur
but need a little time to validate it.
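
As a sanity check on the numbers in the trace, here is a small sketch of
the mapping from the IB MTU code to a byte count, and from there to the
interface MTU. The MTU octet in the two responses is 0x01 and 0x04. The
4-byte subtraction is an assumption on my part (the IPoIB encapsulation
header); it is consistent with ibd showing 252 when the group MTU is 256,
but I have not checked how ibd actually computes it. The names below are
made up for illustration.

```python
# IB MTU codes as carried in the 6-bit MTU field.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

# Assumed size of the IPoIB encapsulation header, in bytes.
IPOIB_ENCAP_HEADER = 4


def ipoib_mtu(mtu_octet):
    """Return the interface MTU implied by an MCMemberRecord MTU octet.

    The top 2 bits (the selector) are masked off before the lookup.
    """
    code = mtu_octet & 0x3F
    return IB_MTU_BYTES[code] - IPOIB_ENCAP_HEADER


if __name__ == "__main__":
    for octet in (0x01, 0x04):
        print(f"MTU octet 0x{octet:02x} -> interface MTU {ipoib_mtu(octet)}")
```

With the octets from the two traces this gives 252 and 2044.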

-- Hal



