[openib-general] Solaris IPoIB MTU with OpenSM
Hal Rosenstock
halr at voltaire.com
Wed Feb 16 13:26:14 PST 2005
On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> Hal,
>
> On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote:
> > On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > > I have a hunch about what's happening here, but before I jump to any
> > > conclusions, I am seeing some other issue between Solaris IPoIB driver
> > > and OpenSM. After joining the Broadcast group, the PathRecord Response
> > > coming from OpenSM signals an error with Invalid GUID.
> >
> > Is the MTU from the PathRecord used ? Is that the theory ? So these are
> > one and the same issue. Thanks.
> No, I was thinking more of an endian issue between IBD and the layer
> beneath it during the MCMemberRecord response. The MTU is not dependent
> on the PathRecord response. Thanks to Tom, we have figured out a way of
> consistently reproducing this on our systems here. The way to reproduce
> is (basically start everything fresh):
> 1. rmmod {ib_mthca, umad and ipoib}, stop opensm
> 2. unplumb ibd driver and modunload ibd on Solaris,
> 3. modprobe and restart opensm
> 4. plumb ibd interface.
> You should see ibd setting the MTU size to 252. Some of the above steps
> may be unnecessary. From the trace, it looks like OpenSM is reporting 256
> bytes of MTU to ipoib in the MCMemberRecord response.
>
> Here is the trace with the 256-byte MTU:
>
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d100000096
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
>
> 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
> 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96 .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8..............
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>
> Incoming MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x92 -
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d100000096
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
>
> 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
> 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96 .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8........w.....
> 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00 ...L............
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00 ................
> 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>
> And on other occasions OpenSM reports the 2048-byte MTU:
>
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10000009a
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
>
> 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
> 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8..............
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>
> Incoming MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x92 -
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10000009a
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
>
> 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
> 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 9a .........vQ.....
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8........w.....
> 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00 ...L............
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 ..........@.....
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 04 00 ................
> 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00 ........ .......
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [output formatted manually]
These both have the "exactly" selector issue, which I fixed, but I think you
haven't picked that fix up from the tree. Is that a problem for Solaris, or
does it ignore this in the response ?
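
For reference, the selector bits in the two incoming MADs can be read
straight off the dumps. The sketch below is only an illustration in Python,
not OpenSM or ibd code. It assumes the SA record data starts at offset 0x38
(24-byte MAD header, 12-byte RMPP header, 20-byte SA header) and that the
MTU, Rate and PacketLifeTime bytes sit at record offsets 38, 42 and 43, as
in the IBA MCMemberRecord layout. The names byte_at, split_selector and
decode are made up for this example.

```python
# Rows 0x50 and 0x60 of the two incoming MADs, copied from the trace.
ROWS_256 = {
    0x50: "00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00",
    0x60: "ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00",
}
ROWS_2048 = {
    0x50: "00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 04 00",
    0x60: "ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00",
}

# Assumption: MAD header (24) + RMPP header (12) + SA header (20) = 0x38.
SA_DATA_OFFSET = 0x38

# Assumption: byte offsets inside MCMemberRecord of the fields whose top
# two bits are a selector and whose low six bits are the value.
FIELD_OFFSETS = {"mtu": 38, "rate": 42, "packet_life": 43}

SELECTOR_NAMES = {
    0: "greater than",
    1: "less than",
    2: "exactly",
    3: "largest available",
}


def byte_at(rows, offset):
    """Return the byte at an absolute MAD offset from 16-byte dump rows."""
    base = offset & ~0xF
    return int(rows[base].split()[offset - base], 16)


def split_selector(value):
    """Split a byte into (2-bit selector, 6-bit value)."""
    return value >> 6, value & 0x3F


def decode(rows):
    """Decode the selector/value pair of each field in FIELD_OFFSETS."""
    return {
        name: split_selector(byte_at(rows, SA_DATA_OFFSET + rel))
        for name, rel in FIELD_OFFSETS.items()
    }


if __name__ == "__main__":
    for label, rows in (("256 trace", ROWS_256), ("2048 trace", ROWS_2048)):
        for name, (selector, value) in decode(rows).items():
            print(label, name, SELECTOR_NAMES[selector], value)
```

With the rows above, all three selectors decode to 0 (greater than) rather
than 2 (exactly) in both responses. Whether that is precisely the issue
referred to here is an assumption on the part of this sketch.
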
I have a theory for how the different MTUs (1 (256) and 4 (2048)) occur
but need a little time to validate it.
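
To relate the MTU codes in the traces to the 252 that ibd reports, here is
a small sketch in Python. It assumes the IBA MTU encoding (1 = 256 up to
5 = 4096) and that the interface MTU is the link MTU minus the 4-byte IPoIB
encapsulation header. ipoib_mtu is a hypothetical helper, not a function in
either driver.

```python
# IBA MTU codes and the payload size in bytes that each one stands for.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

# IPoIB prepends a 4-byte encapsulation header to every packet.
IPOIB_ENCAP_LEN = 4


def ipoib_mtu(mtu_code):
    """Return the IP-level MTU for an IBA MTU code (assumed derivation)."""
    return IB_MTU_BYTES[mtu_code] - IPOIB_ENCAP_LEN


if __name__ == "__main__":
    for code in (0x01, 0x04):  # MTU bytes seen in the two responses
        print(code, IB_MTU_BYTES[code], ipoib_mtu(code))
```

Under these assumptions, code 1 gives the 252 seen on the Solaris side, and
code 4 would give 2044.
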
-- Hal