[Users] IPoIB not working on Windows 2008 r2 - need help

Orion Poplawski orion at cora.nwra.com
Fri Jun 7 12:38:44 PDT 2013


On 06/07/2013 12:56 PM, Hal Rosenstock wrote:

>         Would you send me the output of an ibnetdiscover for your subnet ?
>
>
> Which is SM host ?

saga is the SM host.

>
>     #
>     # Topology file: generated on Fri Jun  7 10:43:36 2013
>     #
>     # Initiated from node 0019bbffff005850 port 0019bbffff005851
>
>     vendid=0x66a
>     devid=0xb924
>     sysimgguid=0x66a00d8000242
>     switchguid=0x66a00d8000242(66a00d8000242)
>     Switch  24 "S-00066a00d8000242"         # "InfinIO 9024 Switch " enhanced
>     port 0 lid 2 lmc 0
>     [1]     "H-0005ad00000c5c3c"[1](5ad00000c5c3d)          # "andrew
>     mthca0" lid 15 4xSDR
>     [6]     "H-001708ffffd09df8"[1](1708ffffd09df9)                 #
>     "alexandria2 HCA-1" lid 4 4xSDR
>     [8]     "H-001708ffffd09df8"[2](1708ffffd09dfa)                 #
>     "alexandria2 HCA-1" lid 5 4xSDR
>     [10]    "H-0019bbffff005850"[1](19bbffff005851)                 # "saga
>     mthca0" lid 1 4xSDR
>     [11]    "H-0019bbffff003898"[2](19bbffff00389a)                 #
>     "sfcomp1 mthca0" lid 9 4xSDR
>     [12]    "H-001a4bffff0c20c8"[1](1a4bffff0c20c9)                 # "earth
>     mthca0" lid 13 4xSDR
>     [20]    "H-0005ad00000c5cec"[1](5ad00000c5ced)          # "MT25204
>     InfiniHostLx Mellanox Technologies" lid 16 4xSDR
>     [23]    "H-0019bbffff003898"[1](19bbffff003899)                 #
>     "sfcomp1 mthca0" lid 8 4xSDR
>
>     vendid=0x2c9
>     devid=0x6274
>     sysimgguid=0x5ad00000c5cef
>     caguid=0x5ad00000c5cec
>     Ca      1 "H-0005ad00000c5cec"          # "MT25204 InfiniHostLx Mellanox
>     Technologies"
>     [1](5ad00000c5ced)      "S-00066a00d8000242"[20]                # lid 16
>     lmc 0 "InfinIO 9024 Switch " lid 2 4xSDR
>
>     vendid=0x1708
>     devid=0x6278
>     sysimgguid=0x1a4bffff0c20cb
>     caguid=0x1a4bffff0c20c8
>     Ca      2 "H-001a4bffff0c20c8"          # "earth mthca0"
>     [1](1a4bffff0c20c9)     "S-00066a00d8000242"[12]                # lid 13
>     lmc 0 "InfinIO 9024 Switch " lid 2 4xSDR
>
>     vendid=0x1708
>     devid=0x6278
>     sysimgguid=0x19bbffff00389b
>     caguid=0x19bbffff003898
>     Ca      2 "H-0019bbffff003898"          # "sfcomp1 mthca0"
>     [1](19bbffff003899)     "S-00066a00d8000242"[23]                # lid 8
>     lmc 0 "InfinIO 9024 Switch " lid 2 4xSDR
>     [2](19bbffff00389a)     "S-00066a00d8000242"[11]                # lid 9
>     lmc 0 "InfinIO 9024 Switch " lid 2 4xSDR
>
>     vendid=0x5ad
>     devid=0x6274
>     sysimgguid=0x5ad00000c5c3f
>     caguid=0x5ad00000c5c3c
>     Ca      1 "H-0005ad00000c5c3c"          # "andrew mthca0"
>     [1](5ad00000c5c3d)      "S-00066a00d8000242"[1]         # lid 15 lmc 0
>     "InfinIO 9024 Switch " lid 2 4xSDR
>
>     vendid=0x1708
>     devid=0x6278
>     sysimgguid=0x1708ffffd09dfb
>     caguid=0x1708ffffd09df8
>     Ca      2 "H-001708ffffd09df8"          # "alexandria2 HCA-1"
>     [1](1708ffffd09df9)     "S-00066a00d8000242"[6]         # lid 4 lmc 0
>     "InfinIO 9024 Switch " lid 2 4xSDR
>     [2](1708ffffd09dfa)     "S-00066a00d8000242"[8]         # lid 5 lmc 0
>     "InfinIO 9024 Switch " lid 2 4xSDR
>
>     vendid=0x1708
>     devid=0x6278
>     sysimgguid=0x19bbffff005853
>     caguid=0x19bbffff005850
>     Ca      2 "H-0019bbffff005850"          # "saga mthca0"
>     [1](19bbffff005851)     "S-00066a00d8000242"[10]                # lid 1
>     lmc 0 "InfinIO 9024 Switch " lid 2 4xSDR
>
>
>
>


>
>     # saquery -m 0xc000
>                      PortGid.................fe80::1:5:ad00:c:5c3d (Topspin
>     DDR-HCAe LX x8)
>                      PortGid.................fe80::1:19:bbff:ff00:5851 (saga
>     mthca0)
>                      PortGid.................fe80::1:19:bbff:ff00:3899
>     (sfcomp1 mthca0)
>                      PortGid.................fe80::1:1a:4bff:ff0c:20c9 (HP
>     Lion Cub 128MB)
>                      PortGid.................fe80::5:ad00:c:5ced (MT25204
>     InfiniHostLx Mellanox Technologies)
>                      PortGid.................fe80::1:17:8ff:ffd0:9df9
>     (alexandria2 HCA-1)
>
>
>     Seems like I may have two entries for the 5:ad00:c:5ced device?
>
> Looks different to me from 5:ad00:c:5c3d, which is the Topspin one

Ah, didn't catch that.  The Topspin then is andrew.

>
>     Perhaps updating the firmware led to that (now it is MT25204 instead of
>     Topspin).
>
> Looks like your 2 subnets are "interconnected" so they're not really 2
> disjoint subnets! Is your other subnet 0xfe80::5? Looking at your
> ibnetdiscover file, there's only 1 switch, so are you running 2 SMs (one for
> each subnet) over the same topology? If so, that doesn't work.

I should only have 2 subnets, and we should only be seeing the 0xfe80::1
subnet here.  (There is a 0xfe80::2 subnet that consists only of two machines,
amos and andrew, directly connected together.)  For the MT25204 Windows
machine, I believe 5:ad00:c:5ced is the GUID, so it looks like it may have a
prefix of 0xfe80::0?  I confirmed that the SM service on the Windows machine
(fontdb) is disabled and stopped, so I have no idea why it isn't getting a
prefix of 0xfe80::1.
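
To make the prefix/GUID reading above concrete, here is a minimal sketch that splits a port GID into its upper 64 bits (subnet prefix) and lower 64 bits (port GUID). It assumes the GIDs in the saquery output can be parsed as ordinary IPv6-style addresses once any stray "__" from the mail quoting is dropped; the helper name split_gid is made up for illustration and is not part of any InfiniBand tool.

```python
import ipaddress


def split_gid(gid):
    """Split a 128-bit port GID into (subnet_prefix, port_guid).

    Assumption: the GID is written in IPv6 notation, as saquery prints it.
    The upper 64 bits are the subnet prefix, the lower 64 bits the port GUID.
    """
    value = int(ipaddress.IPv6Address(gid))
    return value >> 64, value & ((1 << 64) - 1)


# GIDs taken from the saquery output in this thread, with the "__"
# mail-quoting residue removed (an assumption about the original text).
for gid in ("fe80::1:19:bbff:ff00:5851", "fe80::5:ad00:c:5ced"):
    prefix, guid = split_gid(gid)
    print(f"{gid}: prefix=0x{prefix:016x} guid=0x{guid:016x}")
```

Under those assumptions, saga's port lands in prefix 0xfe80000000000001, while the MT25204 port comes out with prefix 0xfe80000000000000 and GUID 0x0005ad00000c5ced, which matches the port GUID that ibnetdiscover shows on switch port 20.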


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                   http://www.nwra.com


