[Users] IPoIB not working on Windows 2008 r2 - need help

Orion Poplawski orion at cora.nwra.com
Fri Jun 7 13:35:36 PDT 2013


On 06/07/2013 02:23 PM, Hal Rosenstock wrote:

>         Looks like your 2 subnets are "interconnected" so they're not really 2
>         disjoint subnets! Is your other subnet 0xfe80::5? Looking at your
>         ibnetdiscover file, there's only 1 switch, so are you running 2 SMs
>         (one for each subnet) over the same topology? If so, that doesn't work.
>
>
>     I should only have 2 subnets, and we should only be seeing the 0xfe80::1
>     subnet here (there is a 0xfe80::2 subnet that consists only of two
>     machines, amos and andrew, directly connected together).  With the MT25204
>     Windows machine, 5:ad00:c:5ced is the GUID I believe, so it looks like it
>     may have a prefix of 0xfe80::0?  I confirmed that the SM service on the
>     Windows machine (fontdb) is disabled and stopped.  So I have no idea why
>     it isn't getting a prefix of 0xfe80::1.
>
> Yes, I see now. It does have the default subnet prefix rather than the one you
> configured in the SM. This is consistent with what you asked about earlier,
> which is probably why you asked. I don't know whether or not non-default
> subnet prefixes work on Windows. Is there any reason you want to run this with
> other than the default subnet prefix? If not, can you try that and see if
> things work? While it is legal to have different IB subnets on the same IPoIB
> subnet, that requires an IB router and isn't your intent anyway.


This is one reason I'm running with a non-default subnet ID:

http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

and I do have some multi-homed machines (amos and andrew above) and may add 
some more.
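
For reference, a port GID is just the 64-bit subnet prefix followed by the
64-bit port GUID, so the prefix a port actually took can be read off the top
half of its GID. A minimal Python sketch, assuming the GUID quoted above
expands to 0x0005ad00000c5ced and that 0xfe80::1 is shorthand for the prefix
0xfe80000000000001; the helper names are made up for illustration:

```python
import ipaddress

# InfiniBand default subnet prefix (what the Windows port appears to have).
DEFAULT_PREFIX = 0xFE80000000000000
# Prefix configured in the SM, written 0xfe80::1 in this thread (assumed expansion).
CONFIGURED_PREFIX = 0xFE80000000000001
# Port GUID quoted above as 5:ad00:c:5ced (assumed expansion).
PORT_GUID = 0x0005AD00000C5CED


def make_gid(subnet_prefix, port_guid):
    """Build a 128-bit GID: subnet prefix in the high 64 bits, GUID in the low 64."""
    return ipaddress.IPv6Address((subnet_prefix << 64) | port_guid)


def split_gid(gid):
    """Split a GID string into (subnet_prefix, port_guid) integers."""
    value = int(ipaddress.IPv6Address(gid))
    return value >> 64, value & 0xFFFFFFFFFFFFFFFF


def uses_default_prefix(gid):
    """Return True if the GID still carries the default subnet prefix."""
    prefix, _ = split_gid(gid)
    return prefix == DEFAULT_PREFIX


if __name__ == "__main__":
    for prefix in (DEFAULT_PREFIX, CONFIGURED_PREFIX):
        gid = make_gid(prefix, PORT_GUID)
        print(gid, "default prefix" if uses_default_prefix(str(gid)) else "non-default prefix")
```

Splitting a GID reported for the Windows port this way would show directly
whether it is still on the default prefix.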

> Also, if you turn on log verbosity on OpenSM temporarily and send me the log
> for that run, I could see what is going on when it tries to set the
> non-default subnet prefix on the Windows node. Given the log you sent, I can
> only imagine that the SMA on the Windows node is ack'ing the PortInfo set
> which sets the subnet prefix but not really acting on it properly.
> -- Hal

There are a lot of different verbosity levels.  Which one would be useful
without producing too much output?

Thanks!



-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                   http://www.nwra.com


