[Users] IPoIB not working on Windows 2008 r2 - need help

Orion Poplawski orion at cora.nwra.com
Fri Jun 7 08:14:20 PDT 2013


On 06/07/2013 04:31 AM, Hal Rosenstock wrote:
>
>
> On Thu, Jun 6, 2013 at 3:32 PM, Orion Poplawski <orion at cora.nwra.com> wrote:
>
>     I'm trying for the first time to get IPoIB working on one of our Windows
>     servers.  The network is working fine between some Linux machines.  Details:
>
>     InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA]
>     Windows Server 2008 r2
>     MLNX_WinOF_VPI_2_1_2_win7_x64.msi (as recommended by the mellanox
>     download page for InfiniHost III adapters)
>
>     I don't notice any errors, the adapter shows up fine and I can configure
>     it with a static IP address.  After configuring it (or after boot) I can
>     ping it from another machine for about 10 seconds before it stops
>     responding.  When I ping out from the machine at this point, the icmp
>     packets are being sent out the main ethernet interface (which is a
>     different IP network) and I can see them get to our router.  ibdiagnet
>     does not report any errors.  ipconfig and netstat -r seem fine.
>
>     I see the following in my opensm log:
>
> IPoIB in Windows deletes from the IPoIB broadcast IB multicast group before
> joining so if that port isn't a member of that MC group you will see this so
> these aren't necessarily "bad" from an SM perspective.

I thought that might be the case, thanks.

> How is your partition file for OpenSM setup ? You should have the ipoib flag
> on for the default partition.
> Which OpenSM are you using here ? A Windows or Linux node ? Which version ?

IPoIB is working fine among my Linux machines; I'm just trying to add Windows
to the mix.  I'm running opensm 3.3.15 on SL 6.

/etc/rdma/partitions:
Default=0x7fff, ipoib : ALL=full ;
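
As a sanity check on that line, here is a minimal Python sketch that splits a
one-line partition definition of this simple form into name, P_Key, flags and
members, so the ipoib flag can be checked programmatically.  It is not OpenSM's
parser and handles only the single-line syntax shown above; the function name
is hypothetical.

```python
def parse_partition_line(line: str) -> dict:
    """Parse a simple 'Name=pkey, flag : member=type ;' partition line.

    Illustrative only: this is NOT OpenSM's parser and it ignores
    comments, multi-line definitions and any other syntax.
    """
    body = line.strip().rstrip(";")
    definition, _, member_part = body.partition(":")

    # Left of the colon: "Name=pkey" followed by optional flags.
    fields = [f.strip() for f in definition.split(",") if f.strip()]
    name, _, pkey_text = fields[0].partition("=")
    flags = {f.lower() for f in fields[1:]}

    # Right of the colon: comma-separated "port=membership" entries.
    members = {}
    for entry in member_part.split(","):
        entry = entry.strip()
        if entry:
            port, _, membership = entry.partition("=")
            members[port.strip()] = membership.strip()

    return {
        "name": name.strip(),
        "pkey": int(pkey_text.strip(), 16),
        "flags": flags,
        "members": members,
    }


part = parse_partition_line("Default=0x7fff, ipoib : ALL=full ;")
print(part["name"], hex(part["pkey"]), "ipoib" in part["flags"], part["members"])
```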


> You should check what saquery -m 0xc000 says after looking at saquery -g to
> make sure that the IPoIB broadcast group ( ff12:401b:ffff::ffff:ffff ) says
> MLID 0xc000.

That is fine on my opensm machine.  I don't seem to have saquery on the 
Windows machine.  I'm going to try switching from MLNX_WinOF_VPI to winOFED 3.1.
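
For reference, the broadcast group address Hal mentions can be derived from the
partition key.  Below is a minimal Python sketch, assuming the RFC 4391 layout
for the IPv4 broadcast MGID (ff1<scope>:401b:<P_Key>::ffff:ffff, with the
full-membership bit set in the P_Key) and link-local scope.  The function name
is hypothetical.

```python
import ipaddress


def ipoib_broadcast_mgid(pkey: int, scope: int = 0x2) -> str:
    """Return the IPv4 IPoIB broadcast MGID for a partition key.

    Assumes the RFC 4391 layout; scope 0x2 is link-local.
    """
    full_pkey = (pkey | 0x8000) & 0xFFFF  # set the full-membership bit
    groups = [0xFF10 | (scope & 0xF), 0x401B, full_pkey, 0, 0, 0, 0xFFFF, 0xFFFF]

    # Pack the eight 16-bit groups into one 128-bit value.
    value = 0
    for group in groups:
        value = (value << 16) | group
    return str(ipaddress.IPv6Address(value))


# Default partition from /etc/rdma/partitions above.
print(ipoib_broadcast_mgid(0x7FFF))
```

With the default partition key 0x7fff this yields the group quoted above, so a
different P_Key on the Windows side would show up as a different MGID.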

> Also, is the HCA really 8x DDR as the NodeDescription appears (Topspin
> DDR-HCAe LX x8) ?

Yes. PCIe x8, DDR.

> -- Hal

Thanks.


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                   http://www.nwra.com


