[ewg] Infiniband Interoperability

Matt Breitbach matthewb at flash.shanje.com
Wed Jun 30 07:33:45 PDT 2010


Well, let me throw out a little about the environment:

 

We are running one SuperMicro 4U system with a Mellanox InfiniHost III EX
card with 128MB RAM.  This box is the OpenSolaris box.  It's running the
OpenSolaris InfiniBand stack, but no SM.  Both of its ports are cabled to
ports 1 and 2 on the IB switch.

 

The other systems are in a SuperMicro Bladecenter.  The switch in the
BladeCenter is an InfiniScale III switch with 10 internal ports and 10
external ports.

 

3 blades are connected with Mellanox ConnectX Mezzanine cards.  1 blade is
connected with an InfiniHost III EX Mezzanine card.

 

One of the blades is running CentOS and the 1.5.1 OFED release.  OpenSM is
running on that system, and is the only SM running on the network.  This
blade is using a ConnectX Mezzanine card.

 

One blade is running Windows 2008 with the latest OFED drivers installed.
It is using an InfiniHost III EX Mezzanine card.

 

One blade is running Windows 2008 R2 with the latest OFED drivers installed.
It is using a ConnectX Mezzanine card.

 

One blade has been switching between Windows 2008 R2 and CentOS with Xen.
Under Windows it runs the latest OFED drivers; under CentOS it runs the 1.5.2
RC2.  That blade is using a ConnectX Mezzanine card.

 

All of the firmware has been updated on the Mezzanine cards, the PCI-E
InfiniHost III EX card, and the switch.  All of the Windows boxes are
configured to use Connected mode.  I have not changed any other settings on
the Linux boxes.
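
If it helps, the link parameters on the Linux side can be sanity-checked with
ibstat from infiniband-diags; the comments below describe what to look for
rather than output captured from this fabric:

  # Show port state, link rate, firmware level and the SM LID this port sees
  ibstat
  # Things worth confirming: "State: Active", "Rate: 20" (4x DDR),
  # the expected "Firmware version", and the same "SM lid" on every node
  # if only one SM is supposed to be active.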

 

As of right now, the network seems stable.  I've been running pings for the
last 12 hours, and nothing has dropped.

 

I did notice some odd entries in the OpenSM log, though, that I do not
believe belong there.

 

Jun 30 06:56:26 832438 [B5723B90] 0x02 -> log_notice: Reporting Generic
Notice type:3 num:67 (Mcast group deleted) from LID:6
GID:ff12:1405:ffff::3333:1:2

Jun 30 06:57:53 895990 [B5723B90] 0x02 -> log_notice: Reporting Generic
Notice type:3 num:66 (New mcast group created) from LID:6
GID:ff12:1405:ffff::3333:1:2

Jun 30 07:18:06 770861 [B6124B90] 0x02 -> log_notice: Reporting Generic
Notice type:3 num:67 (Mcast group deleted) from LID:6
GID:ff12:1405:ffff::3333:1:2

Jun 30 07:19:14 835273 [B5723B90] 0x02 -> log_notice: Reporting Generic
Notice type:3 num:66 (New mcast group created) from LID:6
GID:ff12:1405:ffff::3333:1:2

 

 

I would not think that mcast groups should be created or deleted when no new
adapters are being added to the network, especially in a network this small.
Is it odd to see those messages?
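
If it's useful, the groups the SA currently knows about can be dumped with
saquery (also from infiniband-diags) to see whether that particular group
really is coming and going; treat this as a sketch and check the local man
page for the exact flags:

  # List the multicast groups registered with the subnet administrator
  saquery -g
  # List the ports joined to each group
  saquery -m
  # The MGID from the log entries (ff12:1405:ffff::3333:1:2) can then be
  # matched against the groups reported here.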

 

Also, I get a warning when I run ibdiagnet: "Suboptimal rate for group.
Lowest member rate: 20Gbps > group-rate: 10gbps"
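
If that warning means the IPoIB group was created at the SDR rate while all
of its members are DDR, one possible fix (untested here, and assuming the
stock /etc/opensm/partitions.conf location and that every port in the group
really does run at 20Gbps) would be to raise the group rate in OpenSM's
partition configuration:

  # /etc/opensm/partitions.conf -- hypothetical example
  # rate=6 is 20 Gb/s (4x DDR); mtu=4 is a 2048-byte MTU
  Default=0x7fff, ipoib, rate=6, mtu=4 : ALL=full;
  # Restart opensm afterwards so the group is re-created with the new rate.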

 

I'm also concerned about a few things in the "PM Counters Info" section of
the ibdiagnet output, as follows:

 

-W- lid=0x0003 guid=0x003048ffffa12591 dev=47396 Port=1

      Performance Monitor counter     : Value

      symbol_error_counter            : 0xffff (overflow)

-W- lid=0x0004 guid=0x0002c9020029a492 dev=25208 MT25208/P2

      Performance Monitor counter     : Value

      symbol_error_counter            : 0xffff (overflow)

-W- lid=0x0003 guid=0x003048ffffa12591 dev=47396 Port=18

      Performance Monitor counter     : Value

      symbol_error_counter            : 0xffff (overflow)

      port_xmit_constraint_errors     : 0xff (overflow)

-W- lid=0x0003 guid=0x003048ffffa12591 dev=47396 Port=19

      Performance Monitor counter     : Value

      symbol_error_counter            : 0xffff (overflow)

 

I'm not sure whether those values are bad, or whether they point to any sort
of problem, but that's what I'm seeing.
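
One thing that might help interpret those counters is clearing them and then
watching whether they climb again under load, e.g. with perfquery from
infiniband-diags (the LID/port below are just the first pair from the
ibdiagnet output above):

  # Read and then reset the error counters on LID 0x0003, port 1
  perfquery -R 0x0003 1
  # ...run some traffic, then read them again
  perfquery 0x0003 1
  # A symbol_error_counter that stays at 0 suggests the overflow was
  # historical; one that keeps climbing usually points at a marginal
  # cable, connector or port.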

 

Hopefully this gives you a bit more insight into the setup and the issues.
If I can provide anything else that would help debug this issue, please let
me know.

 

From: Richard Croucher [mailto:richard at informatix-sol.com] 
Sent: Wednesday, June 30, 2010 3:12 AM
To: 'Jeff Becker'; 'Matt Breitbach'; ewg at lists.openfabrics.org
Subject: RE: [ewg] Infiniband Interoperability

 

The InfiniBand fabric knows very little about IPoIB; it is handled by the
host OS stack.  However, IPoIB does need fabric capabilities such as
multicast to work properly for ARP address resolution.

 

The problem you describe sounds similar to a situation I encountered running
multiple, incompatible SMs.

Make sure you only have a single vendor's SM.  Whilst the OFED SM build is
fine, I have found that many vendors hack their distros so that they either
ignore or always win the SM election.  Explicitly disable every SM in any
environment where you don't want one running.  Don't rely on SM priority
across different implementations.  I'd recommend running OpenSM on Linux and
disabling all others.
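
A quick cross-check, assuming infiniband-diags is installed on the nodes, is
sminfo; every host should report the same master:

  # Query the SMInfo attribute as seen from this node
  sminfo
  # All nodes should report the same "sm lid"/"sm guid" with state MASTER;
  # differing answers are a strong hint that more than one SM is active.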

 

Identifying that no SM is running is easy, since the ports don't get LIDs.
When multiple SMs are running, however, things sort of work, since the
different SMs discover which LIDs have already been allocated when they scan
the fabric.  The problem I saw was with multicast: each SM had its own
independent and different view of the multicast nodes and paths.

 

Sun did their own InfiniBand stack implementation, including an SM, and it is
completely independent of OFED.  I used it a few years ago and IPoIB
interoperated fine with Linux.

 

You don't say which InfiniBand distro you are running on Windows.

 

Linux IPoIB defaults to Connected mode.  This is an optional feature of
IPoIB.  You may want to try setting it to the mandatory datagram mode in each
of your environments.  You can disable CM mode in openib.conf on the Linux
nodes.
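
As a sketch of what that looks like on an OFED 1.5.x node (the parameter name
below is my recollection of the stock openib.conf, so check the local file
rather than taking it verbatim):

  # /etc/infiniband/openib.conf -- have IPoIB come up in datagram mode
  SET_IPOIB_CM=no
  # Or flip a running interface without restarting the stack:
  echo datagram > /sys/class/net/ib0/mode
  cat /sys/class/net/ib0/mode   # should now read "datagram"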

 

From: ewg-bounces at lists.openfabrics.org
[mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Jeff Becker
Sent: 29 June 2010 19:26
To: Matt Breitbach; ewg at lists.openfabrics.org
Subject: Re: [ewg] Infiniband Interoperability

 

Hi Matt

On 06/29/10 10:15, Matt Breitbach wrote: 

So I know that this message isn't about starting a new group.  I've actually
tried to join one of the mailing lists but it failed to sign me up.
  
You probably tried to sign up for general at lists.openfabrics.org, which
doesn't really exist anymore (although the archives are still there for
searching). It sounds like you should post your question on
ewg at lists.openfabrics.org. I'll forward it for you.

Jeff Becker
OpenFabrics Server admin



 
I'm trying to get an InfiniBand setup working in a mixed environment of
Windows, Linux, and OpenSolaris, and I'm having huge difficulty keeping the
IB network stable.  We're working mainly with IPoIB, and see drops between
the Windows and Linux/OpenSolaris systems.
 
We had a professional take a look at our configuration, and he thought that
OpenSM was configured properly, and actually had the network stable for
about 2 days, but it degraded severely after that.  We are now to the point
that most times the OpenSolaris box is unreachable from the Windows systems,
and sometimes from the Linux system (which is running OpenSM).
 
Is there any direction you could point me in for advice on this, or for some
high-end consulting?  We've invested nearly 3 months into this project only
to have a bladecenter with 4 IB-capable blades and a SuperMicro 4U server
that can't reliably communicate over the InfiniBand network.
 
-Matt Breitbach
zfsbuild.com (our storage writeup about OpenSolaris ZFS over InfiniBand)
 
 
  





