[openib-general] umad abi 2 v 3 and multicast join failed

Hal Rosenstock halr at voltaire.com
Wed May 25 20:56:38 PDT 2005


On Wed, 2005-05-25 at 19:06, Troy Benjegerdes wrote:
> I was running a crufty version of opensm (compiled from the
> roland-uverbs branch),

roland-uverbs is an orphaned branch at this point.

You should not be using this version. It is not supported and out of
date. You should use the one on the trunk. See below.

>  and I started getting these kinds of errors for
> no apparent reason:
> 
> ib0: multicast join failed for ff12:401b:ffff:0:0:0:ffff:ffff, status -22
> ib0: multicast join failed for ff12:401b:ffff:0:0:0:ffff:ffff, status -22
> 
> I'm running 2.6.11 kernels, and 'stock' modules.. I just tried
> rebuilding opensm from the latest SVN, but it apparently needs a new
> umad driver..
> 
> warn: [24878] umad_init: wrong ABI version: /sys/class/infiniband_mad/abi_version is 2 but library ABI is 3

Right. This is old OpenSM (actually old libibumad) with the latest from
OpenIB svn (past where I put the changes to support send side RMPP in).

Note that I did say the following:
user_mad: Support RMPP on send side

Note that this change will need a coordinated change to OpenSM and some
userspace/management libraries which will be done as soon as possible
once this patch is accepted.

It was followed by a patch to userspace/management which includes OpenSM
for this:
userspace/management changes to support send side RMPP
(needs change to linux-kernel/infiniband/core/user_mad.c)
ABI_VERSION is now 3
RMPP is enabled in build
SA GetTable is now supported properly (within current RMPP limitations)
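
For illustration, the check behind the umad_init warning can be sketched as below. This is not the libibumad source, only a minimal sketch: it assumes the library compares the integer in /sys/class/infiniband_mad/abi_version (the path from the warning) against a compiled-in constant, here 3. The names read_kernel_abi and check_abi are hypothetical.

```python
from pathlib import Path

# Assumption: the userspace library was built with ABI_VERSION 3,
# as stated in the userspace/management patch description.
LIBRARY_ABI_VERSION = 3

# Path taken from the warning message quoted in this thread.
ABI_PATH = Path("/sys/class/infiniband_mad/abi_version")


def read_kernel_abi(path=ABI_PATH):
    """Return the ABI version exported by the kernel, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (module not loaded) or content not an integer.
        return None


def check_abi(kernel_abi, library_abi=LIBRARY_ABI_VERSION):
    """Describe whether the kernel and library ABI versions agree."""
    if kernel_abi is None:
        return "cannot read kernel ABI version (is ib_umad loaded?)"
    if kernel_abi != library_abi:
        return (f"wrong ABI version: kernel is {kernel_abi} "
                f"but library ABI is {library_abi}")
    return "ok"


if __name__ == "__main__":
    print(check_abi(read_kernel_abi()))
```

A mismatch in either direction fails the check, so the kernel module and the userspace library have to come from matching revisions.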

> I suppose I need to rebuild the kernel ib_umad (and maybe everything
> else for good measure)..

No. It's the other way around. You need to rebuild OpenSM.

>  And if I do that, should I expect OpenSM to
> work better regarding the multicast issue?
> 
> Also, what will happen if I run opensm on two different nodes? Will they
> fight, or will one of them figure out how to be a backup slave SM if the
> first goes down?

SM mastership should work. You should be able to run any number of
OpenSMs in a subnet and one of them will become master. [This is a
separate issue from the ABI version change.]
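
As a rough sketch of how one SM out of several ends up as master: the code below assumes the election rule from the InfiniBand architecture specification, where the SM with the highest priority wins and ties go to the lowest port GUID. It is not OpenSM code. The SubnetManager type, the elect_master function, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubnetManager:
    name: str       # hypothetical label, for illustration only
    priority: int   # SM priority; higher is preferred
    port_guid: int  # port GUID; lower wins when priorities are equal


def elect_master(candidates):
    """Pick the master: highest priority first, then lowest port GUID."""
    return min(candidates, key=lambda sm: (-sm.priority, sm.port_guid))


if __name__ == "__main__":
    sms = [
        SubnetManager("node-a", priority=1, port_guid=0x10),
        SubnetManager("node-b", priority=5, port_guid=0x30),
        SubnetManager("node-c", priority=5, port_guid=0x20),
    ]
    print(elect_master(sms).name)
```

The remaining SMs stay in standby, and the same rule can be applied again to the survivors if the master goes away.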

-- Hal



