[ofa-general] opensm: bad multicast forwarding table entries

akepner at sgi.com akepner at sgi.com
Wed Nov 12 15:54:31 PST 2008


On Wed, Nov 12, 2008 at 06:27:29PM -0500, Hal Rosenstock wrote:

Thanks for having a look at this, Hal.

> On Wed, Nov 12, 2008 at 5:18 PM,  <akepner at sgi.com> wrote:
> > .....
> > -I- Multicast Group:0xC069 has:2 switches and:2 HCAs
> > -E- Disconnected switch:S0800690000002e51/U1 in group:0xC069
> > -E- Disconnected HCA:r4i2n10/U1
> 
> Is it really an error to have a multicast group like this ? 

Well, 'ibdiagnet -r' reports it as an error.

> ... I agree
> it's not needed to route if there's only 1 member port.
> 
> Can you describe the scenario under which this occurs ? Are things
> steady state or are there changes going on in the subnet ? Any errors
> in the opensm log ?

As far as I know, this is steady state behavior. I'll check whether
opensm is logging any errors.

> ..... 
> > So far, so good. But we also have r4i2n10, connected to the switch with
> > lid 1533 port 7:
> >
> > switchguid=0x800690000002e50(800690000002e50)
> > Switch  24 "S-0800690000002e50"         # "MT47396 Infiniscale-III Mellanox Technologies" base port 0 lid 1533 lmc 0
> > ......
> > [7]     "H-003048c2438a0000"[1](3048c2438a0001)                 # "r4i2n10 HCA-1" lid 771 4xDDR
> >
> > with this mft entry:
> >
> > Multicast mlids [0xc000-0xc3ff] of switch Lid 1533 guid 0x0800690000002e50 (MT47396 Infiniscale-III Mellanox Technologies):
> >            0                   1                   2
> >     Ports: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
> >  MLid
> > .....
> > 0xc069                    x
> >
> > Any idea why "r4i2n10", with PortGid fe80::3048c2438a0001 would have a
> > mft entry for the multicast group with MGID ff12601bffff::1ff26d289?
> 
> The MFT entry is based on an MLID and not the MGID. What does saquery
> -g show ? Does it show one or more than one MGID with an MLID of
> 0xc069 ? 

Will also try to get this information.
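
Once the (MGID, MLID) pairs are in hand, answering the question above is
mechanical. A minimal sketch in Python, assuming the pairs have already
been pulled out of the saquery -g output by hand or by some parser (the
extraction is not shown here). The function names are mine, and the two
extra MGIDs in the sample input are invented purely to exercise the check:

```python
from collections import defaultdict


def mgids_by_mlid(pairs):
    """Group MGIDs by the MLID they are mapped to."""
    groups = defaultdict(set)
    for mgid, mlid in pairs:
        groups[mlid].add(mgid.lower())
    return groups


def shared_mlids(pairs):
    """Return {mlid: sorted MGIDs} for every MLID used by more than one MGID."""
    return {
        mlid: sorted(mgids)
        for mlid, mgids in mgids_by_mlid(pairs).items()
        if len(mgids) > 1
    }


# Hypothetical input: the first MGID is the one from this thread,
# the other two are made up for illustration only.
example_pairs = [
    ("ff12601bffff::1ff26d289", 0xC069),
    ("ff12:401b:ffff::1", 0xC000),
    ("ff12:401b:ffff::2", 0xC069),
]

if __name__ == "__main__":
    for mlid, mgids in shared_mlids(example_pairs).items():
        print(f"MLID 0x{mlid:04x} is shared by: {', '.join(mgids)}")
```

An empty result would mean every MLID maps to exactly one MGID, which
rules out the shared-MLID explanation.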

> Also, does saquery -m 0xc069 show one member ?

Yes, only one member.
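
With a single member, the kind of check that would flag the entry on the
switch with lid 1533 could be sketched as follows: for each port marked in
a switch's MFT row for the MLID, look at what is attached to that port and
report it if it is an HCA port that never joined the group. This is only a
sketch in Python, not how ibdiagnet or opensm does it. The data structures
and the member LID are hypothetical; only port 7 and HCA lid 771 come from
the dumps quoted here.

```python
def stray_mft_ports(mft_ports, attached_hca_lid, member_lids):
    """Return switch ports set in the MFT that lead to a non-member HCA.

    mft_ports        -- set of port numbers marked for the MLID on one switch
    attached_hca_lid -- dict: switch port -> LID of the HCA on that port
                        (ports leading to other switches are simply absent)
    member_lids      -- set of LIDs that joined the group
    """
    return sorted(
        port
        for port in mft_ports
        if port in attached_hca_lid
        and attached_hca_lid[port] not in member_lids
    )


# Switch lid 1533: the MFT row for 0xc069 has port 7 marked,
# and port 7 connects to r4i2n10 (lid 771).
mft_ports = {7}
attached = {7: 771}

# Hypothetical: the single group member is some other LID (value invented).
members = {100}

if __name__ == "__main__":
    print(stray_mft_ports(mft_ports, attached, members))
```

Ports that lead to other switches are deliberately skipped, since a
switch-to-switch entry can be legitimate even when no local HCA is a
member.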

> 
> I don't think OpenSM does this but if the multicast groups are
> disjoint, the same MLID could be used for two different groups (MGIDs)
> in different parts of the subnet.
> 

Oh, that'd be confusing.

> Sasha is probably best to comment on what has changed in this area. Is
> it possible to try this with the latest OpenSM to see if this has been
> fixed ?
> 

I doubt that this alone would be important enough to get the 
customer to try upgrading opensm, but I can let them know it's 
an option - especially if there's good reason to think it'd 
help.

-- 
Arthur



