[ofa-general] opensm: bad multicast forwarding table entries
akepner at sgi.com
Wed Nov 12 15:54:31 PST 2008
On Wed, Nov 12, 2008 at 06:27:29PM -0500, Hal Rosenstock wrote:
Thanks for having a look at this, Hal.
> On Wed, Nov 12, 2008 at 5:18 PM, <akepner at sgi.com> wrote:
> > .....
> > -I- Multicast Group:0xC069 has:2 switches and:2 HCAs
> > -E- Disconnected switch:S0800690000002e51/U1 in group:0xC069
> > -E- Disconnected HCA:r4i2n10/U1
>
> Is it really an error to have a multicast group like this ?
Well, 'ibdiagnet -r' reports it as an error.
> ... I agree
> it's not needed to route if there's only 1 member port.
>
> Can you describe the scenario under which this occurs ? Are things
> steady state or are there changes going on in the subnet ? Any errors
> in the opensm log ?
As far as I know, this is steady state behavior. I'll check whether
opensm is logging any errors.
> .....
> > So far, so good. But we also have r4i2n10, connected to the switch with
> > lid 1533 port 7:
> >
> > switchguid=0x800690000002e50(800690000002e50)
> > Switch 24 "S-0800690000002e50" # "MT47396 Infiniscale-III Mellanox Technologies" base port 0 lid 1533 lmc 0
> > ......
> > [7] "H-003048c2438a0000"[1](3048c2438a0001) # "r4i2n10 HCA-1" lid 771 4xDDR
> >
> > with this mft entry:
> >
> > Multicast mlids [0xc000-0xc3ff] of switch Lid 1533 guid 0x0800690000002e50 (MT47396 Infiniscale-III Mellanox Technologies):
> >          0                   1                   2
> > Ports:   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
> > MLid
> > .....
> > 0xc069                 x
> >
> > Any idea why "r4i2n10", with PortGid fe80::3048c2438a0001 would have a
> > mft entry for the multicast group with MGID ff12601bffff::1ff26d289?
>
> The MFT entry is based on an MLID and not the MGID. What does saquery
> -g show ? Does it show one or more than one MGID with an MLID of
> 0xc069 ?
Will also try to get this information.
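
For reference, the check Hal is asking about (does more than one MGID
map to the same MLID?) could be sketched roughly as below. This assumes
the (MGID, MLID) pairs have already been pulled out of the saquery -g
output by hand; the function names and the sample records are
hypothetical placeholders, not real output from this subnet.

```python
from collections import defaultdict


def mgids_by_mlid(records):
    """Group MGIDs by MLID; records is an iterable of (mgid, mlid) pairs."""
    groups = defaultdict(set)
    for mgid, mlid in records:
        groups[mlid].add(mgid)
    return groups


def shared_mlids(records):
    """Return {mlid: sorted MGIDs} for every MLID used by more than one MGID."""
    return {
        mlid: sorted(mgids)
        for mlid, mgids in mgids_by_mlid(records).items()
        if len(mgids) > 1
    }


# Hypothetical placeholder records, for illustration only.
records = [
    ("MGID-A", 0xC069),
    ("MGID-B", 0xC000),
    ("MGID-C", 0xC069),
]

for mlid, mgids in shared_mlids(records).items():
    print(hex(mlid), mgids)
```

If this prints nothing, each MLID maps to exactly one MGID and the
shared-MLID explanation can be ruled out.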
> Also, does saquery -m 0xc069 show one member ?
Yes, only one member.
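
The condition under discussion (forwarding ports set for an MLID whose
group has only one member) could be expressed roughly as below. This is
only a sketch: it assumes the MFT row is available as a plain integer
port mask in which bit N stands for port N, and it takes at face value
the point made above that a single-member group needs no routing. The
helper names and the mask value are hypothetical, not part of opensm or
ibdiagnet.

```python
def ports_in_mask(mask):
    """Return the port numbers whose bits are set in an integer port mask."""
    return [p for p in range(mask.bit_length()) if (mask >> p) & 1]


def redundant_entry(member_count, mask):
    """True if any port is set for a group that has at most one member."""
    return member_count <= 1 and mask != 0


# Hypothetical mask with only port 7 set, for illustration.
mask = 1 << 7

print(ports_in_mask(mask))
print(redundant_entry(member_count=1, mask=mask))
```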
>
> I don't think OpenSM does this but if the multicast groups are
> disjoint, the same MLID could be used for two different groups (MGIDs)
> in different parts of the subnet.
>
Oh, that'd be confusing.
> Sasha is probably best to comment on what has changed in this area. Is
> it possible to try this with the latest OpenSM to see if this has been
> fixed ?
>
I doubt that this alone would be important enough to get the
customer to try upgrading opensm, but I can let them know it's
an option - especially if there's good reason to think it'd
help.
--
Arthur