[ofa-general] IPoIB, OFED 1.2.5, and multicast groups.

Hal Rosenstock hrosenstock at xsigo.com
Mon Jan 14 09:50:56 PST 2008


On Sun, 2008-01-13 at 10:05 +0200, Eli Cohen wrote:
> IPoIB does not initiate a join to a multicast group (except for the
> broadcast group).

IPv6 does indeed initiate such joins on an IPoIB interface, for
solicited-node multicast.
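
To make the mapping concrete, here is a minimal sketch of how a
solicited-node address would end up as one of the MGIDs in the report
quoted below. It assumes the RFC 4391 layout (0xff, flags 0x1, scope,
signature 0x601b for IPv6, P_Key, then the low 80 bits of the multicast
address) and the default P_Key 0xffff. The function names and the
unicast address are made up for illustration; only the resulting MGID
values appear in the quoted output.

```python
import ipaddress

IPOIB_SIGNATURE_IPV6 = 0x601B  # RFC 4391 signature for IPv6 over IB
DEFAULT_PKEY = 0xFFFF          # assumption: default partition
LINK_LOCAL_SCOPE = 0x2


def solicited_node_address(unicast):
    """Return ff02::1:ffXX:XXXX built from the low 24 bits of a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def ipoib_mgid(mcast, pkey=DEFAULT_PKEY, scope=LINK_LOCAL_SCOPE):
    """Map an IPv6 multicast address to a 128-bit IPoIB MGID (as an int)."""
    group_id = int(ipaddress.IPv6Address(mcast)) & ((1 << 80) - 1)
    prefix = (
        (0xFF << 120)                    # multicast GID marker
        | (0x1 << 116)                   # flags: transient group
        | (scope << 112)
        | (IPOIB_SIGNATURE_IPV6 << 96)
        | (pkey << 80)
    )
    return prefix | group_id


def format_mgid(mgid):
    """Print the MGID as two 64-bit halves, like the report quoted below."""
    return "0x%016x : 0x%016x" % (mgid >> 64, mgid & ((1 << 64) - 1))


if __name__ == "__main__":
    # Hypothetical link-local address whose low 24 bits are 0x2265ed.
    sn = solicited_node_address("fe80::2:c903:22:65ed")
    print(sn, "->", format_mgid(ipoib_mgid(sn)))
```

Since the low 24 bits normally differ per port, each node tends to get
its own solicited-node group. If the SM allocates one MLID per MGID, as
the log below suggests, 1151 nodes would need more than the 1024 MLIDs
reported as available.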

>  This comes from routing protocols or user space
> sockets. Do you run processes that use many different multicast groups?
> 
> 
> On Fri, 2008-01-11 at 19:36 -0800, Ira Weiny wrote:
> > I don't really understand the inner workings of IPoIB, so forgive me if this is a
> > really stupid question, but:
> > 
> >    Is it a bug that there is a Multicast group created for every node in our
> >    clusters?
> > 
> > If not a bug why is this done?  We just tried to boot on a 1151 node cluster
> > and opensm is complaining there are not enough multicast groups.
> > 
> >    Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> >    Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> >    Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> >    Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> > 
> > 
> > Here is the output from my small test cluster:  (ibnodesinmcast uses saquery a
> > couple of times to print this nice report.)
> > 
> > 
> >    19:17:24 > whatsup
> >    up:   9: wopr[0-7],wopri
> >    down: 0:
> >    root@wopri:/tftpboot/images
> >    19:25:03 > ibnodesinmcast -g
> >    0xC000 (0xff12401bffff0000 : 0x00000000ffffffff)
> >       In  9: wopr[0-7],wopri
> >       Out 0: 0
> >    0xC001 (0xff12401bffff0000 : 0x0000000000000001)
> >       In  9: wopr[0-7],wopri
> >       Out 0: 0
> >    0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed)
> >       In  1: wopr3
> >       Out 8: wopr[0-2,4-7],wopri
> >    0xC003 (0xff12601bffff0000 : 0x0000000000000001)
> >       In  9: wopr[0-7],wopri
> >       Out 0: 0
> >    0xC004 (0xff12601bffff0000 : 0x00000001ff222729)
> >       In  1: wopr4
> >       Out 8: wopr[0-3,5-7],wopri
> >    0xC005 (0xff12601bffff0000 : 0x00000001ff219e65)
> >       In  1: wopri
> >       Out 8: wopr[0-7]
> >    0xC006 (0xff12601bffff0000 : 0x00000001ff00232d)
> >       In  1: wopr6
> >       Out 8: wopr[0-5,7],wopri
> >    0xC007 (0xff12601bffff0000 : 0x00000001ff002325)
> >       In  1: wopr7
> >       Out 8: wopr[0-6],wopri
> >    0xC008 (0xff12601bffff0000 : 0x00000001ff228d35)
> >       In  1: wopr1
> >       Out 8: wopr[0,2-7],wopri
> >    0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1)
> >       In  1: wopr2
> >       Out 8: wopr[0-1,3-7],wopri
> >    0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1)
> >       In  1: wopr0
> >       Out 8: wopr[1-7],wopri
> >    0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9)
> >       In  1: wopr5
> >       Out 8: wopr[0-4,6-7],wopri
> > 
> > 
> > Each of these MGIDs with the prefix (0xff12601bffff0000) has just one node in
> > it and represents an IPv6 address.  Could you turn off IPv6 with the latest
> > IPoIB?
> > 
> > In a bind,
> > Ira
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > 
> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


