[openib-general] IPv6 and IPoIB scalability issue

Todd Rimmer todd.rimmer at qlogic.com
Thu Nov 30 14:26:14 PST 2006


There is a potential limitation in IB which can affect IPv6 operation on
larger clusters.

IPv6 defines that each node will have a Solicited Node Multicast
address.  This address is constructed from the low-order 24 bits of the
node's IPv6 unicast address, so in practice it is unique per node (see
RFC 2373 for more details).
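
To make the construction concrete, here is a minimal sketch of how a
Solicited Node Multicast address is derived: the fixed prefix
ff02::1:ff00:0/104 plus the low-order 24 bits of the unicast address.
The function name and the sample address are made up for illustration.

```python
import ipaddress


def solicited_node_multicast(unicast: str) -> ipaddress.IPv6Address:
    """Return ff02::1:ffXX:XXXX built from the low 24 bits of `unicast`."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)


# Hypothetical link-local address of one node.
print(solicited_node_multicast("fe80::202:c903:12:3456"))
```

Two nodes only share a group if their unicast addresses agree in the
low 24 bits, which is why a large cluster ends up with roughly one
group per node.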

IP over IB defines that IPv6 multicast addresses map to IB multicast
GIDs in a one-to-one manner.
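
As a rough sketch of that mapping, based on my reading of the IPoIB
multicast mapping in RFC 4391 (treat the layout as an assumption and
check it against the spec): 0xff, flags/scope, the IPv6 signature
0x601b, the P_Key, then the low 80 bits of the IPv6 multicast address.
Taking the scope from the IPv6 address and defaulting the P_Key to
0xffff are simplifications of mine, and the function name is
hypothetical.

```python
import ipaddress

IPOIB_IPV6_SIGNATURE = 0x601B  # IPv6 signature used in IPoIB multicast GIDs


def ipv6_mcast_to_mgid(mcast: str, pkey: int = 0xFFFF) -> bytes:
    """Build a 16-byte IB multicast GID for an IPv6 multicast address."""
    addr = ipaddress.IPv6Address(mcast)
    if not addr.is_multicast:
        raise ValueError(f"{mcast} is not an IPv6 multicast address")
    raw = addr.packed
    scope = raw[1] & 0x0F  # assumption: reuse the IPv6 scope nibble
    header = (
        bytes([0xFF, 0x10 | scope])
        + IPOIB_IPV6_SIGNATURE.to_bytes(2, "big")
        + pkey.to_bytes(2, "big")
    )
    return header + raw[6:]  # low 80 bits of the IPv6 address


print(ipv6_mcast_to_mgid("ff02::1:ff12:3456").hex())
```

Because the low 80 bits are carried over unchanged, every distinct
Solicited Node Multicast address yields a distinct MGID, and therefore
its own multicast group in the fabric.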

IB defines a multicast address space limit of 4095 LIDs.

Popular IB switches have a smaller limit, typically 1024 multicast
forwarding table entries.

This means that if IPv6 is enabled, a cluster of ~1024 nodes may run out
of multicast entries in the switches and may encounter problems when
running IPv6 traffic.  (For example, if the Solicited Node Multicast
addresses consume all the multicast switch entries, other
application-specific multicast groups or, worse yet, permanent multicast
addresses could fail to be created.)
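
The arithmetic can be sketched as follows, assuming one Solicited Node
group per node and a fixed number of other groups.  The 1024-entry
table size comes from the text above; the count of 32 other groups is
an invented placeholder, and the function names are hypothetical.

```python
def multicast_groups_needed(nodes: int, other_groups: int = 0) -> int:
    """One Solicited Node group per node, plus everything else."""
    return nodes + other_groups


def fits_in_switch(nodes: int, table_entries: int = 1024,
                   other_groups: int = 0) -> bool:
    """True if all groups fit in the switch multicast forwarding table."""
    return multicast_groups_needed(nodes, other_groups) <= table_entries


# 32 "other" groups is a made-up figure for illustration only.
for n in (512, 1000, 1024):
    print(n, fits_in_switch(n, other_groups=32))
```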

Proposed solution:
- add an IPoIB configuration parameter.  This parameter could redirect
the Solicited Node Multicast traffic to the IPv6 All Nodes multicast
address (IB GID 0xff01601B.....0000001)
- on clusters near 1024 nodes (there are a few other standard multicast
addresses, not to mention application-specific multicast for IPv4 and
IPv6), this parameter would need to be enabled.
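
A minimal sketch of what such a parameter would do, modelled at the IP
address level rather than in the IPoIB driver.  The flag name
`redirect_solicited_node` and both helper functions are hypothetical;
ff02::1 is the link-local All Nodes address.

```python
import ipaddress

ALL_NODES = ipaddress.IPv6Address("ff02::1")
SOLICITED_NODE_NET = ipaddress.IPv6Network("ff02::1:ff00:0/104")


def effective_group(addr: ipaddress.IPv6Address,
                    redirect_solicited_node: bool) -> ipaddress.IPv6Address:
    """Group actually joined/sent to once the (hypothetical) flag is applied."""
    if redirect_solicited_node and addr in SOLICITED_NODE_NET:
        return ALL_NODES
    return addr


def groups_in_use(node_addrs, redirect_solicited_node: bool) -> set:
    """Distinct multicast groups needed for All Nodes plus Solicited Node."""
    groups = {ALL_NODES}
    base = int(SOLICITED_NODE_NET.network_address)
    for unicast in node_addrs:
        low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
        solicited = ipaddress.IPv6Address(base | low24)
        groups.add(effective_group(solicited, redirect_solicited_node))
    return groups


# 1000 made-up node addresses with distinct low-order bits.
nodes = [f"fe80::{i:x}" for i in range(1, 1001)]
print(len(groups_in_use(nodes, False)), len(groups_in_use(nodes, True)))
```

With the flag off, each node contributes its own group; with it on,
all Solicited Node traffic shares the single All Nodes group.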

While this means that Solicited Node Multicast becomes a much larger
scope multicast, it will greatly reduce the Multicast member record
stress on the SA and switches.  In general the IP stack will filter
undesired inbound multicast packets, so applications will continue to
function properly.

While not ideal, this approach solves the problem easily.  Alternatives
based on multicast member record join results or joining/creating the
group only if really needed are possible but can be more complex and
have subtle complications and issues.

Making this configurable would allow clusters where Solicited Node
Multicast efficiency is important to choose the appropriate trade-off
in operation.

Thoughts?

Todd Rimmer
Chief Architect         System Interconnect Group, QLogic
Voice: 610-233-4852     Fax: 610-233-4777
Todd.Rimmer at QLogic.com  www.QLogic.com
 

> -----Original Message-----
> From: openib-general-bounces at openib.org [mailto:openib-general-
> bounces at openib.org] On Behalf Of Sean Hubbell
> Sent: Wednesday, November 29, 2006 2:11 PM
> To: Eitan Zahavi
> Cc: openib-general at openib.org
> Subject: Re: [openib-general] SDP Protocol Issue Was: Configuration of sdp
> 
> Sounds like the problem I am having, as I downloaded the latest and
> tried it but still get error 97, which is a protocol family error. I'll
> upgrade to the latest ib modules, but to do that I have to upgrade my
> kernel first...
> 
> Thanks...
> 
> Sean
> 
> Eitan Zahavi wrote:
> > Hi Sean,
> >
> > Now I remember that if you go back far enough with SDP (I think MST
> > should know how far) the Address Family in the address given to SDP
> > used to be required to be of type AF_INET_SDP. The new libsdp code does
> > not do that and works with the new SDP requirement to only create the
> > socket with AF_INET_SDP and later use AF_INET for the address (used for
> > bind/connect).
> >
> > So you should try and move to newest SDP too.
> >
> > Eitan Zahavi
> > Senior Engineering Director, Software Architect
> > Mellanox Technologies LTD
> > Tel:+972-4-9097208
> > Fax:+972-4-9593245
> > P.O. Box 586 Yokneam 20692 ISRAEL
> >
> >
> >
> >> -----Original Message-----
> >> From: Sean Hubbell [mailto:shubbell at dbresearch.net]
> >> Sent: Wednesday, November 29, 2006 7:53 PM
> >> To: Eitan Zahavi; openib-general at openib.org
> >> Subject: Re: [openib-general] SDP Protocol Issue Was: Configuration of sdp
> >>
> >> Eitan Zahavi wrote:
> >>
> >>> Sean Hubbell wrote:
> >>>
> >>>> Ok, after further investigation, here is what I have found:
> >>>>
> >>>> socket failed: Address family not supported by protocol
> >>>>
> >>>>
> >>> Do you have sdp loaded? Please try:
> >>> lsmod | grep ib_sdp
> >>>
> >> Yes.
> >>
> >>>> I am assuming that when sdp "exchanges" the AF_INET sockets for AF_SDP
> >>>> sockets, the current code that I am running has an issue or I
> >>>> don't know how to configure things...
> >>>>
> >>>> I am running libsdp v. 0.9.0 x86_64 as the rpm downloaded from
> >>>> mirror.centos.org, running kernel 2.6.9-42.0.3.plus.c4smp
> >>>>
> >>>>
> >>>>
> >>> I would recommend upgrading libsdp only to the one available from the
> >>> git tree at git clone
> >>> git://staging.openfabrics.org/git/~eitan/libsdp.git
> >>>
> >>> or still from SVN:
> >>> https://openib.org/svn/gen2/trunk/src/userspace/libsdp
> >>>
> >>>
> >>>> Also, my /etc/libsdp.conf file has the following:
> >>>>
> >>>> log min-level 1 destination syslog
> >>>> #use both server * *:*
> >>>> #use both client * *:*
> >>>> use sdp client * 10.10.0.0/16:*
> >>>> use sdp server * 10.10.0.0/16:*
> >>>>
> >>>> and seems to break all of my TCP connections, regardless of whether the
> >>>> interface is 10.10.*. Is my configuration correct?
> >>>>
> >>>>
> >>> The "both" would have been better in preserving TCP based services as
> >>> it falls back to use TCP if SDP does not work.
> >>>
> >> I tried that originally and received the same results (things did not
> >> fall back to using tcp), so I attempted to isolate this to my infiniband
> >> connection... I'll change it to both.
> >>
> >>
> >>>> What would you recommend my next step to be? Upgrading my kernel as
> >>>> well as libsdp to the latest? Also, currently without using sdp I am
> >>>> getting 941 MBps; what do you get when you run using sdp? Karun
> >>>> mentioned that he is getting double the performance, which is where I
> >>>> am at: needing better performance from my apps.
> >>>>
> >>>>
> >>> Depends on your apps. If you have many connections of large messages
> >>> SDP is a clear win.
> >>>
> >> Yes, so it looks like this would be a valid upgrade. I'll try the git
> >> tree.
> >>
> >> Thanks Eitan,
> >>
> >> Sean
> >>
> >
> >
> >
> 
> 