[ofa-general] Re: openSM: Different IB MTUs

Hal Rosenstock hal.rosenstock at gmail.com
Thu Jul 26 05:58:48 PDT 2007


On 7/26/07, Eitan Zahavi <eitan at mellanox.co.il> wrote:
>
>  I propose that when there is no MTU in the partition policy file
> OpenSM use a configurable default from: /etc/cache/opensm/opensm.opt.
>

That would make this the default rather than 2K. IMO the configurable default
should apply only when some "special" unused MTU value is set in the partition
config.

-- Hal
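
The two behaviours discussed here can be compared in a minimal sketch. It is not OpenSM code. It assumes the standard IB MTU encoding (1 = 256 bytes up to 5 = 4096 bytes), takes `mcg_default_mtu` from the proposed option name quoted in this message, and uses 0 as a purely hypothetical "special" unused MTU value. The function name `resolve_group_mtu` is also made up for illustration.

```python
# IB MTU enumeration: code -> payload size in bytes
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

BUILTIN_DEFAULT_MTU = 4     # 2K, the existing default mentioned above
USE_CONFIGURED_DEFAULT = 0  # hypothetical "special" unused MTU code


def resolve_group_mtu(partition_mtu, mcg_default_mtu):
    """Return the IB MTU code to use for a partition's multicast groups.

    partition_mtu   -- MTU code from the partition config, or None if absent
    mcg_default_mtu -- value of the proposed mcg_default_mtu option
    """
    if partition_mtu is None:
        # No MTU in the partition config: keep the existing 2K default.
        return BUILTIN_DEFAULT_MTU
    if partition_mtu == USE_CONFIGURED_DEFAULT:
        # The special value asks for the configurable default.
        mtu = mcg_default_mtu
    else:
        mtu = partition_mtu
    if mtu not in IB_MTU_BYTES:
        raise ValueError(f"invalid IB MTU code: {mtu}")
    return mtu
```

With this ordering, existing partition configs that omit the MTU keep getting 2K, and only a partition that explicitly sets the special value picks up `mcg_default_mtu`.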

> Something like:
> # The default MTU to be used for IPoIB and other MCGs when the
> # partition-policy does not provide an exact value. The default is the
> # lowest possible MTU
> mcg_default_mtu 1
>
> Eitan Zahavi
> Senior Engineering Director, Software Architect
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
>
>  ------------------------------
> *From:* Shirley Ma [mailto:xma at us.ibm.com]
> *Sent:* Wednesday, July 25, 2007 10:45 PM
> *To:* Eitan Zahavi
> *Cc:* general at lists.openfabrics.org; Hal Rosenstock
> *Subject:* RE: [ofa-general] Re: openSM: Different IB MTUs
>
>
>
> Hello Eitan, Hal,
>
> Thanks. It's good that openSM has configuration options to set these
> attributes for MC groups. Would it be a good idea to add the following to
> openSM: when no MTU is defined in the configuration file, the SM picks the
> smallest link MTU in the fabric by default? MTU is unlike rate; a slower
> rate might indicate a cabling problem. So using the smallest link MTU in the
> fabric might not be a bad default for MC. The reason for my request is that
> when creating an IP multicast group, MTU is not an attribute of the group.
> When mapping IP multicast to IB multicast, the IB multicast join might fail
> because of different IB link MTU sizes in the group, but the IP multicast
> group will be created successfully without knowing about the failure. If the
> admin sets the MTU in the configuration file, the admin would know about this
> failure. Otherwise, admins/users could spend too much time debugging their
> broken multicast applications.
>
> Thanks
> Shirley Ma
>
> "Eitan Zahavi" <eitan at mellanox.co.il>
> 07/25/07 12:25 PM
>
> To: "Hal Rosenstock" <hal.rosenstock at gmail.com>, Shirley Ma/Beaverton/IBM at IBMUS
> cc: <general at lists.openfabrics.org>
> Subject: RE: [ofa-general] Re: openSM: Different IB MTUs
>
> Hi Shirley,
>
> I think I understand where your question comes from...
> Many have issues with heterogeneous fabrics where not all nodes have the
> same MTU or speed,
> especially since IPoIB relies on all nodes joining the broadcast group.
>
> The term "join" for multicast groups is a little overloaded.
> If a node joins an existing MC group, it has to have a rate (speed * width)
> >= MCG.rate and support MTU >= MCG.MTU; otherwise it is denied.
> If the join is actually a "create", the node has to provide the rate and
> MTU, which define the MCG values.
>
> To allow the administrator to control the IPoIB MCGs' MTU and rate, OpenSM
> provides the means to control these
> values per partition. See doc/partition-config.doc.
> Still, the administrator should know the lowest MTU and rate of the nodes
> expected to join the IPoIB subnet.
> The tradeoff is in the hands of the administrator, who can set a value
> that will prevent slow nodes from joining the group,
> or assign a low value that will fit all nodes but slow down communication
> ...
>
> *EZ*
>
> *Eitan Zahavi*
> Senior Engineering Director, Software Architect
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
>
>
> ------------------------------
> *From:* general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org]
> *On Behalf Of* Hal Rosenstock
> *Sent:* Wednesday, July 25, 2007 10:01 PM
> *To:* Shirley Ma
> *Cc:* general at lists.openfabrics.org
> *Subject:* [ofa-general] Re: openSM: Different IB MTUs
>
> Shirley,
>
> On 7/25/07, Shirley Ma <xma at us.ibm.com> wrote:
>
>    Hal,
>
>    Thanks for your prompt reply. I am asking how openSM handles
>    different link MTUs when setting the SA MCMemberRecord MTU. For example,
>    suppose some links have a 2K MTU and some have a 1K MTU. When IPoIB is
>    enabled, how does the SM decide the MCMemberRecord MTU of the IPoIB
>    broadcast group? And when an IB multicast group is first created from a
>    2K MTU node, which PMTU value is attached to this group's MCMemberRecord
>    MTU?
>
>
>
> MCMemberRecord MTU gets the group MTU (set when the group is created). The
> group is created either by the first joiner that supplies sufficient
> components or by preconfiguration (and the MTU can be set in the config).
> If a joiner has insufficient MTU for the group, it is denied.
>
> -- Hal
>
>
>    Thanks
>    Shirley Ma
>
>    "Hal Rosenstock" <hal.rosenstock at gmail.com>
>    07/25/07 10:57 AM
>
>    To: Shirley Ma/Beaverton/IBM at IBMUS
>    cc: general at lists.openfabrics.org
>    Subject: Re: openSM: Different IB MTUs
>
>    Shirley,
>
>    On 7/25/07, Shirley Ma <xma at us.ibm.com> wrote:
>       Hello Hal,
>
>          How does openSM handle CAs with different MTUs in the
>          same subnet? For example, IPoIB broadcast group MTU, IB multicast group
>          PMTU? Does openSM pick up the smallest MTU in the subnet?
>
>
>    Are you asking about link MTU, SA PathRecord/MultiPathRecord MTU, SA
>    MCMemberRecord MTU, or all of these ?
>
>    -- Hal
>       Thanks
>          Shirley Ma
>
>
>
>
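
The join and create behaviour described in the quoted messages above can be illustrated with a small model. This is a sketch under assumptions, not OpenSM's implementation. The `Port` and `McGroup` types and the function names are hypothetical, MTU uses the IB code (1 = 256 bytes up to 5 = 4096 bytes), and rate is shown in plain Gb/s for readability rather than as the encoded value carried in a real MCMemberRecord.

```python
from dataclasses import dataclass


@dataclass
class Port:
    mtu: int   # IB MTU code (1=256 ... 5=4096 bytes)
    rate: int  # link rate in Gb/s (speed * width), illustrative units


@dataclass
class McGroup:
    mtu: int
    rate: int


def join_or_create(group, port):
    """Return the group after a successful join, or raise if denied."""
    if group is None:
        # The first joiner "creates" the group; its values define the group.
        return McGroup(mtu=port.mtu, rate=port.rate)
    if port.mtu < group.mtu or port.rate < group.rate:
        raise PermissionError("join denied: port cannot meet group MTU/rate")
    return group


def smallest_link_mtu(ports):
    """Smallest MTU code among the given ports (the suggested default)."""
    return min(p.mtu for p in ports)


# A 2K port creates the group first, so a later 1K port is denied.
ports = [Port(mtu=4, rate=20), Port(mtu=3, rate=10)]
group = join_or_create(None, ports[0])
try:
    join_or_create(group, ports[1])
    denied = False
except PermissionError:
    denied = True

# Preconfiguring the group with the lowest values lets every port join.
low_group = McGroup(mtu=smallest_link_mtu(ports),
                    rate=min(p.rate for p in ports))
all_joined = all(join_or_create(low_group, p) is low_group for p in ports)
```

The second half shows the tradeoff raised in the thread: a group preconfigured with the lowest MTU and rate admits every port, at the cost of running the whole group at those lower values.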