<html><body>
<p>Eitan,<br>
<br>
That's a good approach to address the issue.<br>
<br>
thanks<br>
Shirley Ma<br>
<br>
"Eitan Zahavi" &lt;eitan@mellanox.co.il&gt;<br>
<br>
<br>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td style="background-image:url(cid:2__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com); background-repeat: no-repeat; " width="40%">
<ul>
<ul>
<ul>
<ul><b><font size="2">"Eitan Zahavi" &lt;eitan@mellanox.co.il&gt;</font></b><font size="2"> </font>
<p><font size="2">07/25/07 11:00 PM</font></ul>
</ul>
</ul>
</ul>
</td><td width="60%">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="1%"><img width="58" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<div align="right"><font size="2">To</font></div></td><td width="100%"><img width="1" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<font size="2">Shirley Ma/Beaverton/IBM@IBMUS</font></td></tr>
<tr valign="top"><td width="1%"><img width="58" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<div align="right"><font size="2">cc</font></div></td><td width="100%"><img width="1" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<font size="2">&lt;general@lists.openfabrics.org&gt;, "Hal Rosenstock" &lt;hal.rosenstock@gmail.com&gt;</font></td></tr>
<tr valign="top"><td width="1%"><img width="58" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<div align="right"><font size="2">Subject</font></div></td><td width="100%"><img width="1" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""><br>
<font size="2">RE: [ofa-general] Re: openSM: Different IB MTUs</font></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="58"><img width="1" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""></td><td width="336"><img width="1" height="1" src="cid:3__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" border="0" alt=""></td></tr>
</table>
</td></tr>
</table>
<br>
<b><font size="4" color="#0000FF">I propose that when the partition policy file specifies no MTU, OpenSM use a </font></b><br>
<b><font size="4" color="#0000FF">configurable default from: /etc/cache/opensm/opensm.opt.</font></b><br>
<b><font size="4" color="#0000FF">Something like:</font></b><br>
<b><font size="4" color="#0000FF"># The default MTU to be used for IPoIB and other MCGs when the partition-policy </font></b><br>
<b><font size="4" color="#0000FF"># does not provide an exact value. The default is the lowest possible MTU</font></b><br>
<b><font size="4" color="#0000FF">mcg_default_mtu 1</font></b><br>
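<br>
A minimal Python sketch of the fallback proposed above, not OpenSM code. It assumes the options file holds plain "key value" lines with "#" comments, and that MTU values use the InfiniBand MTU codes (1 = 256 bytes up to 5 = 4096 bytes). The function names are hypothetical, and mcg_default_mtu is only the option name suggested in this message.<br>
<pre>
```python
# InfiniBand MTU codes mapped to payload size in bytes.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}


def parse_opts(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        opts[key] = value.strip()
    return opts


def resolve_mcg_mtu(partition_mtu, opts):
    """Return the MTU code for a new MCG.

    The partition policy value wins when present; otherwise fall back to
    the (hypothetical) mcg_default_mtu option, which itself defaults to
    1, the lowest possible MTU.
    """
    if partition_mtu is not None:
        return partition_mtu
    return int(opts.get("mcg_default_mtu", "1"))
```
</pre>
With the example option above, resolve_mcg_mtu(None, parse_opts("mcg_default_mtu 1")) yields code 1, which IB_MTU_BYTES maps to 256 bytes.<br>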
<font size="4"> </font><br>
<b><i><font size="7" color="#0000FF">Eitan Zahavi</font></i></b><font size="4"> </font><br>
Senior Engineering Director, Software Architect<font size="4"> </font><br>
Mellanox Technologies LTD<font size="4"> </font><br>
Tel:+972-4-9097208<br>
Fax:+972-4-9593245<font size="4"> </font><br>
P.O. Box 586 Yokneam 20692 ISRAEL<font size="4"> </font><br>
<font size="4"> </font><br>
<br>
<hr width="100%" size="2" align="left"><b>From:</b> Shirley Ma [<a href="mailto:xma@us.ibm.com">mailto:xma@us.ibm.com</a>] <b><br>
Sent:</b> Wednesday, July 25, 2007 10:45 PM<b><br>
To:</b> Eitan Zahavi<b><br>
Cc:</b> general@lists.openfabrics.org; Hal Rosenstock<b><br>
Subject:</b> RE: [ofa-general] Re: openSM: Different IB MTUs<font size="4"><br>
</font>
<p><font size="4">Hello Eitan, Hal,<br>
<br>
Thanks. It's good that openSM has a configuration option to set these attributes for MC groups. Would it be a good idea to add the following to openSM: when no MTU is defined in the configuration file, the SM picks the smallest link MTU in the fabric by default? MTU is unlike rate, where a slower rate might indicate a cabling problem. So using the smallest link MTU in the fabric might not be a bad default for MC. The reason for my request is that when an IP multicast group is created, MTU is not an attribute of the group. When IP multicast is mapped to IB multicast, the IB multicast join might fail because of different IB link MTU sizes in the group, while the IP multicast group creation succeeds without any indication of the failure. If the admin sets the MTU in the configuration file, the admin would know about this failure. Otherwise, admins and users could spend too much time debugging their broken multicast applications.<br>
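<br>
A minimal Python sketch of the default suggested above, assuming the SM already knows the MTU code at each end of every link. The function name and the input shape are hypothetical and not taken from OpenSM.<br>
<pre>
```python
def smallest_fabric_mtu(links):
    """Return the smallest MTU code usable on every link, or None if there are no links.

    `links` is an iterable of (mtu_a, mtu_b) pairs, one pair per link, holding
    the InfiniBand MTU code (1 = 256 bytes up to 5 = 4096 bytes) of each end.
    A link can carry at most the smaller of its two ends.
    """
    return min((min(a, b) for a, b in links), default=None)
```
</pre>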
<br>
Thanks<br>
Shirley Ma<br>
<br>
</font><font size="4">"Eitan Zahavi" &lt;eitan@mellanox.co.il&gt;<br>
<br>
</font>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="45%">
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul><b>"Eitan Zahavi" &lt;eitan@mellanox.co.il&gt;</b>
<p>07/25/07 12:25 PM</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</td><td width="55%">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="12%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right">To</div></td><td width="88%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><br>
"Hal Rosenstock" &lt;hal.rosenstock@gmail.com&gt;, Shirley Ma/Beaverton/IBM@IBMUS</td></tr>
<tr valign="top"><td width="12%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right">cc</div></td><td width="88%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><br>
&lt;general@lists.openfabrics.org&gt;</td></tr>
<tr valign="top"><td width="12%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right">Subject</div></td><td width="88%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><br>
RE: [ofa-general] Re: openSM: Different IB MTUs</td></tr>
</table>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="15%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"></td><td width="85%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"></td></tr>
</table>
</td></tr>
</table>
<b><font size="5" color="#0000FF"><br>
Hi Shirley,</font></b><font size="4"><br>
</font><b><font size="5" color="#0000FF"><br>
I think I understand where your question comes from...<br>
Many have issues with heterogeneous fabrics where not all nodes have the same MTU or speed,<br>
especially when IPoIB relies on all nodes joining the broadcast group.</font></b><font size="4"><br>
</font><b><font size="5" color="#0000FF"><br>
The term "join" for multicast groups is a little overloaded.<br>
If a node joins an existing MC group, its rate (speed * width) must be at least MCG.rate and its supported MTU at least MCG.MTU; otherwise it is denied.<br>
If the join is actually a "create", the node has to provide the rate and MTU, which define the MCG values.</font></b><font size="4"><br>
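<br>
The two meanings of "join" described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the SA implementation: the McGroup class and join_or_create function are hypothetical, rate is an abstract number where larger means faster (not the InfiniBand rate encoding), and "sufficient" is taken to mean greater than or equal to the group value.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass
class McGroup:
    mtu: int   # InfiniBand MTU code, 1 (256 bytes) to 5 (4096 bytes)
    rate: int  # abstract rate (speed * width); larger is faster


def join_or_create(groups, mgid, port_mtu, port_rate):
    """Handle a join request and return True if it is accepted.

    If the group does not exist yet, the request acts as a "create" and
    the requester's MTU and rate become the group values. Otherwise the
    requester must support at least the group's MTU and rate.
    """
    group = groups.get(mgid)
    if group is None:
        groups[mgid] = McGroup(mtu=port_mtu, rate=port_rate)
        return True
    return port_mtu >= group.mtu and port_rate >= group.rate
```
</pre>
For example, if a 2K MTU node (code 4) creates the group first, a later 1K MTU node (code 3) is denied, which is the failure discussed in this thread.<br>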
</font><b><font size="5" color="#0000FF"><br>
To allow the administrator to control the MTU and rate of the IPoIB MCGs, OpenSM provides the means to control these<br>
values per partition. See the doc/partition-config.doc</font></b><font size="5"> </font><b><font size="5" color="#0000FF"><br>
Still, the administrator should know the lowest MTU and rate among the nodes expected to join the IPoIB subnet.<br>
The tradeoff is in the hands of the administrator, who can set a value that will prevent slow nodes from joining the group, <br>
or assign a low value that will fit all nodes but slow down communication ...</font></b><font size="4"><br>
</font><b><font size="5" color="#0000FF"><br>
EZ</font></b><font size="4"> </font>
<p><b><i><font size="7" color="#0000FF">Eitan Zahavi</font></i></b><font size="5"> </font><font size="4"><br>
Senior Engineering Director, Software Architect</font><font size="5"> </font><font size="4"><br>
Mellanox Technologies LTD</font><font size="5"> </font><font size="4"><br>
Tel:+972-4-9097208<br>
Fax:+972-4-9593245</font><font size="5"> </font><font size="4"><br>
P.O. Box 586 Yokneam 20692 ISRAEL</font><font size="5"> </font>
<p><font size="4"><br>
<br>
</font><hr width="100%" size="2" align="left"><b><font size="4">From:</font></b><font size="4"> general-bounces@lists.openfabrics.org [</font><a href="mailto:general-bounces@lists.openfabrics.org"><u><font size="4" color="#0000FF">mailto:general-bounces@lists.openfabrics.org</font></u></a><font size="4">] </font><b><font size="4">On Behalf Of </font></b><font size="4">Hal Rosenstock</font><b><font size="4"><br>
Sent:</font></b><font size="4"> Wednesday, July 25, 2007 10:01 PM</font><b><font size="4"><br>
To:</font></b><font size="4"> Shirley Ma</font><b><font size="4"><br>
Cc:</font></b><font size="4"> general@lists.openfabrics.org</font><b><font size="4"><br>
Subject:</font></b><font size="4"> [ofa-general] Re: openSM: Different IB MTUs<br>
</font><font size="5"><br>
Shirley,</font><font size="4"><br>
</font><font size="5"><br>
On 7/25/07, </font><b><font size="5">Shirley Ma</font></b><font size="5"> <</font><a href="mailto:xma@us.ibm.com"><u><font size="5" color="#0000FF">xma@us.ibm.com</font></u></a><font size="5">> wrote: </font>
<ul>
<ul><font size="5">Hal,<br>
<br>
Thanks for your prompt reply. I am asking how openSM handles different link MTUs in the SA MCMemberRecord MTU. For example, suppose some links have a 2K MTU and some a 1K MTU. When IPoIB is enabled, how does the SM decide the MCMemberRecord MTU of the IPoIB broadcast group? When an IB multicast group is first created from a 2K MTU node, which PMTU value is attached to the group's MCMemberRecord MTU? </font></ul>
</ul>
<font size="4"><br>
</font><font size="5"><br>
MCMemberRecord MTU gets the group MTU (when created). This is either taken from the first joiner with sufficient components or preconfigured (and MTU can be set in the config). If a joiner has insufficient MTU for the group, it is denied. </font><font size="4"><br>
</font><font size="5"><br>
-- Hal</font><font size="4"><br>
</font>
<ul>
<ul><font size="5">Thanks<br>
Shirley Ma<br>
</font><font size="4"><br>
</font><img src="cid:4__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="16" height="16" alt="Inactive hide details for "Hal Rosenstock" <hal.rosenstock@gmail.com>"><font size="5">"Hal Rosenstock" < </font><a href="mailto:hal.rosenstock@gmail.com" target="_blank"><u><font size="5" color="#0000FF">hal.rosenstock@gmail.com</font></u></a><font size="5">><br>
</font>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="69%">
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul>
<ul><b><font size="4">"Hal Rosenstock" <</font></b><a href="mailto:hal.rosenstock@gmail.com" target="_blank"><b><u><font size="4" color="#0000FF">hal.rosenstock@gmail.com</font></u></b></a><b><font size="4">></font></b><font size="4"> </font>
<p><font size="4">07/25/07 10:57 AM</font></ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</ul>
</td><td width="31%">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="22%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right"><font size="4">To</font></div></td><td width="78%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><font size="4"><br>
Shirley Ma/Beaverton/IBM@IBMUS</font></td></tr>
<tr valign="top"><td width="22%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right"><font size="4">cc</font></div></td><td width="78%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><u><font size="4" color="#0000FF"><br>
</font></u><a href="mailto:general@lists.openfabrics.org" target="_blank"><u><font size="4" color="#0000FF">general@lists.openfabrics.org</font></u></a></td></tr>
<tr valign="top"><td width="22%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="58" height="1"><div align="right"><font size="4">Subject</font></div></td><td width="78%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"><font size="4"><br>
Re: openSM: Different IB MTUs</font></td></tr>
</table>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="50%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"></td><td width="50%"><img src="cid:5__=08BBF9B7DFB22F0A8f9e8a93df938@us.ibm.com" width="1" height="1"></td></tr>
</table>
</td></tr>
</table>
<font size="6"><br>
Shirley,<br>
<br>
On 7/25/07, </font><b><font size="6">Shirley Ma</font></b><font size="6"> <</font><a href="mailto:xma@us.ibm.com" target="_blank"><u><font size="5" color="#0000FF"> </font></u><u><font size="6" color="#0000FF">xma@us.ibm.com</font></u></a><font size="6">> wrote: </font>
<ul>
<ul>
<ul>
<ul><font size="6">Hello Hal,<br>
<br>
How does openSM handle CAs with different MTUs in the same subnet? For example, the IPoIB broadcast group MTU or the IB multicast group PMTU? Does openSM pick the smallest MTU in the subnet? </font></ul>
</ul>
</ul>
</ul>
<font size="6"><br>
<br>
Are you asking about link MTU, SA PathRecord/MultiPathRecord MTU, SA MCMemberRecord MTU, or all of these ?<br>
<br>
-- Hal </font>
<ul>
<ul>
<ul>
<ul><font size="6">Thanks<br>
Shirley Ma</font></ul>
</ul>
</ul>
</ul>
</ul>
</ul>
<br>
</body></html>