<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns="http://www.w3.org/TR/REC-html40" xmlns:v =
"urn:schemas-microsoft-com:vml" xmlns:o =
"urn:schemas-microsoft-com:office:office" xmlns:w =
"urn:schemas-microsoft-com:office:word" xmlns:m =
"http://schemas.microsoft.com/office/2004/12/omml"><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.2963" name=GENERATOR>
<STYLE>@font-face {
font-family: Cambria Math;
}
@font-face {
font-family: Calibri;
}
@page Section1 {size: 8.5in 11.0in; margin: 1.0in 1.0in 1.0in 1.0in; }
P.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
LI.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
DIV.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
A:link {
COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlink {
COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
A:visited {
COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlinkFollowed {
COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
P {
FONT-SIZE: 12pt; MARGIN-LEFT: 0in; MARGIN-RIGHT: 0in; FONT-FAMILY: "Times New Roman","serif"; mso-style-priority: 99; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto
}
SPAN.EmailStyle18 {
COLOR: #1f497d; FONT-FAMILY: "Calibri","sans-serif"; mso-style-type: personal-reply
}
.MsoChpDefault {
FONT-SIZE: 10pt; mso-style-type: export-only
}
DIV.Section1 {
page: Section1
}
</STYLE>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></HEAD>
<BODY lang=EN-US vLink=purple link=blue>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=618221714-29012007>See my
comments below.</SPAN></FONT></DIV><BR>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> openib-windows-bounces@openib.org
[mailto:openib-windows-bounces@openib.org] <B>On Behalf Of </B>Fab
Tillier<BR><B>Sent:</B> Thursday, January 25, 2007 7:34 PM<BR><B>To:</B> Yossi
Leybovich; openib-windows@openib.org<BR><B>Subject:</B> Re: [Openib-windows]
[ANNOUNCE] Build 1.0.0.566 posted<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV class=Section1>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">Hi
Yossi,<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">A
question about r538:<o:p></o:p></SPAN></P>
<P><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">------------------------------------------------------------------------<BR>r538
| sleybo | 2006-11-07 08:54:25 +0200 (Tue, 07 Nov 2006) | 3
lines<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">[IBAL] Compliance
tests<BR>1. pass switch_info to the HCA - compliance test C13-026<BR>2. Not
use AL cashe for node_description node_info to force Mkey check -compliance
test
C14-018<BR>------------------------------------------------------------------------<BR><BR></SPAN><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">Have
you tested to see what the effects of removing the cache for node description
and node info are on SM sweeps when the system is busy?<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">I
initially added the cache for these so that the response could be issued in
the context of the CQ callback for the special QP (thus at
DISPATCH_LEVEL). Without the cache, processing requires a call to the
local MAD verb, which has to be scheduled on a passive-level thread. If
the system is very busy doing I/O (e.g. lots of small packets in Iometer over
IPoIB), I have seen cases where the local MAD thread does not run fast enough,
so the response time for the MAD is too long and the SM declares the node as
having failed and removes it from the fabric. This is pretty nasty, as
suddenly all IB multicast group memberships are lost, but there’s no
indication to the host that things went awry.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">There
are two solutions for this; one is more of a temporary fix than the other,
IMO. First, the temporary fix: perform the MKey check in software, so
that the response for as many MADs as possible can be generated at
DISPATCH_LEVEL from the context of the special QP’s CQ callback. This should
maintain compliance while also keeping response times for MADs as short as
possible.<BR><SPAN class=618221714-29012007><FONT color=#0000ff size=2>[Yossi
Leybovich] To solve the denial-of-service problem I will add a simple
m-key check. In case of any error, or of a non-trivial m-key check (the trivial
case being m-key = 0), I will disable the cache and pass the MAD to the FW.</FONT></SPAN></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><SPAN
class=618221714-29012007><FONT color=#0000ff size=2>(I don't want to count
m-key violations, and of course I don't want to add code that generates
traps.)</FONT></SPAN></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><SPAN
class=618221714-29012007><FONT color=#0000ff size=2>This will reduce the
handling time of good-flow packets.</FONT></SPAN></SPAN></P>
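<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">A
rough sketch of the software check discussed above, for illustration only. All
names (mad_path_t, port_cache_t, choose_mad_path) are hypothetical and not taken
from the IBAL source. It assumes the "simple" check means an m-key of 0 on the
port or an exact match with the MAD's m-key; anything else goes to the FW, which
keeps the violation counting and trap generation.</SPAN></P>
<PRE>
```c
/* Hypothetical names throughout; none of this is taken from the IBAL source. */
typedef enum { MAD_REPLY_FROM_CACHE, MAD_FORWARD_TO_FW } mad_path_t;

typedef struct {
    int                valid;      /* nonzero once node_info / node_description are cached */
    unsigned long long port_mkey;  /* m-key currently set on the port (0: nothing to check) */
} port_cache_t;

/* Decide, without blocking, whether a request can be answered from the cache.
 * Safe to call from the CQ callback because it only reads cached state. */
static mad_path_t choose_mad_path(const port_cache_t *cache,
                                  unsigned long long mad_mkey)
{
    if (!cache->valid)
        return MAD_FORWARD_TO_FW;        /* nothing cached yet */
    if (cache->port_mkey == 0)
        return MAD_REPLY_FROM_CACHE;     /* trivial case: no m-key set on the port */
    if (mad_mkey == cache->port_mkey)
        return MAD_REPLY_FROM_CACHE;     /* assumed simple check: exact match */
    return MAD_FORWARD_TO_FW;            /* mismatch: FW counts the violation and traps */
}
```
</PRE>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">With
this shape the good flow never leaves DISPATCH_LEVEL, and only the error path
pays for the passive-level local MAD call.</SPAN></P>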
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><SPAN
class=618221714-29012007><FONT color=#0000ff size=2></FONT></SPAN></SPAN><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><SPAN
class=618221714-29012007></SPAN></SPAN><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><BR>The
second solution is to make the local MAD verb asynchronous. The HCA
handles the command asynchronously anyway, so this is a more natural fit given
the HW design. This would mean the local MAD verb would be called
directly from the CQ callback (at DISPATCH_LEVEL), and would return
pending. When the local MAD is processed and the HCA generates the
response to the EQ, the driver could invoke a callback to indicate completion
(again at DISPATCH_LEVEL) which would send out the response. This
solution eliminates the thread scheduling issues associated with handling
local MAD requests in a passive-level thread.<BR><SPAN
class=618221714-29012007><FONT color=#0000ff size=2>[Yossi
Leybovich] </FONT></SPAN></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><SPAN
class=618221714-29012007><FONT color=#0000ff size=2>This will require testing
our driver with async commands (I think it does not fully support them;
Leonid?). I don't think we will have the time to do that in the near
future.</FONT> </SPAN><o:p></o:p></SPAN></P>
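<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">To
make the asynchronous proposal concrete, here is a minimal sketch of the control
flow, assuming a single outstanding command. The types and functions
(local_mad_cmd_t, local_mad_async, on_eq_command_complete, send_mad_response) are
invented for illustration and are not the existing verb or driver interface; a
real driver would post the command to the HCA where the comment indicates.</SPAN></P>
<PRE>
```c
/* Hypothetical interface; not the existing local MAD verb. */
typedef enum { MAD_STATUS_PENDING, MAD_STATUS_BUSY } mad_status_t;
typedef void (*mad_done_cb_t)(void *context, int hca_status);

typedef struct {
    mad_done_cb_t cb;         /* invoked when the HCA reports completion */
    void         *context;    /* caller state handed back to cb */
    int           in_flight;  /* nonzero while a command is outstanding */
} local_mad_cmd_t;

/* Called from the CQ callback: records the completion routine and returns
 * immediately instead of waiting on a passive-level thread. */
static mad_status_t local_mad_async(local_mad_cmd_t *cmd,
                                    mad_done_cb_t cb, void *context)
{
    if (cmd->in_flight)
        return MAD_STATUS_BUSY;
    cmd->cb = cb;
    cmd->context = context;
    cmd->in_flight = 1;
    /* A real driver would post the command to the HCA here. */
    return MAD_STATUS_PENDING;
}

/* Called when the HCA signals command completion through the EQ. */
static void on_eq_command_complete(local_mad_cmd_t *cmd, int hca_status)
{
    if (!cmd->in_flight)
        return;               /* spurious completion: nothing to do */
    cmd->in_flight = 0;
    cmd->cb(cmd->context, hca_status);
}

/* Example completion routine: counts the responses it would send out. */
static void send_mad_response(void *context, int hca_status)
{
    int *responses_sent = context;
    if (hca_status == 0)
        *responses_sent += 1;
}
```
</PRE>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">Both
entry points run without blocking, so neither depends on a passive-level thread
being scheduled while the system is busy.</SPAN></P>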
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">We
should make sure that systems aren’t susceptible to Denial of Service attacks
from someone flooding them with IPoIB traffic (which gets handled at
DISPATCH_LEVEL in IPoIB’s CQ callback). It’s bad if an application on
one host can cause another host to be removed from the fabric – there will be
no port down events, no notification to the SM when the host is responsive
again, and the host will not be able to participate properly in the fabric
until the next SM sweep.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"> <o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">-Fab</SPAN><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"><o:p></o:p></SPAN></P></DIV></BLOCKQUOTE></BODY></HTML>