<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:D="DAV:" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle18
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Arial","sans-serif";
color:navy;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.Section1
{page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span style='color:#1F497D'>Hi Jan,<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>The problem you’re facing
is independent of the interface model and driver model (filter driver vs. root
enumerated). What you need is for the bus driver to create PDOs based on
information in the registry. The point I was trying to make was that a
kernel DLL model makes driver updates very cumbersome, and that the existing
PnP interface model works quite well. Neither a PnP interface nor a kernel DLL
entry point solves the problem of putting an arbitrary device in the HCA’s
device tree.<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Note also that your example of
IPoIB isn’t quite right, as there is a local physical hardware entity represented
by the IPoIB PDO – the HCA’s port. What you want to do is
create a PDO for a device connected to the IB fabric without any information
about the fabric topology or even connectivity to it – something that
seems distinctly non-PnP. That said, I agree that for your case extending
the bus driver to enumerate devices from the registry makes sense, and it would
be interesting to find out how many other kernel drivers need this
functionality too.<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>For power management, IBAL
treats a power down event as an HCA removal. This was done because the
HCA driver did not handle power management. Now that it does, the HCA
should really remain registered with IBAL, and the HCA driver should handle requests
coming in when a device is powered down. Treating HCA power transitions
as device removals was a necessary hack, but is still a hack and a nicer
solution would be great.<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Keep in mind too that a filter
driver model does not preclude the IBAL driver from creating a single,
well-known named device from its DriverEntry routine or when the first HCA is
added. There is a problem with the filter driver in that the IBAL driver
doesn’t get loaded unless there’s an HCA enabled in the system, but
this could probably be worked around using a service that loads the driver (though
that may introduce more driver update difficulties; I don’t know
for sure).<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>The bottom line, though, is that
IBAL wasn’t really designed for Windows. It was made to work on
Windows, but that’s far from the same thing. I think it would be
worthwhile to think about what things are good and what things are bad in the
existing driver model, and then see what the best path for addressing them
would be. Treating issues one at a time doesn’t necessarily get to
the desired end result as fast as evaluating the problem as a whole.<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>-Fab<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<div>
<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'>
<p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span
style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Jan Bottorff
[mailto:jbottorff@xsigo.com] <br>
<b>Sent:</b> Monday, May 14, 2007 4:41 PM<br>
<b>To:</b> Fab Tillier; Yossi Leybovich; ofw@lists.openfabrics.org<br>
<b>Subject:</b> RE: [ofw] Loading drivers on LHS<o:p></o:p></span></p>
</div>
</div>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><span style='color:navy'>></span><span style='color:#1F497D'>For
kernel clients, the existing mechanism of querying for the IBAL interface via
IRP_MN_QUERY_INTERFACE requests </span><span style='color:navy'><o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>></span><span style='color:#1F497D'>should not be changed.
This is the proper way for kernel drivers to </span><span style='color:navy'> </span><span
style='color:#1F497D'>get one another’s interfaces, and there is support</span><span
style='color:navy'><o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>></span><span style='color:#1F497D'>for PnP notifications so
that drivers can be unloaded and updated. Even querying the CI interface
is fine.</span><span style='color:navy'><o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>My team and I have a bunch of experience being a root enumerated
kernel client of the IBAL interface and there is a real problem when it comes
to power management. A peer driver that opens the IBAL interface (and probably
registers for device interface notification to know when the IBAL device
exists) is not in the PnP hierarchy of the hca, and as a result there is no
power relationship between the hca driver and the peer driver. This means when
the system tries to standby or hibernate (which the IB stack/hca now supports),
the hca stack may get powered down before the kernel client. As a result, on
sleep events, a kernel client has no guarantee fabric communication still works
when it’s processing device power irps. The client can’t cleanly shut
down QPs and the entity at the remote end of the QP, which is a problem.
<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>Our belief is also that, since our client driver doesn’t really
talk to hardware directly and only issues I/O requests to the IB stack, if the
hca/IB stack powers down, the IB stack is supposed to queue any I/O requests
and, when power is reenabled, process those queued requests (which possibly get
an error since the remote end may now be gone). The behavior we see is that the IB
stack crashes if we continue to issue I/O requests after the hca/IB stack has powered
down. It seems like the IBAL interface should act similar to the TDI interface
of TCP/IP. Kernel clients doing TCP communication aren’t required to
close communication with remote endpoints on system sleep before the NIC powers
down, although some clients may want to do this (like a storage stack that
wants to flush buffers before the communication fails).<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
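<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>To make the queuing behavior described above concrete, here is a
minimal user-mode sketch in C. It is only a model, not IBAL code: the io_gate
structure, the gate_* functions, and the fixed queue depth are all hypothetical
names invented for illustration. Requests submitted while the stack is powered
down are held instead of being passed to hardware, and are replayed in order
when power returns.<o:p></o:p></span></p>
<pre>
```c
/* Sketch only: user-mode model of "queue while powered down, replay on
 * power up". Every name here is hypothetical; none of it is IBAL code. */
#define MAX_PENDING 8

enum req_status { REQ_DONE = 0, REQ_QUEUED = 1, REQ_REJECTED = 2 };

struct io_gate {
    int powered;               /* 1 while the (modelled) HCA stack has power */
    int count;                 /* number of requests currently held */
    int pending[MAX_PENDING];  /* ids of requests held while powered down */
};

/* Submit a request. With power on it is treated as handed to hardware;
 * with power off it is held, never touching the powered-down device. */
static enum req_status gate_submit(struct io_gate *g, int req_id)
{
    if (g->powered)
        return REQ_DONE;
    if (g->count == MAX_PENDING)
        return REQ_REJECTED;   /* bounded queue: fail rather than overflow */
    g->pending[g->count++] = req_id;
    return REQ_QUEUED;
}

/* Mark the stack as powered down; later submissions are held. */
static void gate_power_down(struct io_gate *g)
{
    g->powered = 0;
}

/* Mark the stack as powered up and replay held requests in FIFO order.
 * The replayed ids are copied to out[]; the return value is their count. */
static int gate_power_up(struct io_gate *g, int *out)
{
    int i;
    int n = g->count;

    g->powered = 1;
    for (i = 0; i < n; i++)
        out[i] = g->pending[i];
    g->count = 0;
    return n;
}
```
</pre>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>A real driver would presumably also need locking, and would complete
replayed requests with an error when the remote end is gone; both are left out
of this sketch.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p>&nbsp;</o:p></span></p>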
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>Since we have no power irp ordering relationship with the IB stack,
and the IB stack can’t cope with I/O requests when it’s powered
down, about all we can do is make our kernel client driver fail any OS attempt
to standby/hibernate the system. At system shutdown, our kernel client
registers for last chance shutdown notification, so it is able to end all
communication before the hca is powered down.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>Our belief is the right way to handle all this is to make our
kernel client become a PnP child of the hca/IB bus. To do this, it seems like
we will need to modify the IB stack drivers to instantiate an arbitrary list of
PnP IDs (read from the registry) as IB bus children, for every IB port,
very much like the IPoIB pdo is hard coded today. This will cause the OS to
load our PnP kernel client fdo device in the PnP tree of the hca. If we do
this, the OS will then power down all the child stacks before it powers down
the hca/IB Bus. I assume other kernel clients have the same problem. <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
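<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>The ordering guarantee we are after can be modelled with a small
user-mode C sketch. This is an illustration under assumptions, not how the OS
implements it: dev_node and power_down_tree are hypothetical names, and the
real sequencing is done by the OS through power irps. The point is only that a
device in the hca&#8217;s tree is powered down before the hca, while a root
enumerated peer is not in that tree and gets no such ordering.<o:p></o:p></span></p>
<pre>
```c
/* Sketch only: models "children power down before their parent" as a
 * depth-first walk. All names are hypothetical. */
#define MAX_CHILDREN 4

struct dev_node {
    const char *name;
    int powered;                             /* 1 = on, 0 = off */
    int child_count;
    struct dev_node *children[MAX_CHILDREN];
};

/* Power down dev and its whole subtree, children first.
 * order[] receives device names in the sequence they were powered down.
 * next is the first free slot in order[]; the new free slot is returned. */
static int power_down_tree(struct dev_node *dev, const char **order, int next)
{
    int i;

    for (i = 0; i < dev->child_count; i++)
        next = power_down_tree(dev->children[i], order, next);

    dev->powered = 0;        /* every child is already off at this point */
    order[next] = dev->name;
    return next + 1;
}
```
</pre>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>With an hca node that has an IPoIB child and a client child, the
walk records both children before the hca itself, so a client in that position
would still have a powered hca underneath it while it shuts down.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p>&nbsp;</o:p></span></p>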
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>A significant effect of making the IBAL interface driver a filter
on the hca is that if you have multiple hcas, you’ll have
multiple IBAL interface devices, which can’t all have the same name. This
may be a problem for some kernel and user clients if there is no longer a
single IBAL interface device. <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>Having a single root IBAL interface device has always seemed
problematic for kernel clients for other reasons. One is that some IB requests
use physical addresses, which only have meaning relative to a specific hca on a
specific PCI bus. There is no guarantee that processor physical address == hca
bus physical address, so addresses may need to get mapped at a layer that knows
what happens in the PCI bridges. There is currently an API to retrieve the
actual hca device on an IB port, which allows a kernel client to obtain the
correct DMA adapter object which is needed to correctly map virtual to physical
addresses. Devices that are instantiated as children of the hca get the
benefits of hidden APIs, like QUERY_INTERFACE for
GUID_BUS_INTERFACE_STANDARD to get the DMA adapter that matches the hca. <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>We would certainly like to see the IB stack structure move in a
direction that would accommodate kernel mode clients, which are not created via
IB IOC detection protocols, and allow them to manage power states correctly. <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'>- Jan<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy'><o:p> </o:p></span></p>
</div>
</body>
</html>