[ofw] Support boot device in IB stack?
Fab Tillier
ftillier at windows.microsoft.com
Thu Feb 12 15:26:04 PST 2009
There's a section in the WDK, under the Storage Miniport Drivers section called "Restriction on Miniport Drivers that Manage the Boot Drive".
That's all I've found so far...
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Tzachi Dar
> Sent: Thursday, February 12, 2009 1:57 PM
> To: James Yang; Leonid Keller; ofw at lists.openfabrics.org
> Subject: RE: [ofw] Support boot device in IB stack?
>
> Hi James,
>
> There are a few things to understand:
> 1) Spinlocks are used for synchronization. We can skip the spinlocks,
> but we cannot skip the need for synchronization...
>
> If someone is sending on a QP, another instance cannot start sending,
> because this would lead to data corruption.
>
> Next, I don't really understand how the driver moves to polling.
> What will happen if we queue a DPC? Will that execute or not?
>
> I believe that if spinlocks are our only problem we will be able to
> take them even at level 31. If we see that this is not enough, we
> will have to create another instance of the functions without locks.
>
> Finding MS docs on this issue is indeed difficult. I did some searching
> and found:
> http://download.microsoft.com/download/f/0/5/f05a42ce-575b-4c60-82d6-208d3754b2d6/Writing-ATAport-Miniport.ppt
> This has tips for crashdump:
> "Create an unused copy of the miniport at boot time"
> I guess that this makes sure that no one will be using the minidump
> when the system crashes.
>
> I have also found:
> http://download.microsoft.com/download/9/c/5/9c5b2167-8017-4bae-9fde-d599bac8184a/RAID_design.doc#_Toc150055726
> which says:
> "For more information about crashdump support in miniports, see the WDK
> documentation"
> I have looked at the WDK documentation and found nothing, but I guess
> that this gives some hope.
>
> Let's see if we can find more information from MS about these issues.
>
> Thanks
> Tzachi
>
>
> > -----Original Message-----
> > From: James Yang [mailto:jyang at xsigo.com]
> > Sent: Thursday, February 12, 2009 9:03 PM
> > To: Tzachi Dar; Leonid Keller; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Support boot device in IB stack?
> >
> > Tzachi,
> >
> > I agree with your analysis of the spinlock object. For BSOD, we can
> > simply skip calling the OS spinlock in cl_spinlock_acquire()
> > when we find that IRQL==31, because the code is not re-entered at
> > IRQL 31. At BSOD, everything has to rely on polling. Even the OS
> > is polling for interrupts. The only thing we want to support
> > is making sure we can still send out packets.
> >
> > -James
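To make the proposal above concrete, here is a minimal user-space sketch in C of a lock wrapper that bypasses the lock when the IRQL is 31. It is not the complib implementation of cl_spinlock_acquire(): the type sketch_spinlock_t, the functions sketch_spinlock_*, the variable sim_irql, and the SIM_* constants are all hypothetical stand-ins, and the IRQL is simulated with a plain integer. The sketch assumes, as the message above does, that nothing else runs concurrently at IRQL 31. It does not address the objection raised elsewhere in this thread that the data may already be half-updated by whoever held the lock when the crash happened.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for this sketch; not the WDK or complib definitions. */
#define SIM_DISPATCH_LEVEL 2
#define SIM_HIGH_LEVEL     31   /* the "IRQL 31" mentioned in the thread */

typedef struct {
    atomic_flag held;           /* the single memory word the lock spins on */
} sketch_spinlock_t;

/* Simulated current IRQL; a real driver would query the OS instead. */
static int sim_irql = SIM_DISPATCH_LEVEL;

static void sketch_spinlock_init(sketch_spinlock_t *l)
{
    assert(l != 0);
    atomic_flag_clear(&l->held);
}

static void sketch_spinlock_acquire(sketch_spinlock_t *l)
{
    /* Crash path: assume single-threaded polling, so do not touch the lock.
     * This also avoids spinning forever on a lock that was held at crash time. */
    if (sim_irql == SIM_HIGH_LEVEL)
        return;

    /* Normal path: spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void sketch_spinlock_release(sketch_spinlock_t *l)
{
    /* Mirror the acquire path: nothing was taken at the crash level. */
    if (sim_irql == SIM_HIGH_LEVEL)
        return;

    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In this sketch, a lock taken at the simulated DISPATCH_LEVEL stays marked as held after sim_irql is raised to 31, and a second acquire at that level returns immediately instead of deadlocking.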
> >
> > -----Original Message-----
> > From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> > Sent: Thursday, February 12, 2009 1:47 AM
> > To: Leonid Keller; James Yang; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Support boot device in IB stack?
> >
> > One more note please:
> >
> > Spinlocks are very simple objects of the operating system.
> > In short, the only thing that they have is a place in memory
> > on which interlocked operations are done. As such, they
> > will work at any IRQL.
> >
> > So, one might have two arguments against what I just said:
> > 1) The docs say not to use them at elevated IRQL.
> > 2) One can try using them at higher IRQL and see that the
> > system gets into a deadlock very fast.
> >
> > So, what is the explanation to this issue?
> >
> > The long answer is this:
> > Spinlocks are not re-entrant. That means that if one is
> > holding a spinlock and then tries to acquire that
> > spinlock again, he is stuck. He will
> > spin forever and have no chance of freeing the lock in the
> > first place.
> > The operating system's answer to this problem comes from two places:
> > 1) Don't take a spinlock twice (that is, make sure
> > that your code is not acquiring the lock twice).
> > 2) Make sure that your code is not preempted. This is
> > usually achieved by the fact that all users of a spinlock are
> > running at DISPATCH_LEVEL, and if one is running at a higher
> > level he cannot take the spinlock.
> > (Actually it means something else: find the highest IRQL that
> > is used to take the spinlock and make sure that all users run at
> > that level.)
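The non-re-entrancy described above can be illustrated with a minimal user-space sketch in C. It models a spinlock as a single memory word updated with an interlocked test-and-set, which is the description given in this message and not the actual Windows implementation. The names sketch_lock_t and sketch_lock_* are hypothetical. A try-acquire is used instead of a real spin so that the second attempt reports failure rather than hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: one memory word, modified only by atomic operations. */
typedef struct {
    atomic_flag word;
} sketch_lock_t;

static void sketch_lock_init(sketch_lock_t *l)
{
    assert(l != 0);
    atomic_flag_clear(&l->word);
}

/* Returns true if the lock was acquired. Returns false if it was already
 * held, which is the point where a real spinlock would keep spinning. */
static bool sketch_lock_try_acquire(sketch_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->word, memory_order_acquire);
}

static void sketch_lock_release(sketch_lock_t *l)
{
    atomic_flag_clear_explicit(&l->word, memory_order_release);
}
```

If the same code path calls sketch_lock_try_acquire() twice without a release in between, the second call fails. With a spinning acquire, that second call would never return, because the only code that could release the lock is the code that is spinning.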
> >
> > So, where does it put us?
> > In short our code relies on synchronization. We are using
> > spinlocks to achieve this synchronization. In the general
> > case, we can not live without synchronization.
> >
> > So what happens in the case of BSOD: obviously there are no
> > functions that can be called at such an elevated level. This
> > means that in the general case the driver will stop working.
> > The only thing that might really help is if we can know the
> > special rules that apply to the case of BSOD. We should also
> > know what we are being expected to do.
> > Currently our driver is based on the fact that operations are
> > completed by interrupts and DPCs. We *can* also work by
> > polling instead. What we really should know is what the
> > threading model is at BSOD time (will
> > interrupts work, will threads work, will running DPCs have a chance
> > to finish) and what we should do (only write to a remote
> > destination? Also read and act accordingly?)
> >
> > Thanks
> > Tzachi
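As an illustration of working by polling instead of interrupts and DPCs, here is a minimal user-space sketch in C. It is an assumption-laden model, not the IB stack's completion queue API: sketch_cq_t, sketch_cq_post(), sketch_cq_poll(), and sketch_wait_by_polling() are hypothetical names, and the "hardware" is simulated by calling sketch_cq_post() directly. The caller busy-waits with a bounded number of spins, since at crash time there may be no timer or scheduler to fall back on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CQ_DEPTH 8

/* Hypothetical completion queue: a small ring of work-request ids. */
typedef struct {
    unsigned long wr_id[SKETCH_CQ_DEPTH];
    size_t head;    /* next entry to consume */
    size_t tail;    /* next entry to produce */
} sketch_cq_t;

/* Simulates the device reporting a completion. Returns false if the ring is full. */
static bool sketch_cq_post(sketch_cq_t *cq, unsigned long wr_id)
{
    assert(cq != 0);
    if (cq->tail - cq->head == SKETCH_CQ_DEPTH)
        return false;
    cq->wr_id[cq->tail % SKETCH_CQ_DEPTH] = wr_id;
    cq->tail++;
    return true;
}

/* Non-blocking poll: returns true and stores the id if a completion is ready. */
static bool sketch_cq_poll(sketch_cq_t *cq, unsigned long *wr_id)
{
    if (cq->head == cq->tail)
        return false;
    *wr_id = cq->wr_id[cq->head % SKETCH_CQ_DEPTH];
    cq->head++;
    return true;
}

/* Busy-waits for one specific completion, consuming and discarding others.
 * Gives up after max_spins polls so the caller never hangs. */
static bool sketch_wait_by_polling(sketch_cq_t *cq, unsigned long want,
                                   unsigned max_spins)
{
    unsigned long got;
    for (unsigned i = 0; i < max_spins; i++) {
        if (sketch_cq_poll(cq, &got) && got == want)
            return true;
    }
    return false;
}
```

The design choice here is that the sender drives progress itself: after posting a send it calls sketch_wait_by_polling() rather than returning and relying on an interrupt or DPC to signal completion later.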
> >
> >
> > > -----Original Message-----
> > > From: Leonid Keller
> > > Sent: Thursday, February 12, 2009 10:23 AM
> > > To: James Yang; Tzachi Dar; ofw at lists.openfabrics.org
> > > Subject: RE: [ofw] Support boot device in IB stack?
> > >
> > > > In order to support crashdump, the IB related code must be
> > > > able to run at IRQL 31.
> > > Could you explain why you think so?
> > > As far as I know, at IRQL 31 all interrupts/DPCs/threads are
> > > blocked.
> > > When a crash happens, the only task of the OS is to create a
> > > dump, while preserving the current state.
> > > So it looks strange to me that the OS would allow any driver to
> > > continue working after a crash.
> > > As to creating a crash dump at IRQL 31, it looks a bit
> > > strange to me.
> > > The OS has to write the dump to the HD, which presumes working with
> > > the disk driver at IRQL 2 at worst.
> > > Am I missing something?
> > >
> > > > -----Original Message-----
> > > > From: ofw-bounces at lists.openfabrics.org
> > > > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of James Yang
> > > > Sent: Thursday, February 12, 2009 1:59 AM
> > > > To: Tzachi Dar; ofw at lists.openfabrics.org
> > > > Subject: RE: [ofw] Support boot device in IB stack?
> > > >
> > > > Hi Tzachi,
> > > >
> > > > The boot should be OK if #1 and #3 are resolved. In order to
> > > > support crashdump, the IB related code must be able to run at
> > > > IRQL 31. It would be good to know which functions are OK to run
> > > > and which are not. By the way, according to the DDK, "Callers of
> > > > KeAcquireSpinLock must be running at IRQL <= DISPATCH_LEVEL."
> > > >
> > > > There is no formal document on crashdump. The most important
> > > > thing is that it's running at IRQL 31, and multithreading may
> > > > not work at this time.
> > > >
> > > > iSCSI is one of the solutions, but we also have to support vhba
> > > > directly. And in order to get MS WHQL for vhba, we must support
> > > > crashdump. Do you get any requests to support crashdump on
> > > > iSCSI/ipoib?
> > > >
> > > > Thanks,
> > > > James
> > > >
> > > > -----Original Message-----
> > > > From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> > > > Sent: Wednesday, February 11, 2009 12:55 AM
> > > > To: James Yang; ofw at lists.openfabrics.org
> > > > Subject: RE: [ofw] Support boot device in IB stack?
> > > >
> > > > Hi James,
> > > >
> > > > As for 1 and 3, these are technical changes that can easily be done.
> > > >
> > > > Issue #2 seems much more problematic to me.
> > > > First I must say that I haven't studied the topic thoroughly, so I
> > > > might be getting a wrong impression of the entire issue.
> > > > One more thing to start with is that I'm not sure that this request
> > > > is a must. What I mean by that is that booting will take place just
> > > > fine.
> > > > The one thing that will not work is creating a crash dump file later.
> > > >
> > > > In any case, spinlocks don't really have a problem with higher IRQL
> > > > (the way I understand it), but many other commands do. So if the
> > > > code is only trying to send/receive data with QPs that are open, I
> > > > guess that things should work.
> > > > Trying to open new QPs will not work (we have to wait).
> > > > Please also note that there are only about 5 commands that can be
> > > > called from IRQL>DISPATCH_LEVEL.
> > > >
> > > > Can you please send a reference to the document that says what the
> > > > requirements are after the system has crashed?
> > > >
> > > > Another question is this: would booting using iSCSI be a better
> > > > option for you? (Note that it was done using ipoib.)
> > > >
> > > > Thanks
> > > > Tzachi
> > > >
> > > > > -----Original Message-----
> > > > > From: ofw-bounces at lists.openfabrics.org
> > > > > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of James Yang
> > > > > Sent: Tuesday, February 10, 2009 11:31 PM
> > > > > To: ofw at lists.openfabrics.org
> > > > > Subject: [ofw] Support boot device in IB stack?
> > > > >
> > > > > Hi,
> > > > >
> > > > > We would like to support boot devices, especially SAN boot over
> > > > > IB. So far I can see the following issues in the current code:
> > > > > 1) Many functions are pageable. If these functions run at
> > > > > boot or shutdown time, the disk may not be ready and paging
> > > > > won't work.
> > > > > 2) Spinlocks may not work for crashdump, which runs at
> > > > > IRQL>DISPATCH_LEVEL. Do any other functions need to be changed
> > > > > when IRQL>DISPATCH_LEVEL?
> > > > > 3) All related drivers should be boot-start drivers.
> > > > >
> > > > > Are there any other potential problems? Is there any plan to
> > > > > support SAN boot?
> > > > >
> > > > > Please advise.
> > > > >
> > > > > Thanks,
> > > > > James
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > ofw mailing list
> > > > > ofw at lists.openfabrics.org
> > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> > > > >
> > > >
> > > >
> >
> >