[ofw] Support boot device in IB stack?

Tzachi Dar tzachid at mellanox.co.il
Thu Feb 12 01:46:32 PST 2009


One more note please:

Spinlocks are very simple operating system objects.
In short, the only thing they have is a place in memory on
which interlocked operations are performed. As such, they will work
at any IRQL.
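
To illustrate the point, here is a minimal user-mode sketch in C11 of a
lock that is nothing more than one word in memory updated with
interlocked (atomic) operations. It is not the Windows implementation;
the names toy_spinlock, toy_try_acquire, toy_acquire and toy_release
are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock: its entire state is one word in memory. */
typedef struct {
    atomic_bool locked;
} toy_spinlock;

/* One interlocked exchange: set the word to true and look at the old value.
 * Returns true if the lock was free and now belongs to the caller. */
static bool toy_try_acquire(toy_spinlock *lock)
{
    return !atomic_exchange_explicit(&lock->locked, true, memory_order_acquire);
}

/* Spin until the interlocked exchange succeeds. */
static void toy_acquire(toy_spinlock *lock)
{
    while (!toy_try_acquire(lock)) {
        /* busy-wait: someone else holds the lock */
    }
}

/* Clear the word so that the next exchange can succeed. */
static void toy_release(toy_spinlock *lock)
{
    /* Releasing a lock that is not held is a bug in the caller. */
    assert(atomic_load_explicit(&lock->locked, memory_order_relaxed));
    atomic_store_explicit(&lock->locked, false, memory_order_release);
}
```

Nothing in this sketch knows about IRQL; the restrictions come from how
the lock is used, not from the lock object itself.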

So, one might raise two arguments against what I just said:
1) The docs say not to use them at elevated IRQL.
2) One can try using them at a higher IRQL and see that the system
gets into a deadlock very fast.

So, what is the explanation for this?

The long answer is this:
Spinlocks are not reentrant. That means that if one is holding
a spinlock and then tries to acquire that same spinlock again, he
is stuck: he will spin forever and never get the chance to free the
lock that he took in the first place.
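
A small sketch of this in user-mode C11, under the assumption that we
cap the number of spins so the example returns instead of hanging. A
real spinlock has no such cap. All names here (demo_lock,
demo_acquire_bounded, demo_release, demo_nested_acquire_succeeds) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical non-reentrant lock: a single atomic word. */
typedef struct {
    atomic_bool locked;
} demo_lock;

/* Spin at most max_spins times, then give up and return false.
 * The limit exists only so that this demonstration terminates. */
static bool demo_acquire_bounded(demo_lock *lock, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        if (!atomic_exchange_explicit(&lock->locked, true, memory_order_acquire))
            return true;
    }
    return false;
}

static void demo_release(demo_lock *lock)
{
    atomic_store_explicit(&lock->locked, false, memory_order_release);
}

/* Take the lock, then try to take it again from the same caller.
 * Returns whether the nested acquire succeeded. max_spins must be > 0. */
static bool demo_nested_acquire_succeeds(unsigned max_spins)
{
    demo_lock lock = { false };
    bool first = demo_acquire_bounded(&lock, max_spins);
    assert(first); /* the lock was free, so the first acquire works */
    (void)first;

    /* The only code that could release the lock is the code that is
     * spinning right now, so this can never succeed. */
    bool second = demo_acquire_bounded(&lock, max_spins);

    demo_release(&lock);
    return second;
}
```

However large max_spins is, the nested acquire fails, which is the
bounded version of spinning forever.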
The operating system's answer to this problem has two parts:
1) Don't take a spinlock twice (that is, make sure that your code
never tries to acquire a lock that it already holds).
2) Make sure that your code is not preempted while it holds the lock.
Otherwise, code that interrupts the holder on the same processor and
tries to take the same lock will spin forever, since the holder never
gets to run and release it. This is usually achieved by having all
users of a spinlock run at DISPATCH_LEVEL; anyone running at a higher
level can not take the spinlock. (Actually it means something more
general: find the highest IRQL that is used to take the spinlock and
make sure that all users acquire it at that level.)
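
The rule in 2) can be sketched as a single-processor toy model in plain
C. This is a simulation and not kernel code: sim_current_irql stands in
for the processor's IRQL, and every sim_* name is hypothetical. Only the
values 0 and 2 for PASSIVE_LEVEL and DISPATCH_LEVEL are taken from
Windows.

```c
#include <assert.h>
#include <stdbool.h>

typedef int sim_irql;

enum {
    SIM_PASSIVE_LEVEL  = 0,
    SIM_DISPATCH_LEVEL = 2
};

/* Simulated IRQL of the one processor in this model. */
static sim_irql sim_current_irql = SIM_PASSIVE_LEVEL;

/* A lock together with the highest IRQL at which anyone may take it. */
typedef struct {
    bool     held;
    sim_irql sync_level;
} sim_level_lock;

/* May a caller running at 'level' take this lock at all? */
static bool sim_may_acquire(const sim_level_lock *lock, sim_irql level)
{
    return level <= lock->sync_level;
}

/* Raise to the lock's level first, then take the lock.
 * Returns the previous IRQL so that the caller can restore it. */
static sim_irql sim_lock_acquire(sim_level_lock *lock)
{
    assert(sim_may_acquire(lock, sim_current_irql));
    sim_irql old = sim_current_irql;
    sim_current_irql = lock->sync_level; /* no other lock user can preempt us now */
    assert(!lock->held);                 /* on one processor this would be a deadlock */
    lock->held = true;
    return old;
}

/* Drop the lock, then return to the IRQL we came from. */
static void sim_lock_release(sim_level_lock *lock, sim_irql old)
{
    lock->held = false;
    sim_current_irql = old;
}
```

In this model a caller at IRQL 31 fails the sim_may_acquire check for a
DISPATCH_LEVEL lock, which mirrors the documented restriction.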

So, where does this leave us?
In short, our code relies on synchronization, and we use spinlocks to
achieve it. In the general case, we can not live without
synchronization.

So what happens in the case of a BSOD? Obviously there are almost no
functions that can be called at such an elevated level. This means that
in the general case the driver will stop working.
The only thing that might really help is knowing the special rules that
apply in the case of a BSOD. We should also know what we are expected
to do.
Currently our driver is based on the fact that operations are completed
by interrupts and DPCs. We *can* also work by polling instead. What we
really need to know is what the threading model is at BSOD time
(will interrupts work, will threads work, will running DPCs have a
chance to finish) and what we should do (only write to a remote
destination? Also read and act accordingly?).
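
As a rough picture of what working by polling means, here is a toy
sketch in plain C of a caller that busy-polls a completion queue instead
of waiting for an interrupt or a DPC. It does not use the real verbs
API; toy_cq, toy_cq_post, toy_cq_poll and toy_wait_for are hypothetical
names, and the "hardware" side is simulated by a function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CQ_SIZE 8

/* Hypothetical completion queue: a small ring of completed request ids. */
typedef struct {
    unsigned ids[TOY_CQ_SIZE];
    size_t   head; /* next entry to consume */
    size_t   tail; /* next entry to fill */
} toy_cq;

/* Simulated hardware side: report that request 'id' has completed. */
static bool toy_cq_post(toy_cq *cq, unsigned id)
{
    if (cq->tail - cq->head == TOY_CQ_SIZE)
        return false; /* queue is full */
    cq->ids[cq->tail % TOY_CQ_SIZE] = id;
    cq->tail++;
    return true;
}

/* Driver side: take one completion if there is one. Never blocks. */
static bool toy_cq_poll(toy_cq *cq, unsigned *id)
{
    assert(id != NULL);
    if (cq->head == cq->tail)
        return false; /* nothing has completed yet */
    *id = cq->ids[cq->head % TOY_CQ_SIZE];
    cq->head++;
    return true;
}

/* Busy-poll until request 'wanted' completes or the poll budget runs out.
 * Completions of other requests are consumed and ignored in this sketch.
 * No interrupts, DPCs or other threads are needed for this to progress. */
static bool toy_wait_for(toy_cq *cq, unsigned wanted, unsigned budget)
{
    while (budget--) {
        unsigned id;
        if (toy_cq_poll(cq, &id) && id == wanted)
            return true;
    }
    return false;
}
```

Whether something like this is enough at BSOD time depends on the
answers to the questions above.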

Thanks
Tzachi


> -----Original Message-----
> From: Leonid Keller 
> Sent: Thursday, February 12, 2009 10:23 AM
> To: James Yang; Tzachi Dar; ofw at lists.openfabrics.org
> Subject: RE: [ofw] Support boot device in IB stack?
> 
> > In order to support crashdump, the IB related code must be able to
> > run at IRQL 31.
> Could you explain why you think so?
> As far as I know, at IRQL 31 all interrupts/DPCs/threads are blocked.
> When a crash happens, the only task of the OS is to create a dump,
> while preserving the current state.
> So it looks strange to me that the OS would allow any driver to
> continue working after a crash.
> As to creating a crash dump at IRQL 31, that looks a bit strange to me
> too. The OS has to write the dump to the HD, which presumes working
> with the disk driver at IRQL 2 at worst.
> Am I missing something?
> 
> > -----Original Message-----
> > From: ofw-bounces at lists.openfabrics.org 
> > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of James Yang
> > Sent: Thursday, February 12, 2009 1:59 AM
> > To: Tzachi Dar; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Support boot device in IB stack?
> > 
> > Hi Tzachi,
> > 
> > The boot should be OK if #1 and #3 are resolved. In order to support
> > crashdump, the IB related code must be able to run at IRQL 31. It
> > would be good to know which functions are OK to run and which are
> > not. By the way, according to the DDK, "Callers of KeAcquireSpinLock
> > must be running at IRQL <= DISPATCH_LEVEL."
> > 
> > There is no formal document on crashdump. The most important thing is
> > that it's running at IRQL 31, and multithreading may not work at
> > this time.
> > 
> > iSCSI is one of the solutions, but we also have to support vhba
> > directly. And in order to get MS WHQL for vhba, we must support
> > crashdump. Do you get any requests to support crashdump on
> > iSCSI/ipoib?
> > 
> > Thanks,
> > James
> > 
> > -----Original Message-----
> > From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> > Sent: Wednesday, February 11, 2009 12:55 AM
> > To: James Yang; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Support boot device in IB stack?
> > 
> > Hi James,
> > 
> > As for #1 and #3, these are technical changes that can easily be done.
> > 
> > Issue #2 seems much more problematic to me.
> > First I must say that I haven't studied the topic thoroughly, so I
> > might be getting a wrong impression of the entire issue.
> > One more thing to start with is that I'm not sure that this request
> > is a must. What I mean by that is that booting will take place just
> > fine.
> > The one thing that will not work is creating a crash dump file later.
> > 
> > In any case, spinlocks don't really have a problem with higher IRQL
> > (the way I understand it), but many other commands do. So if the
> > code is only trying to send/receive data with QPs that are open, I
> > guess that things should work.
> > Trying to open new QPs will not work (we have to wait).
> > Please also note that there are only about 5 commands that can be
> > called from IRQL>DISPATCH_LEVEL.
> > 
> > Can you please send a reference to the document that says what the
> > demands are after the system has crashed?
> > 
> > Another question is this: will booting using iSCSI be a better option
> > for you? (not that it was done using ipoib).
> > 
> > Thanks
> > Tzachi
> > 
> > > -----Original Message-----
> > > From: ofw-bounces at lists.openfabrics.org 
> > > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of James Yang
> > > Sent: Tuesday, February 10, 2009 11:31 PM
> > > To: ofw at lists.openfabrics.org
> > > Subject: [ofw] Support boot device in IB stack?
> > > 
> > > Hi,
> > > 
> > > We would like to support boot devices, especially SAN boot over IB.
> > > So far I can see the following issues in the current code:
> > > 1) Many functions are pageable. If these functions run at boot or
> > > shutdown time, the disk may not be ready and paging won't work.
> > > 2) Spinlocks may not work for crashdump, which runs at
> > > IRQL>DISPATCH_LEVEL. Do any other functions need to be changed
> > > when IRQL>DISPATCH_LEVEL?
> > > 3) All related drivers should be boot start drivers.
> > > 
> > > Are there any other potential problems? Is there any plan to support
> > > SAN boot?
> > > 
> > > Please advise.
> > > 
> > > Thanks,
> > > James
> > > 
> > > 
> > > 
> > > 
> > > 
> > > _______________________________________________
> > > ofw mailing list
> > > ofw at lists.openfabrics.org
> > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> > > 


