[Openib-windows] Fabric usage at boot time

Fab Tillier ftillier at silverstorm.com
Mon Oct 31 20:49:22 PST 2005


Hi Jan,

> From: Jan Bottorff [mailto:jbottorff at xsigo.com]
> Sent: Monday, October 31, 2005 5:50 PM
> 
> A GREAT start would be to make all the fabric drivers needed to access
> storage be boot load drivers (or at least know they work correctly when
> loaded at boot time). This may be really easy or it may be really hard.
> At OS startup time there are some limitations on the environment drivers
> run in. There is unfortunately no documentation I know of that tells you
> these limitations. It also varies a bit with different OS versions (W2K
> vs. W2K3).

Auditing the drivers' use of paged pool should be a good start.  As far as I
know, the only part of the driver stack that uses paged pool is the HCA driver,
when it allocates QP and CQ ring buffers for WQEs and CQEs, respectively.  This
was done because using non-paged pool would seriously limit scalability when
running without the user-mode HCA DLL (in which case all user-mode API calls
turn into IOCTL calls).  There are probably smarter ways to distinguish kernel
and user clients, even in the absence of a user-mode HCA DLL, that would support
boot-time use of the drivers.
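
One way to picture such a distinction is the minimal sketch below.  It is not
code from the HCA driver: the type and function names are hypothetical, and the
enums stand in for the WDK's KPROCESSOR_MODE and POOL_TYPE so that it compiles
outside the kernel.  In a real driver the owner would presumably come from
something like Irp->RequestorMode, and the result would be handed to
ExAllocatePoolWithTag.

```c
#include <assert.h>

/* Hypothetical stand-ins for the WDK's KPROCESSOR_MODE and POOL_TYPE. */
typedef enum { CLIENT_KERNEL, CLIENT_USER } client_mode_t;
typedef enum { POOL_NONPAGED, POOL_PAGED } pool_kind_t;

/*
 * Assumed policy, not the current driver's behavior: ring buffers owned by
 * kernel clients (for example a storage driver in the boot or paging path)
 * must never page-fault, so they come from non-paged pool.  Buffers created
 * on behalf of user-mode clients through the IOCTL path stay pageable to
 * preserve scalability.
 */
pool_kind_t ring_buffer_pool(client_mode_t owner)
{
    return (owner == CLIENT_KERNEL) ? POOL_NONPAGED : POOL_PAGED;
}
```

With a policy like this, only the kernel consumers pay the non-paged pool cost,
which is the part that matters for boot-time use.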

> When working on drivers that have to work at boot time I generally
> alternate between setting them to normal demand load drivers (which are
> a bit easier to test and debug), and then periodically setting them to
> be boot load, and see what needs fixing (usually by seeing which APIs
> fail).

I'll add this to my queue of things to test and work on.  When using boot-loaded
drivers, does the PnP manager still enumerate the drivers and start devices as
with demand-loaded drivers?
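
For reference, the switch between demand load and boot load described above
comes down to the Start value under the driver's Services registry key.  The
sketch below lists those values in plain C; the numbers match the SERVICE_*
start-type constants in the Windows headers, but the enum and helper names are
hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Values of the Start entry under a driver's Services key. */
typedef enum {
    START_BOOT     = 0,  /* loaded by the OS loader before the kernel runs */
    START_SYSTEM   = 1,  /* loaded during kernel initialization */
    START_AUTO     = 2,  /* started by the service control manager */
    START_DEMAND   = 3,  /* started on demand, e.g. when PnP finds the device */
    START_DISABLED = 4   /* never started */
} start_type_t;

/* Nonzero if the OS loader itself must read the driver from the boot disk. */
int loaded_by_boot_loader(start_type_t t)
{
    return t == START_BOOT;
}

/* Hypothetical helper for the test cycle: flip between the two settings. */
start_type_t toggle_boot_demand(start_type_t t)
{
    return (t == START_BOOT) ? START_DEMAND : START_BOOT;
}
```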

> There also are considerations for putting paging files on the fabric.
> For example, if you're supporting a paging file you can't take any page
> faults while reading or writing the paging file. This probably means you
> can't take any page faults to reestablish a connection to a boot disk
> which has temporarily failed or during normal read/writes. Booting from
> the fabric implies paging to the fabric. It also would be desirable to
> write crash dumps to the fabric boot disk, which has its own set of
> issues.

Special file support introduces a whole series of issues that the current HCA
driver doesn't handle.  Specifically, when a paging file is on a disk, power
management IRPs come down at DISPATCH_LEVEL.  The current HCA driver code
assumes that all power management IRPs are sent down at PASSIVE_LEVEL.  Pretty
much everything in the HCA driver assumes calls at PASSIVE_LEVEL, which is
something I'm hoping to fix but am getting little support for.
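
A common pattern for coping with calls above PASSIVE_LEVEL is to check the IRQL
on entry and hand the work to a worker thread when it is too high.  The sketch
below only models that decision in plain C; the names are hypothetical and it
is not the HCA driver's code.  In a real driver the IRQL would come from
KeGetCurrentIrql(), and the deferred path would typically mark the IRP pending
and queue a work item with IoQueueWorkItem.

```c
#include <assert.h>

/* Numeric values match the WDK's PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL. */
enum { IRQL_PASSIVE = 0, IRQL_APC = 1, IRQL_DISPATCH = 2 };

typedef enum { HANDLE_INLINE, DEFER_TO_WORKER } power_action_t;

/*
 * Decide how a power request should be serviced.  Code that may block or
 * touch pageable memory is only safe at PASSIVE_LEVEL, so anything arriving
 * at a higher IRQL is deferred instead of being handled in place.
 */
power_action_t classify_power_request(int current_irql)
{
    return (current_irql == IRQL_PASSIVE) ? HANDLE_INLINE : DEFER_TO_WORKER;
}
```

Note that with a paging file on the device, the deferred handler itself must
also avoid page faults, so it cannot live in pageable code or data.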

> Windows 200x also can't boot using PXE. You can install to a local hard
> disk using PXE. There MUST be INT 13 block I/O services provided by the
> adapter ROM to access the boot disk (or EFI BIOS equivalents on
> appropriate systems). NTLDR (loaded by the BIOS from the system disk)
> will want to load the OS kernel, boot.ini, the system hive of the
> registry and all drivers marked as boot start. NTLDR talks INT 13 to the
> boot disk to read these files. Control is then transferred to the OS
> kernel, and these boot loaded drivers MUST start and be capable of
> accessing the boot disk without real mode INT 13 help. The TCP/IP stack
> is not a boot load driver, so you can't boot using IPoIB (unless you
> supply your own TCP/IP stack as part of a storage driver).

It's my understanding that the whole network stack in Windows 200x depends on
pageable memory, and thus cannot be used at boot time.

> Just to be
> squeaky correct, you could have a single unified storage miniport driver
> be used by NTLDR to access the boot disk (this is usually a carefully
> written SCSI miniport driver copied to the system disk as ntbootdd.sys).
> Since the IB hardware drivers+fabric drivers+storage miniport are not a
> single driver, this method doesn't work.

I think having a single boot driver like this and then figuring out how to do a
clean handoff to the "normal" driver stack would probably be quite complicated.
The flip side is that with separate drivers, users have to install each driver
as a separate boot-time driver when installing the OS (assuming they're
installing from the CD).

Thanks for the information.  I'm looking forward to your input when I send out
the async kernel verb API for review. I'm hoping to get back to that after the
SuperComputer activities settle down.

Thanks for the ongoing feedback,

- Fab



