[Openib-windows] Windows DMA model

Leonid Keller leonid at mellanox.co.il
Wed Oct 19 04:39:24 PDT 2005


FYI: The new low-level driver I'm working on uses IoGetDmaAdapter and
AllocateCommonBuffer to perform the mappings correctly ...
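For reference, the usual WDM sequence behind that remark looks roughly like the sketch below (kernel-mode only, error handling trimmed; the function name and the assumption that `pdo` is the physical device object from AddDevice are illustrative, not taken from the actual driver):

```c
#include <wdm.h>

/* Sketch: obtain the DMA adapter for a bus-master PCI device and
 * allocate a cache-coherent common buffer.  AllocateCommonBuffer
 * returns the *bus* logical address, which is what gets programmed
 * into the hardware. */
NTSTATUS AllocDmaBuffer(PDEVICE_OBJECT pdo, ULONG length,
                        PDMA_ADAPTER *adapterOut, PVOID *vaOut,
                        PHYSICAL_ADDRESS *busAddrOut)
{
    DEVICE_DESCRIPTION desc;
    ULONG mapRegs;

    RtlZeroMemory(&desc, sizeof(desc));
    desc.Version = DEVICE_DESCRIPTION_VERSION;
    desc.Master = TRUE;                 /* bus-master DMA */
    desc.ScatterGather = TRUE;
    desc.InterfaceType = PCIBus;
    desc.Dma64BitAddresses = TRUE;
    desc.MaximumLength = length;

    /* Ask the HAL/bus driver for this device's DMA adapter object. */
    *adapterOut = IoGetDmaAdapter(pdo, &desc, &mapRegs);
    if (*adapterOut == NULL)
        return STATUS_INSUFFICIENT_RESOURCES;

    /* Common buffers are coherent between CPU and device by contract. */
    *vaOut = (*adapterOut)->DmaOperations->AllocateCommonBuffer(
        *adapterOut, length, busAddrOut, TRUE /* CacheEnabled */);
    return (*vaOut != NULL) ? STATUS_SUCCESS
                            : STATUS_INSUFFICIENT_RESOURCES;
}
```

Going through the DMA_ADAPTER rather than MmGetPhysicalAddress is precisely what keeps the driver correct on chipsets needing cache-coherence help and on systems with an I/O MMU.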

> -----Original Message-----
> From: Fab Tillier [mailto:ftillier at silverstorm.com]
> Sent: Wednesday, October 19, 2005 9:15 AM
> To: 'Jan Bottorff'; openib-windows at openib.org
> Subject: RE: [Openib-windows] Windows DMA model
> 
> 
> > From: Jan Bottorff [mailto:jbottorff at xsigo.com]
> > Sent: Tuesday, October 18, 2005 10:24 PM
> > 
> > Why oh why does my ctrl-v sometimes send my half written email
> > message... anyway...
> 
> Thanks for resending, and also for bringing up these issues.  I really
> appreciate the feedback.
> 
> > I asked Microsoft about just calling MmGetPhysicalAddress for DMA and
> > they responded:
> > 
> > =======================
> > To summarize, your drivers will break on chipsets that need extra cache
> > coherence help and on virtualized systems where there is an I/O MMU.
> > Neither of these is particularly common today, but they'll be much more
> > common in the near future.  The drivers will also break on non-x86
> > machines where DMA addresses don't equal CPU-relative physical addresses,
> > but those machines have become very uncommon in the last five years.
> 
> I wholeheartedly agree with this assessment.  Note that support for
> virtualization will require a whole lot of work - to support kernel bypass
> in a virtual machine, where the application in user-mode in the virtual
> machine has to bypass both the virtual machine kernel as well as the host's
> kernel.  It would be great to figure out how to do this in Windows.  I
> currently don't really have a clue, though.
> 
> I look forward to any input you might have as we try to find solutions to
> these current deficiencies.
> 
> > >I don't know if you saw my RFC emails about that API or not.
> > 
> > I didn't see that, as I'm a very new member of the list. It sounds like
> > you're saying the current low level interface API to things will be
> > changing in the future?
> 
> Yes, the interface between the access layer and the HCA HW driver will
> change at first, to be followed by the ULP to Access Layer interface.
> I'll be getting back to that soon, I hope, and will be sending out headers
> for comments.
> 
> > >Also, driver verifier will completely break DMA for user-mode as it
> > >forces double buffering to check that DMA mappings are used properly.
> > 
> > If you turn on DMA verification in driver verifier I believe it will
> > double buffer ALL correctly done DMA, to help find memory boundary
> > violations. This is also a check of the hardware.
> 
> That's correct.  Kernel clients should definitely do all DMA operations by
> the book.  The question is whether registrations (both user and kernel)
> should use the DMA mapping functionality, or just use the physical
> addresses from the MDL after it has been locked down.  The former will
> result in verifier breaking anything that uses registered memory, and the
> latter will result in broken DMA due to the assumption that CPU and bus
> addresses are consistent and cache coherent.  I have doubts that kernel
> bypass could even work without cache coherency, though.
> 
> For example, for internal ring buffers like those used for CQE and WQE
> rings, performing proper DMA mappings will break the hardware if verifier
> remaps these.  I suppose a way around that is to allocate those buffers one
> page at a time with AllocateCommonBuffer, build up an MDL with the
> underlying CPU physical pages using MmGetPhysicalAddress on the returned
> virtual address, remap it to a contiguous virtual memory region using
> MmMapLockedPagesSpecifyCache, and then use the bus physical addresses
> originally returned by AllocateCommonBuffer to program the HCA.  I don't
> know if this sequence would work properly, and it still doesn't solve the
> issue of an application registering its buffers.
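That proposed sequence could be sketched as follows (a rough, untested illustration of the steps named above; the function name is hypothetical and, as the message itself says, it is not clear this behaves under verifier):

```c
#include <wdm.h>

/* Sketch: allocate ring pages one at a time with AllocateCommonBuffer,
 * keep the per-page bus addresses for programming the HCA, and stitch
 * the pages into one contiguous kernel VA range via a hand-built MDL.
 * Error paths trimmed for brevity. */
NTSTATUS AllocRing(PDMA_ADAPTER adapter, ULONG nPages,
                   PHYSICAL_ADDRESS *busAddrs, /* out: per page, for HCA */
                   PVOID *ringVaOut, PMDL *mdlOut)
{
    PMDL mdl = IoAllocateMdl(NULL, nPages * PAGE_SIZE, FALSE, FALSE, NULL);
    PPFN_NUMBER pfns = MmGetMdlPfnArray(mdl);
    ULONG i;

    for (i = 0; i < nPages; i++) {
        PVOID va = adapter->DmaOperations->AllocateCommonBuffer(
            adapter, PAGE_SIZE, &busAddrs[i], TRUE);
        /* Record the CPU physical page backing this common buffer. */
        pfns[i] = (PFN_NUMBER)
            (MmGetPhysicalAddress(va).QuadPart >> PAGE_SHIFT);
    }
    mdl->MdlFlags |= MDL_PAGES_LOCKED;

    /* Remap the scattered pages as one contiguous kernel VA range. */
    *ringVaOut = MmMapLockedPagesSpecifyCache(
        mdl, KernelMode, MmCached, NULL, FALSE, NormalPagePriority);
    *mdlOut = mdl;
    return (*ringVaOut != NULL) ? STATUS_SUCCESS
                                : STATUS_INSUFFICIENT_RESOURCES;
}
```

The key point is that the CPU physical pages go into the MDL only to get a contiguous mapping, while the bus addresses from AllocateCommonBuffer are what the HCA is actually given.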
> 
> > Passing tests with driver verifier active will be required to obtain any
> > kind of WHQL driver certification. Commercial users will absolutely need
> > these certifications.
> 
> Agreed, but any WHQL certification requires Microsoft to define a WHQL
> certification process for InfiniBand devices.  That said, even without an
> official IB WHQL program, the WHQL tests are a valuable test tool, as is
> verifier.  Once Microsoft has a program for IB, I expect they'll have
> thought of how to handle kernel bypass and DMA mappings for memory
> registrations.
> 
> > >There is some work that needs to happen to get MAD traffic to do proper
> > >DMA mappings, but upper level protocols already do the right thing.
> > 
> > I can understand how higher level packet traffic from a kernel mode NDIS
> > based driver or buffers from a STORPORT based storage driver can have
> > the correct mapping already. It also sounds like other kernel drivers
> > that use the IBAL interface currently aren't assured DMA is correct.
> 
> It's up to the client to do DMA mappings since the client posts work
> requests to their queue pairs.  The problem is that it's not clear for
> which device to get a DMA_ADAPTER, and the new interface will make that
> much clearer.
> 
> The only part where IBAL is deficient with respect to DMA mappings is for
> MADs.  Anything else is the responsibility of the client.  There's no clean
> way to make IBAL know exactly how to perform DMA mappings for all users
> automatically.
> 
> > >For the time being, since we're running on platforms where the CPU and
> > >bus addresses are consistent, it hasn't been an issue.
> > 
> > Microsoft seems to say it's more than just an address mapping issue;
> > it's also a cache coherency issue. I'm not surprised that it's desirable
> > to get software to help with cache coherency as the number of processor
> > cores grows, especially on AMD processor systems with essentially a NUMA
> > architecture.
> 
> Aren't the AMD processors cache coherent, even in their NUMA architecture?
>  
> How do you solve cache coherency issues without getting rid of kernel
> bypass?  Making calls to the kernel to flush the CPU or DMA controller
> buffers for every user-mode I/O is going to take away the benefits of
> doing kernel bypass in the first place.  That's not to say we won't come
> to this conclusion, I'm just throwing the questions out there.  I'm not
> expecting you to have the answers - they're just questions that I don't
> know how to answer, and I appreciate the discussion.
> 
> - Fab
> 
> _______________________________________________
> openib-windows mailing list
> openib-windows at openib.org
> http://openib.org/mailman/listinfo/openib-windows
> 