[ofw] winverbs ND provider

Tzachi Dar tzachid at mellanox.co.il
Thu Jan 14 08:38:56 PST 2010


 

> -----Original Message-----
> From: Fab Tillier [mailto:ftillier at microsoft.com] 
> Sent: Tuesday, January 12, 2010 8:29 PM
> To: Tzachi Dar; Sean Hefty; ofw_list
> Subject: RE: [ofw] winverbs ND provider
> 
> Hi Tzachi
> 
> Tzachi Dar wrote on Tue, 12 Jan 2010 at 09:47:23
> 
> > In the design that you have used, there is a working thread of the
> > program, then there is a thread on the completion queue. This means
> > that after the kernel has completed a request, the completion queue
> > thread wakes up, sets the event, and then the next thread wakes up.
> 
> I'm not sure I follow - there are threads in the WinVerbs ND provider?

My mistake, I was looking at the rdma cm code.
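
Roughly, the double wake-up I had in mind looks like the sketch below.
The names are made up and this is not the actual rdma cm or winverbs
code: a dedicated thread waits for the kernel to complete the overlapped
request and then sets a second event that the application thread is
blocked on, so every completion pays for two wake-ups.

/* Sketch only: two wake-ups per completion. All names are hypothetical. */
#include <windows.h>
#include <stdio.h>

static HANDLE g_app_event;              /* the application thread blocks here */

static DWORD WINAPI cq_thread(LPVOID arg)
{
    HANDLE kernel_event = (HANDLE) arg;

    /* First wake-up: the kernel completes the overlapped request and
     * signals the event stored in the OVERLAPPED structure.            */
    WaitForSingleObject(kernel_event, INFINITE);

    /* Second wake-up: hand the completion off to the application thread. */
    SetEvent(g_app_event);
    return 0;
}

int main(void)
{
    /* Stands in for OVERLAPPED.hEvent; in real code the kernel signals it. */
    HANDLE kernel_event = CreateEvent(NULL, FALSE, FALSE, NULL);
    HANDLE t;

    g_app_event = CreateEvent(NULL, FALSE, FALSE, NULL);
    t = CreateThread(NULL, 0, cq_thread, kernel_event, 0, NULL);

    SetEvent(kernel_event);                     /* pretend the request completed */
    WaitForSingleObject(g_app_event, INFINITE); /* application wakes up last     */
    printf("completion delivered after two wake-ups\n");

    WaitForSingleObject(t, INFINITE);
    return 0;
}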

> > Please also note that the use of overlapped operations creates an
> > IRP in the kernel, which is also not that efficient. I guess that
> > all this can save about 10us per operation, and I would like to see
> > the number of connects that you make to understand if this is
> > really the problem.
> 
> This is no different than the IBAL provider, though.

For the IBAL ND provider you are correct. For the normal IBAL verbs this
is not the case: the good old IBAL managed to arm the CQ without going
from user mode to kernel mode.
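
To make this concrete, the user-mode arm path looks roughly like the
sketch below. The doorbell layout and field names are invented here (the
real layout is HCA specific), but the point is that the provider only
writes to a doorbell page the kernel driver mapped into the process, so
no DeviceIoControl and no IRP are involved:

/* Sketch: arming a CQ entirely from user space via a memory-mapped
 * doorbell. Field names are hypothetical; real layouts are HCA specific. */
#include <stdint.h>

struct cq_doorbell {
    volatile uint32_t consumer_index;  /* last CQE the application has polled  */
    volatile uint32_t arm;             /* ask for an interrupt on the next CQE */
};

/* 'db' points into a doorbell page the kernel driver mapped into the process. */
void arm_cq_user_mode(struct cq_doorbell *db, uint32_t ci)
{
    db->consumer_index = ci;  /* tell the HCA how far we have polled        */
    db->arm = 1;              /* the HCA will interrupt on the next CQE;    */
                              /* no kernel transition happens on this path  */
}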

> 
> >> If anyone is aware of other performance differences between the
> >> two providers, please let me know as soon as possible.  I will
> >> continue to stress the code for the next couple of days before
> >> committing the changes to the trunk.
> >> 
> > There is one issue that I have noticed. This is very likely not
> > related to the ndconn test, but it still bothers me: there are a
> > few operations that bypass the OS, for example send and receive but
> > also notify. In the winverbs design this is not the case; that is,
> > Notify does go to the kernel. See CWVCompletionQueue::Notify.
> 
> Notify must go to the kernel - it's an async operation.  It 
> goes to the kernel with the IBAL ND provider too.
> 
> > I guess that there are two reasons why this problem is not that big:
> > 1) performance-oriented tests don't arm the CQ, they poll.
> > 2) if arm_cq was called, then when there is a new event the kernel
> > will have to be involved.
> > I understand that it will be very hard to change that, but I guess
> > that in the long run we will have to think of something, since IBAL
> > was able to eliminate that.
> 
> How was IBAL able to eliminate kernel transitions for CQ 
> notifications?

One of the mechanisms that IBAL used was events shared between user mode
and the kernel. When the kernel has to signal something, it simply
signals the event, which is much more efficient. On the other hand, it
is not as general as the overlapped mechanism.
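
The kernel half of that mechanism looks roughly like this (a sketch; the
registration path and the context fields are hypothetical): the
application hands its event handle down once, the driver references it
with ObReferenceObjectByHandle, and from then on a notification is just
a KeSetEvent instead of an IRP completion.

/* Sketch only: signaling a user-created event from the kernel.
 * The context structure and registration path are hypothetical. */
#include <ntddk.h>

typedef struct _CQ_CONTEXT {
    PKEVENT UserEvent;    /* referenced kernel object behind the user's handle */
} CQ_CONTEXT;

/* Called once, at PASSIVE_LEVEL, from the ioctl that registers the event. */
NTSTATUS RegisterUserEvent(CQ_CONTEXT *Cq, HANDLE UserEventHandle)
{
    return ObReferenceObjectByHandle(UserEventHandle,
                                     EVENT_MODIFY_STATE,
                                     *ExEventObjectType,
                                     UserMode,
                                     (PVOID *) &Cq->UserEvent,
                                     NULL);
}

/* Called when a completion arrives (e.g. from the DPC): no IRP involved. */
VOID SignalUser(CQ_CONTEXT *Cq)
{
    KeSetEvent(Cq->UserEvent, IO_NO_INCREMENT, FALSE);
}

The user side simply waits on the same event with WaitForSingleObject,
so the per-notification cost is one KeSetEvent and one wake-up.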

> 
> > There is another issue that one should think of: using fast
> > dispatch routines instead of the normal dispatch routines. This
> > technique dramatically decreases the time it takes to move from
> > user to kernel. Since the nd_conn test moves from user to kernel
> > many times, it will improve things.
> 
> The IBAL ND provider doesn't do this, though, so the perf 
> difference can't be related to fast dispatch.  Further, the 
> operations that are performed by the async operations 
> (Connect, CompleteConnect, Accept, Disconnect) are all 
> asynchronous.  The latter three perform QP state 
> modifications, which could thus never be handled in a fast 
> dispatch routine.  The only call that might benefit from a 
> fast dispatch handler is GetConnectionRequest, since there 
> may be a request there already.  I don't know if the fast 
> dispatch routines are beneficial when using I/O completion 
> ports, though.
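
For reference, wiring up a fast dispatch (fast I/O) device-control
handler looks roughly like the sketch below; the handler body is
hypothetical. The I/O manager calls the fast routine directly on the
caller's thread, and only falls back to building an IRP and calling the
normal IRP_MJ_DEVICE_CONTROL dispatch routine if the fast routine
returns FALSE:

/* Sketch: registering FastIoDeviceControl in DriverEntry. */
#include <ntddk.h>

static FAST_IO_DISPATCH g_fast_io;

static BOOLEAN FastDeviceControl(PFILE_OBJECT FileObject, BOOLEAN Wait,
    PVOID InputBuffer, ULONG InputBufferLength,
    PVOID OutputBuffer, ULONG OutputBufferLength,
    ULONG IoControlCode, PIO_STATUS_BLOCK IoStatus,
    PDEVICE_OBJECT DeviceObject)
{
    UNREFERENCED_PARAMETER(FileObject);
    UNREFERENCED_PARAMETER(Wait);
    UNREFERENCED_PARAMETER(InputBuffer);
    UNREFERENCED_PARAMETER(InputBufferLength);
    UNREFERENCED_PARAMETER(OutputBuffer);
    UNREFERENCED_PARAMETER(OutputBufferLength);
    UNREFERENCED_PARAMETER(IoControlCode);
    UNREFERENCED_PARAMETER(IoStatus);
    UNREFERENCED_PARAMETER(DeviceObject);

    /* A real handler would recognize cheap, non-blocking IOCTLs here,
     * fill in IoStatus, and return TRUE so no IRP is ever built. This
     * sketch handles nothing and always falls back to the IRP path.   */
    return FALSE;
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);

    RtlZeroMemory(&g_fast_io, sizeof(g_fast_io));
    g_fast_io.SizeOfFastIoDispatch = sizeof(FAST_IO_DISPATCH);
    g_fast_io.FastIoDeviceControl  = FastDeviceControl;
    DriverObject->FastIoDispatch   = &g_fast_io;

    /* Normal MajorFunction setup for the fallback IRP path goes here. */
    return STATUS_SUCCESS;
}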

It is possible to design a completely different mechanism for doing
operations between the kernel and the user. It should probably save a
few us for each operation. Since we are currently dealing with around
800us, I guess that there are many other issues to solve before we get
there.

Thanks
Tzachi


> 
> -Fab 
> 
> 
