[Openib-windows] Adding an environment variable for determining the initial CQ size

Tzachi Dar tzachid at mellanox.co.il
Mon Jul 17 14:56:27 PDT 2006


Hi Fab,

The reason I called it a leak is that currently, if an application
opens 500 connections and then closes them, 5 threads will be created,
and once all the sockets are destroyed there will be 5 idle threads
with no sockets attached. If that application then opens another 500
connections, 4 new threads will be created, and the numbers will keep
growing forever. So this is some kind of a leak, as I understand it.
It seems that the only way to solve this problem is to create a
mechanism that makes sure that no new CQ and CQ thread are created until
there is no room left in the existing CQs.

This requires putting the CQs in some list, and this problem will
probably have to be solved in any case at some point in the future.
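
To make the scenario concrete, here is a rough sketch of it as a toy model in Python. It is not the WSD provider code. It assumes that each CQ serves up to 100 sockets (which is what 500 connections giving 5 threads implies), that today only the most recently created CQ is checked for free room, and that empty CQs are never freed. All the names (Provider, open_socket, cq_count_after_rounds) are made up for the illustration.

```python
SOCKETS_PER_CQ = 100  # assumption: 500 connections -> 5 CQs, as in the example


class CQ:
    """One completion queue; in this model each CQ owns one thread."""

    def __init__(self):
        self.in_use = 0  # sockets currently attached to this CQ


class Provider:
    """Toy model of a provider that hands out CQ slots to sockets."""

    def __init__(self, reuse_any_cq):
        self.reuse_any_cq = reuse_any_cq
        self.cqs = []

    def open_socket(self):
        # Assumed current behaviour: only the newest CQ is considered.
        # Suggested behaviour: search the whole list for free room first.
        candidates = self.cqs if self.reuse_any_cq else self.cqs[-1:]
        for cq in candidates:
            if cq.in_use < SOCKETS_PER_CQ:
                cq.in_use += 1
                return cq
        cq = CQ()  # no room in any CQ we looked at: new CQ and new thread
        cq.in_use = 1
        self.cqs.append(cq)
        return cq

    def close_socket(self, cq):
        cq.in_use -= 1  # the CQ and its thread stay around, even when empty


def cq_count_after_rounds(reuse_any_cq, rounds, conns=500):
    """Open and then close `conns` sockets `rounds` times; return CQ count."""
    provider = Provider(reuse_any_cq)
    for _ in range(rounds):
        opened = [provider.open_socket() for _ in range(conns)]
        for cq in opened:
            provider.close_socket(cq)
    return len(provider.cqs)
```

Under these assumptions, cq_count_after_rounds(False, 2) reproduces the 5 + 4 threads described above, while cq_count_after_rounds(True, 2) stays at 5 no matter how many rounds are run.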

Currently, the user has no way to mitigate this problem at all. With
the fix that I suggested, you will be able to set the initial CQ size to
match the number of connections, and this does give the user some way to
work around it. Of course, once the CQs are resized as needed, that will
mitigate the problem better, but it will still not solve it.
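
Just to illustrate the idea of such a tunable, a minimal sketch in Python follows. The variable name, the default and the upper bound are all placeholders chosen for the example. They are not the names or values used in the actual patch.

```python
import os

ENV_VAR = "EXAMPLE_INITIAL_CQ_SIZE"  # hypothetical name, not from the patch
DEFAULT_CQ_SIZE = 100                # placeholder default, assumed here
MAX_CQ_SIZE = 65536                  # arbitrary sanity cap, also assumed


def initial_cq_size(environ=os.environ):
    """Return the initial CQ size, falling back to the default on bad input."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        return DEFAULT_CQ_SIZE       # variable not set: keep the default
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_CQ_SIZE       # not a number: ignore it
    if not 1 <= value <= MAX_CQ_SIZE:
        return DEFAULT_CQ_SIZE       # out of range: ignore it
    return value
```

Falling back to the default on any bad value means that a user who sets the variable wrongly gets the same behaviour as today, which keeps the risk small.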

In any case there is not much to argue about. I believe that we can
agree that the chances of a real user hitting this problem are small.
The fix is not really a fix. We can always compile a build for such a
user, and on the other hand the risk in this patch is very small.

Thanks
Tzachi 

> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] 
> On Behalf Of Fabian Tillier
> Sent: Monday, July 17, 2006 6:15 PM
> To: Tzachi Dar
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] Adding an environment variable 
> for determining the initial CQ size
> 
> Hi Tzachi,
> 
> On 7/13/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> > Fab,
> >
> > We do plan to add resize CQ support, but since it is quite a large
> > and risky patch we prefer not to do that before the coming release.
> > The WSD provider implementation will open another CQ in case there
> > is no more place available and will not free the CQ if it is empty,
> > so we may get to a leakage scenario. A larger CQ helps to reduce
> > this. There is no limitation for large clusters other than this
> > possible leakage.
> 
> It's not really a leak, as the CQ is tracked - it just means 
> a CQ is allocated, and its thread is in a wait state.  I want 
> to be careful about the language here as calling it a leak 
> makes it sound like a bug, whereas it is not.
> 
> > Once again, we will definitely add resize CQ to MTHCA, but I doubt
> > it will be ready for the coming release.
> 
> Right, understood.  I don't think we should delay the coming 
> release to add resize_cq support.
> 
> > Each CQE entry is 32B. This means that the memory allocated per CQ
> > is 32 * 512 * 32 = 512KB
> 
> Would you mind just changing the initial CQ size to 500, 
> without adding the environment variable?  This would reduce 
> the number of CQs and associated threads in large runs.
> 
> I don't really want to add a tunable for this as users are 
> unlikely to know how to properly use this variable, and it is 
> going to be short lived anyhow.
> 
> Thoughts?
> 
> - Fab
> 



