[openib-general] Question about QP's in timewait state and CM stale conn rejects

Or Gerlitz ogerlitz at voltaire.com
Sun Aug 20 04:30:17 PDT 2006


Sean Hefty wrote:
> Or Gerlitz wrote:

>> If you don't mind (also related to the patch you have sent Eric of 
>> randomizing the initial local cm id) to get into this deeper, can we do 

> There's an issue trying to randomize the initial local CM ID.  The way 
> the IDR works, if you start at a high value, then the IDR size grows up 
> to the size of the first value, which can result in memory allocation 
> failures.  In my tests, using a random value would frequently result in 
> connection failures because of low memory.

> My conclusion is that the local ID assignment in the IB CM needs to be 
> reworked, or we will run into a condition that after X number of 
> connections have been established, we will be unable to create any new 
> connections, even if the previous connections have all been destroyed.

How about, in the meantime (until this rework is designed and done), 
projecting the initial random local id into a small range, say 
[0-1020]? (1023 is not prime, 1023 = 3 * 11 * 31, so take the nearby 
prime 1021 as the modulus.) This way, with very good probability and 
with very little memory overhead, a client connect/reboot/"reconnect" 
sequence would work.
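
To make the suggestion concrete, here is a minimal sketch of the projection 
step. It is not taken from the IB CM code: the names CM_LOCAL_ID_RANGE and 
cm_project_initial_id are made up for illustration, and the modulus 1021 is 
only an assumed choice of a prime just below 1023.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bound (hypothetical name): 1021 is a prime just below 1023,
 * so projected ids always land in [0, 1020]. */
#define CM_LOCAL_ID_RANGE 1021u

/* Hypothetical helper: fold a 32-bit random value into the small range,
 * so the first allocated id stays low and the id table stays small. */
static uint32_t cm_project_initial_id(uint32_t random_value)
{
    uint32_t id = random_value % CM_LOCAL_ID_RANGE;

    /* The modulo guarantees the result is inside the range. */
    assert(id < CM_LOCAL_ID_RANGE);
    return id;
}
```

The caller would feed in whatever random source is already used to pick the 
initial id. The modulo introduces a very slight bias toward low values, 
which should not matter here, since the goal is only to make it unlikely 
that a rebooted client starts again from the same id.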

Or.




