[openib-general] Question about QP's in timewait state and CM stale conn rejects
Or Gerlitz
ogerlitz at voltaire.com
Sun Aug 20 04:30:17 PDT 2006
Sean Hefty wrote:
> Or Gerlitz wrote:
>> If you don't mind (also related to the patch you have sent Eric of
>> randomizing the initial local cm id) to get into this deeper, can we do
> There's an issue trying to randomize the initial local CM ID. The way
> the IDR works, if you start at a high value, then the IDR size grows up
> to the size of the first value, which can result in memory allocation
> failures. In my tests, using a random value would frequently result in
> connection failures because of low memory.
> My conclusion is that the local ID assignment in the IB CM needs to be
> reworked, or we will run into a condition that after X number of
> connections have been established, we will be unable to create any new
> connections, even if the previous connections have all been destroyed.
How about, for the meantime (until this rework is designed and done),
projecting the initial random local ID into a small range, say
[0-1020], by reducing it modulo a prime? 1023 is not prime
(3 * 11 * 31), but 1021 is. This way, with very good probability and
very little memory overhead, a client connect/reboot/"reconnect" would
work.
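
A minimal sketch of that projection, assuming the random value is
already available as a 32-bit integer. The names CM_INITIAL_ID_RANGE and
cm_project_initial_id are hypothetical and not taken from the IB CM
code; the modulus 1021 is chosen because it is the largest prime below
1023.

```c
#include <stdint.h>

/* Hypothetical constant: 1021 is the largest prime below 1023. */
#define CM_INITIAL_ID_RANGE 1021u

/*
 * Map an arbitrary random 32-bit value into [0, CM_INITIAL_ID_RANGE - 1].
 * Keeping the starting ID small bounds how far the IDR has to grow
 * before the first ID can be allocated.
 */
uint32_t cm_project_initial_id(uint32_t random_value)
{
	return random_value % CM_INITIAL_ID_RANGE;
}
```

The result would then serve only as the starting point for local ID
allocation; two boots still pick the same start with probability of
about 1/1021.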
Or.