[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Arlin Davis
ardavis at ichips.intel.com
Thu Nov 2 15:17:15 PST 2006
Michael S. Tsirkin wrote:
>>Another great option would be to have path record caching. Unfortunately
>>OFED 1.1 did not include ib_local_sa in the release.
>
>This won't help you much.
>With 256 nodes, all-to-all already gives you 65000 requests,
>which is the same order of magnitude as the reported 130000.
>
Am I missing something here? 65,000 requests every 15 minutes (the current
default) for the entire cluster, versus 100,000-130,000 every time I start
an application, is a big help, especially on a very large cluster that is
batching up smaller independent jobs sharing a single SA and fabric. We
need either caching or SA capabilities that can scale up with large
clusters. A single service running at 6000 requests/second will not succeed.
-arlin.
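
The comparison above can be checked with some quick arithmetic. This is only
a rough sketch using the figures quoted in this thread. It assumes that
"all to all" means one path record query per ordered (source, destination)
pair, that a cache refresh repeats the full all-to-all set once per
15-minute interval, and that 6000 requests/second is the rate at which a
single SA services queries. The function and constant names are made up for
illustration.

```python
# Back-of-the-envelope SA query load, using figures quoted in the thread.
# Assumption: "all to all" = one query per ordered (source, destination) pair.


def all_to_all_queries(nodes: int) -> int:
    """Queries needed if every node resolves a path to every other node."""
    return nodes * (nodes - 1)


def average_rate(requests: int, interval_seconds: float) -> float:
    """Average requests/second when the load is spread over an interval."""
    return requests / interval_seconds


def drain_time(requests: int, sa_rate: float) -> float:
    """Seconds for an SA servicing sa_rate requests/second to clear a burst."""
    return requests / sa_rate


NODES = 256                        # cluster size from the quoted reply
CACHE_REFRESH_SECONDS = 15 * 60    # "every 15 minutes (current default)"
REQUESTS_PER_APP_START = 130_000   # upper end of the reported range
SA_RATE = 6000.0                   # requests/second figure from the post

if __name__ == "__main__":
    cached = all_to_all_queries(NODES)
    print("queries per cache refresh:", cached)
    print("average rate with caching: %.1f req/s"
          % average_rate(cached, CACHE_REFRESH_SECONDS))
    print("time to drain one app start: %.1f s"
          % drain_time(REQUESTS_PER_APP_START, SA_RATE))
```

Under those assumptions, a cache refresh costs 65,280 queries per 15
minutes, or roughly 73 requests/second averaged over the interval, while a
single application start with 130,000 requests keeps the SA busy for about
22 seconds, and that cost is paid again for every job launched.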