[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

Arlin Davis ardavis at ichips.intel.com
Thu Nov 2 15:17:15 PST 2006


Michael S. Tsirkin wrote:

>>Another great option would be to have path record caching. Unfortunately 
>>OFED 1.1 did not include ib_local_sa in the release.
>>
>
>This won't help you much.
>With 256 nodes all to all already gives you 65000 requests
>which is the same order of magnitude as the reported 130000.
>
Am I missing something here? 65,000 requests every 15 minutes (the current
default) for the entire cluster, versus 100,000-130,000 every time I start
an application, is a big help, especially on a very large cluster that is
batching up smaller independent jobs sharing a single SA and fabric. We
need either caching or SA capabilities that can scale up with large
clusters. A single service running at 6000 requests/second will not succeed.
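
To make the comparison concrete, here is a rough back-of-the-envelope sketch of the figures quoted in this thread. It assumes one path record query per ordered (source, destination) pair, which is one way to arrive at the quoted 65,000 for 256 nodes. The function and constant names are illustrative only and are not part of any OFED or uDAPL API.

```python
# Back-of-the-envelope numbers for the figures quoted in this thread.
# Assumption: one path record query per ordered (source, destination) pair.

def all_to_all_queries(nodes: int) -> int:
    """Upper bound on path record queries for an all-to-all pattern."""
    return nodes * nodes

def seconds_to_serve(queries: int, rate_per_second: float) -> float:
    """Time for a single SA to answer `queries` at a fixed service rate."""
    return queries / rate_per_second

NODES = 256                      # cluster size used in the quoted reply
SA_RATE = 6000                   # requests/second, figure from this message
CACHE_REFRESH_SECONDS = 15 * 60  # 15-minute default mentioned above

per_refresh = all_to_all_queries(NODES)
print("queries per cache refresh:", per_refresh)  # 65536

# Without caching: every application start issues its own burst.
print("seconds to serve a 130000-query burst:",
      round(seconds_to_serve(130_000, SA_RATE), 1))

# With caching: the same order of magnitude, spread over the refresh interval.
print("average queries/second with caching:",
      round(per_refresh / CACHE_REFRESH_SECONDS, 1))
```

Under these assumptions the cached case averages well under 100 queries per second across the whole cluster, while each uncached application start occupies the SA for tens of seconds on its own.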

-arlin.



