[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

Sean Hefty mshefty at ichips.intel.com
Thu Nov 2 16:22:38 PST 2006


Michael S. Tsirkin wrote:
> This won't help you much.
> With 256 nodes all to all already gives you 65000 requests
> which is the same order of magnitude as the reported 130000.

A cache for 256 nodes generates only 256 requests, because each request is a
"get table" for a given sgid.  The all-to-all connection model generates n^2
requests because each request is a get for a given sgid/dgid pair.  Additionally,
cached requests can be issued when the application isn't running, with a fairly
long or infinite update time.
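
To make the arithmetic concrete, here is a rough sketch of the two request
counts.  The function names are made up for illustration and are not part of
any SA or uDAPL API; the only assumption is one request per sgid in the cached
case and one request per ordered sgid/dgid pair in the all-to-all case.

```python
def cached_requests(n_nodes: int) -> int:
    """Cached model (assumed): one 'get table' request per sgid."""
    return n_nodes


def all_to_all_requests(n_nodes: int) -> int:
    """All-to-all model (assumed): one get per ordered sgid/dgid pair."""
    return n_nodes * n_nodes


if __name__ == "__main__":
    n = 256
    print("cached:    ", cached_requests(n))
    print("all-to-all:", all_to_all_requests(n))
```

For n = 256 the all-to-all count is 256 * 256 = 65536, which is the roughly
65000 figure quoted above, while the cached count stays at 256.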

Arlin and I have discussed some caching options, including having multiple cache
service daemons running on the subnet.  If more than one service is running, a
node can select a particular service to communicate with.  Communication can be
done using RC to reduce the MAD overhead.

- Sean



