[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

Sean Hefty mshefty at ichips.intel.com
Thu Nov 9 07:54:17 PST 2006


Or Gerlitz wrote:
> for(j=i+1; j<n; j++)
>     dat_ep_connect(ep[j], ip-address of peer j)
> 
> 
> and then
> 
> while(there are more non established connections)
>    dat_evd_wait(...)

I'm not overly familiar with the MPI code, so I can't comment on the 
implementation.

> OK, i recall some patch or rfc you have posted which enables a response 
> on original request match a "pending retry", basically it means that all 
> the retries use the TID of the original request, correct? am i dreaming 
> so this is indeed somewhere in the pipe to the kernel?

I have a patch that exposes the MAD layer retry count up through the SA query 
code.  However, I'm not sure that it helps us all that much without additional 
changes.  Detecting duplicate requests is left to the receiver, and retries 
are issued using a linear timeout.
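
To make the two points above concrete, here is a minimal sketch in C. It is
not the MAD layer or SA query code: every name in it (linear_send_time_ms,
linear_give_up_ms, struct dup_filter, dup_filter_seen, MAX_PENDING) is
hypothetical. It assumes that "linear timeout" means every attempt waits the
same fixed interval, and that a retry carries the same transaction ID (TID)
as the original request, which is what lets the receiver recognize it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: time (in ms, relative to the first send) at which attempt
 * number `attempt` goes out, where attempt 0 is the original request.
 * With a linear (fixed) timeout the attempts are evenly spaced. */
static unsigned long linear_send_time_ms(unsigned int attempt,
                                         unsigned long timeout_ms)
{
    return (unsigned long)attempt * timeout_ms;
}

/* Hypothetical: time after which the sender gives up, i.e. the original
 * send plus `retries` retries, each waiting timeout_ms for a response. */
static unsigned long linear_give_up_ms(unsigned int retries,
                                       unsigned long timeout_ms)
{
    return ((unsigned long)retries + 1) * timeout_ms;
}

#define MAX_PENDING 64 /* hypothetical table size */

/* Hypothetical receiver-side record of TIDs already being handled. */
struct dup_filter {
    uint64_t tid[MAX_PENDING];
    size_t count;
};

/* Returns true if `tid` was seen before (so this request is a retry and
 * can be dropped); otherwise records it and returns false. If the table
 * is full, a new TID is not recorded and is still reported as new. */
static bool dup_filter_seen(struct dup_filter *f, uint64_t tid)
{
    for (size_t i = 0; i < f->count; i++)
        if (f->tid[i] == tid)
            return true;
    if (f->count < MAX_PENDING)
        f->tid[f->count++] = tid;
    return false;
}
```

Under these assumptions, a sender with a 1000 ms timeout and 3 retries sends
at 0, 1000, 2000 and 3000 ms and gives up at 4000 ms, while a receiver that
is still working on the original request sees up to three more copies with
the same TID and has to filter them itself.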

- Sean



