[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Sean Hefty
mshefty at ichips.intel.com
Thu Nov 2 11:13:43 PST 2006
> We had an option to increase the RQ size for QP1 and QP0.
> This might help you too: try increasing IB_MAD_QP_RECV_SIZE.
Actually, dropping the requests helps scalability.
If nothing gets dropped, the backlog of queued requests grows to hundreds of
thousands, most of which will have timed out before the SA can get around to
processing them.
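
To illustrate the point about the backlog, here is a rough toy model in Python. It is only a sketch, not a model of the real SA: the function name, the tick-based timing, and all of the rates are made-up assumptions. It counts how many served requests were still within their timeout when the queue is unbounded versus when it is capped and excess requests are dropped.

```python
from collections import deque


def simulate(arrivals_per_tick, served_per_tick, timeout_ticks, ticks, capacity=None):
    """Toy FIFO model (hypothetical). Returns (useful, wasted, dropped).

    useful:  requests served while still within the timeout
    wasted:  requests served after the sender had already timed out
    dropped: requests rejected because the queue was full
    """
    queue = deque()  # stores the arrival tick of each pending request
    useful = wasted = dropped = 0
    for now in range(ticks):
        # Requests arrive; with a capacity set, a full queue drops them.
        for _ in range(arrivals_per_tick):
            if capacity is not None and len(queue) >= capacity:
                dropped += 1
            else:
                queue.append(now)
        # The server works through the oldest requests first, without
        # knowing whether the sender has already given up on them.
        for _ in range(served_per_tick):
            if not queue:
                break
            arrived = queue.popleft()
            if now - arrived > timeout_ticks:
                wasted += 1
            else:
                useful += 1
    return useful, wasted, dropped


# Assumed load: 10 requests arrive per tick, 2 are served per tick,
# and senders give up after 5 ticks.
unbounded = simulate(10, 2, 5, 1000)
bounded = simulate(10, 2, 5, 1000, capacity=10)
print("unbounded (useful, wasted, dropped):", unbounded)
print("bounded   (useful, wasted, dropped):", bounded)
```

In this model the unbounded queue spends nearly all of its service time on requests that have already expired, while the capped queue only ever holds requests that are still fresh.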
One option is having the SA (or ib_umad?) return a busy status in response to a
MAD, but we'd still have to be able to send this response as quickly as requests
are being received. We could then limit the number of requests that would be
queued in the kernel for a user.
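
A minimal sketch of that idea, again in Python rather than kernel code. The class, method names, and status strings are all hypothetical and do not correspond to anything in ib_umad or the SA; the only point is that a per-user limit turns an unbounded backlog into an immediate busy reply.

```python
from collections import defaultdict, deque

# Hypothetical status values, for illustration only.
STATUS_QUEUED = "queued"
STATUS_BUSY = "busy"


class BoundedMadQueue:
    """Caps the number of requests held for each user (hypothetical sketch)."""

    def __init__(self, per_user_limit):
        self.per_user_limit = per_user_limit
        self.pending = defaultdict(deque)  # user -> queued requests

    def receive(self, user, request):
        """Queue the request, or report busy if the user's queue is full."""
        queue = self.pending[user]
        if len(queue) >= self.per_user_limit:
            return STATUS_BUSY  # caller would send the busy response here
        queue.append(request)
        return STATUS_QUEUED

    def next_request(self, user):
        """Hand the oldest queued request to the user, or None if empty."""
        queue = self.pending[user]
        return queue.popleft() if queue else None


q = BoundedMadQueue(per_user_limit=2)
print(q.receive("sa", "req-1"))  # accepted
print(q.receive("sa", "req-2"))  # accepted
print(q.receive("sa", "req-3"))  # over the limit, so busy
```

The check in receive() is deliberately cheap, since the busy reply has to go out as fast as requests come in.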
Unfortunately, when we are able to run on the cluster, modifying the kernel
modules isn't an option available to us...
- Sean