[openib-general] Re: CMA backlog
Michael S. Tsirkin
mst at mellanox.co.il
Tue May 30 11:34:54 PDT 2006
Quoting r. Sean Hefty <sean.hefty at intel.com>:
> Subject: RE: CMA backlog
>
> I think that there are some issues that would need to be worked out, but in
> general I'm in favor of trying to do something here.
>
> >Currently, this is not something that can be implemented by ULP on top of
> >CMA, because returning error from REQ will result in reject rather than REQ
> >drop.
>
> A generic ULP could handle this by making use of the private data, and retrying
> requests after a REJ with insufficient resources.
>
> >CMA already has backlog parameter in listen but it is ignored as far as I can
> >see. I propose extending cma API with the following options:
>
> The backlog applies more for iWarp and userspace. I couldn't find a usable way
> to make use of backlog in the kernel, since it uses a callback model.
>
> >rdma_backlog_added - connection was added to backlog queue
> >rdma_backlog_removed - connection was removed from backlog queue
>
> *ponders*
>
> >Internally, CMA will count the # of connections in backlog.
> >If REQ arrives and this number exceeds the backlog given in listen,
> >CMA will drop the REQ, without creating the new CMA ID.
>
> Incrementing the number of pending connections on a listen is easy.
> Decrementing it is more difficult, since a listen request can be destroyed after
> a connection request is received, but before it is responded to. This is
> difficult to handle, especially for userspace clients.
That is why, in my opinion, this should be up to the ULP to handle,
calling rdma_backlog_added/rdma_backlog_removed as appropriate.
Existing ULPs that don't call rdma_backlog_added will simply
get all requests.
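
To make the accounting concrete, here is a rough sketch of the counting I have in mind. It is a standalone model, not CMA code: struct listen_model and the model_* functions are made-up stand-ins for the listen ID and for the proposed rdma_backlog_added/rdma_backlog_removed calls, and "exceeds" is taken literally as pending > backlog.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a listening ID; not the real struct rdma_cm_id. */
struct listen_model {
    int backlog;   /* limit given in listen */
    int pending;   /* connections the ULP reported as sitting in its backlog */
};

/* Stand-in for the proposed rdma_backlog_added(): ULP queued a connection. */
static void model_backlog_added(struct listen_model *l)
{
    l->pending++;
}

/* Stand-in for the proposed rdma_backlog_removed(): ULP dequeued a connection. */
static void model_backlog_removed(struct listen_model *l)
{
    assert(l->pending > 0);   /* removed must pair with an earlier added */
    l->pending--;
}

/* True if an arriving REQ would be delivered to the ULP, false if it
 * would be dropped because the count exceeds the backlog. */
static bool model_req_delivered(const struct listen_model *l)
{
    return l->pending <= l->backlog;
}
```

A ULP that never calls the added function keeps pending at 0, so in this model it sees every REQ, whatever backlog it passed to listen.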
> Additionally, the CMA can't just drop the REQ. The REQ has been received by the
> IB CM, which is expecting a response. You would need to push backlog into the
> IB CM, which requires defining what it means at that level. From the
> perspective of the IB CM, sending a REJ with "No resources available" (reject
> code 3) seems to make more sense than simply discarding the MAD.
This approach would affect all ULPs, however. For example, no SDP implementation
that I know of retries after a REJ - so this approach won't be interoperable.
And AFAIK the SDP spec already interprets a reject as connection refused.
There's no provision I can see in the SDP spec for retries on a specific
reject code.
Simply dropping the REQ seems a nice approach, since the client retries REQ MADs anyway.
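
The difference between the two behaviors can be shown with a toy model. Everything below is hypothetical (the enum, struct server_model, and the retry count are illustration only, not IB CM code): the client resends the REQ on timeout but gives up on any REJ, which is how I read the SDP behavior above.

```c
#include <stdbool.h>

/* Hypothetical outcome of one REQ as seen by the client. */
enum req_result { REQ_ACCEPTED, REQ_REJECTED, REQ_TIMED_OUT };

/* Hypothetical server: busy for the first busy_attempts REQs. While busy
 * it either drops the REQ (client sees a timeout) or sends a REJ. */
struct server_model {
    int busy_attempts;
    bool drop_when_busy;
};

static enum req_result server_handle_req(struct server_model *s)
{
    if (s->busy_attempts > 0) {
        s->busy_attempts--;
        return s->drop_when_busy ? REQ_TIMED_OUT : REQ_REJECTED;
    }
    return REQ_ACCEPTED;
}

/* Client that resends on timeout but treats any REJ as connection
 * refused. Returns true if the connection was established. */
static bool client_connect(struct server_model *s, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        switch (server_handle_req(s)) {
        case REQ_ACCEPTED:
            return true;
        case REQ_REJECTED:
            return false;      /* no retry after a reject */
        case REQ_TIMED_OUT:
            break;             /* resend the REQ */
        }
    }
    return false;              /* retries exhausted */
}
```

With a server that is busy for two attempts, the dropping variant connects on the third REQ, while the rejecting variant fails on the first.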
> One possible fix is to remove sending a reject on destruction of a cm_id. I'm
> not sure what effect this would have on other code or the overall protocol
> though.
Yes, that was my thinking. To avoid touching all users, maybe the simplest way
is to make ib_cm discard the new cm_id without reject if the client callback
returned -ENOMEM?
If you consider that in an out-of-memory situation sending a reject will also
likely fail, this might be a good idea regardless.
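
Roughly, the dispatch logic would look like the sketch below. This is a standalone model under my own assumptions, not the actual ib_cm code: model_dispatch_req, struct dispatch_result and the handler type are invented for illustration, and only the -ENOMEM special case reflects the proposal.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical record of what the CM model did with a new cm_id. */
struct dispatch_result {
    bool id_destroyed;
    bool rej_sent;
};

/* Hypothetical ULP callback type: 0 keeps the cm_id, nonzero destroys it. */
typedef int (*req_handler_t)(void *context);

static struct dispatch_result model_dispatch_req(req_handler_t handler,
                                                 void *context)
{
    struct dispatch_result r = { false, false };
    int ret = handler(context);

    if (ret == 0)
        return r;              /* ULP kept the new cm_id */

    r.id_destroyed = true;
    if (ret != -ENOMEM)
        r.rej_sent = true;     /* current behavior: reject on destroy */
    /* On -ENOMEM the cm_id is discarded silently, so the client's
     * REQ retry gets another chance later. */
    return r;
}

/* Example handlers covering the three cases. */
static int handler_ok(void *c)    { (void)c; return 0; }
static int handler_nomem(void *c) { (void)c; return -ENOMEM; }
static int handler_inval(void *c) { (void)c; return -EINVAL; }
```

Existing ULPs that return other error codes keep today's reject behavior in this model; only a callback returning -ENOMEM gets the silent drop.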
Sounds good?
--
MST