[Fwd: [ofa-general] More responder_resources problems]

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Tue Apr 22 21:21:15 PDT 2008


On Tue, Apr 22, 2008 at 08:58:21PM -0700, Sean Hefty wrote:
 
> >But the whole point of this process is to get a working connection -
> >the responder resources are not a ULP visible item, they are just
> >something that must be negotiated and configured into the QP. In
> >truth, I can think of no reason for a ULP to use any value other than
> >the device maximum or 0 for these resources. Saying that if the
> >passive side messes up it will just die when the QP is modified is,
> >IMHO, not good enough.
> 
> For the IB CM, the policy controlling the use of those fields is given to the
> ULP.  A check could be added to ib_send_cm_rep to fail if the ULP tries to use a
> value higher than that in the REQ.  I would not have the CM automatically
> replace the user's values with its own.

Well, what if we just made this simpler for the ULP? The kernel, when
it receives a REQ, will modify the values as it swaps them so they do
not exceed the device maximum. The ULP can then further modify them if
it wants, but does not have to do anything more than copy them into
the REP to get correct function. This seems to handle the ULPs I have
looked at.
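
To make the swap-and-limit idea concrete, here is a minimal sketch in
plain C. Everything in it (the struct names, the field names, and
swap_and_clamp itself) is a hypothetical stand-in for illustration,
not the kernel CM's actual code or data structures. It assumes the
passive side's responder resources come from the REQ's initiator
depth and vice versa, each capped at the local device limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the local device limits (not a kernel struct). */
struct dev_limits {
    uint8_t max_responder_resources; /* incoming RDMA READ/atomics serviceable per QP */
    uint8_t max_initiator_depth;     /* outstanding RDMA READ/atomics issuable per QP */
};

/* Values as the active side placed them in the REQ (hypothetical layout). */
struct req_values {
    uint8_t initiator_depth;
    uint8_t responder_resources;
};

/* Values handed to the passive-side ULP, already safe to copy into the REP. */
struct conn_values {
    uint8_t initiator_depth;
    uint8_t responder_resources;
};

static uint8_t min_u8(uint8_t a, uint8_t b)
{
    return a < b ? a : b;
}

/*
 * Swap the REQ values into the passive side's point of view and cap
 * each one at what the local device can support.
 */
static struct conn_values swap_and_clamp(const struct req_values *req,
                                         const struct dev_limits *dev)
{
    struct conn_values local;

    assert(req && dev);

    /* The peer's initiator depth is what we must be able to respond to. */
    local.responder_resources =
        min_u8(req->initiator_depth, dev->max_responder_resources);

    /* The peer's responder resources bound what we may initiate. */
    local.initiator_depth =
        min_u8(req->responder_resources, dev->max_initiator_depth);

    return local;
}
```

With a REQ carrying initiator depth 128 and responder resources 4, and
a local device with the same limits, both resulting values come out as
4, so a ULP that copies them straight into the REP stays within the
device limits.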

> For the RDMA CM, there's no guarantee that the initiator_depth and
> responder_resources are available in the connection request.  With iWarp, the
> values are not available unless embedded somewhere in the private data.

I am told that iWarp does not have this concept. The iWarp protocol
does not require a limit on the number of un-acked RDMA READs/Atomics in
flight. Only IB does, so ignoring the values entirely on iWarp seems
fine to me.

> >Where? cma.c never programs max_rd_atomic in the qp.
> 
> rdma_accept() takes the responder_resources and initiator_depth as part of its
> input parameter.  These are passed to the CM and end up being used when
> getting the modify QP attributes.

Hmmmmm, so that goes into the kernel cm_format_req_event, which saves
it for cm_init_qp_rts_attr to later recover. Gotcha.

It is unfortunate that the RTS transition cannot set both
initiator_depth and responder_resources; it makes this awkward in the
ULP.

> >Well, what I have been interested in (Hal - what is your interest
> >here?) is to use the device maximum and get rid of the hard coded
> >values for responder resources and initiator depth in the ULPs. This
> >would be to allow some devices to have higher responder resources,
> >based on hardware capability. Limited responder resources cause huge
> >performance problems on high latency connections.
> 
> To make it easier on the active side, we could allow the user to specify some
> 'MAX_RDMA' value that either the rdma cm or ib cm can key off of.  The cm could
> then request initiator_depth and responder_resources based on the local HW
> maximums.  The passive side could also specify MAX_RDMA, which for IB would
> negotiate down to the values in the REQ and the local HW resources.

Just setting the value to maximum in the REQ is not enough without the
passive side limiting it to the device capabilities. That is where I
started: it is easy to query the device and get the maximum, but just
putting those values in the REQ causes one side to try to use more
responder resources than it has. (Initiator depth is 128 and responder
resources are 4 on my test HCAs here.)

I do think that a MAX_RDMA value for the rdmacm especially is a pretty
good idea. The rdmacm is already holding onto the device attributes
structure. It could also automatically limit the value based on the
sendq length.
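
A rough sketch of how such a MAX_RDMA request could be resolved, in
plain C. The sentinel value, the hw_caps struct, and both resolve_*
helpers are hypothetical names made up for illustration; nothing like
this exists in the rdmacm today. It also assumes that the send queue
length bounds the initiator depth only, which is my reading rather
than something settled in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "use as much as the hardware allows". */
#define MAX_RDMA 0xFF

/* Hypothetical stand-in for the relevant device attributes. */
struct hw_caps {
    uint8_t max_init_depth; /* outstanding RDMA READ/atomics this device can issue per QP */
    uint8_t max_resp_res;   /* incoming RDMA READ/atomics this device can service per QP */
};

/*
 * Resolve the initiator depth: never more than the device limit, and
 * never more than the number of send queue entries.
 */
static uint8_t resolve_initiator_depth(uint8_t requested,
                                       const struct hw_caps *hw,
                                       uint32_t send_queue_len)
{
    uint32_t limit;

    assert(hw);

    limit = hw->max_init_depth;
    if (send_queue_len < limit)
        limit = send_queue_len;

    /* MAX_RDMA and any over-large request both collapse to the limit. */
    if (requested == MAX_RDMA || requested > limit)
        return (uint8_t)limit;
    return requested;
}

/* Resolve the responder resources against the device limit only. */
static uint8_t resolve_responder_resources(uint8_t requested,
                                           const struct hw_caps *hw)
{
    assert(hw);

    if (requested == MAX_RDMA || requested > hw->max_resp_res)
        return hw->max_resp_res;
    return requested;
}
```

A ULP that does not care would pass MAX_RDMA for both values, while a
ULP that asks for something smaller still gets what it asked for.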

Jason


