[ofa-general] Directions for verbs API extensions
Roland Dreier
rdreier at cisco.com
Mon Apr 7 14:21:54 PDT 2008
> > There are a few discrepancies between the iWARP and IB verbs that we
> > need to decide on how we want to handle:
> >
> > - In IB-BMME, L_Keys and R_Keys are split up so that there is an
> > 8-bit "key" that is owned by the consumer. As far as I know, there
> > is no analogous concept defined for iWARP STags; is there any point
> > in supporting this IB-only feature (which is optional even in the
> > IB spec)?
> In fact there is an 8b key for stags as well. The stag is composed of
> a 3B index allocated by the driver/hw, and a 1B key specified by the
> consumer. None of this is exposed in the linux rdma interface at this
> point and cxgb3 always sets the key to 0xff.
Oops, I completely missed that in the iWARP verbs spec. Yes, the IB and
iWARP verbs agree on the semantics here, so the only issue is that the
"key" portion of L_Keys/R_Keys is only supported by IB devices that do
BMME. So we can expose this in the API without too much trouble.
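
To make the split concrete, here is a rough sketch of how a 32-bit L_Key/R_Key or STag could be composed from a 3-byte device-owned index and a 1-byte consumer-owned key. The helper names are made up for illustration and are not part of any existing API, and placing the key in the low-order byte is an assumption of this sketch rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; none of these names come from a real header. */

#define KEY_BITS 8u     /* consumer-owned key width (assumed low byte) */
#define KEY_MASK 0xffu

/* Combine a 24-bit driver/hw index with an 8-bit consumer key. */
static inline uint32_t make_rkey(uint32_t index, uint8_t key)
{
    assert(index <= 0xffffffu);   /* index must fit in 3 bytes */
    return (index << KEY_BITS) | key;
}

/* Extract the device-owned index portion. */
static inline uint32_t rkey_index(uint32_t rkey)
{
    return rkey >> KEY_BITS;
}

/* Extract the consumer-owned key portion. */
static inline uint8_t rkey_key(uint32_t rkey)
{
    return (uint8_t)(rkey & KEY_MASK);
}

/* Replace only the consumer key, leaving the index untouched. */
static inline uint32_t rkey_set_key(uint32_t rkey, uint8_t key)
{
    return (rkey & ~KEY_MASK) | key;
}
```

With this layout a consumer could change the key byte each time a region is reused, so that stale remote keys no longer match, while the index stays whatever the device allocated.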
> The chelsio driver supports the iwarp bind_mw SQ WR via the current
> API. In fact the current API implies that this call is actually a SQ
> operation anyway:
> > /**
> > * ib_bind_mw - Posts a work request to the send queue of the specified
> > * QP, which binds the memory window to the given address range and
> > * remote access attributes.
>
> How is the current bind_mw API not valid or correct for iwarp MWs?
> Other than being a different call than ib_post_send()?
That's the only issue. The main impact is that you can't submit an MW
bind as part of a list of send WRs. I guess it's not too severe an
issue. I don't have any strong feelings here, except that eliminating
the separate bind_mw call might be a little cleaner. On the other hand
it adds more conditional branches to post_send so maybe it's a net loss.
> > - iWARP supports "RDMA read with invalidate" send work requests,
> > while IB has no such operation. This makes sense because iWARP
> > requires the buffer used to receive RDMA read responses to have
> > remote write permission, while IB has no such requirement. I don't
> > see a really clean way to handle this except to say that apps have
> > to have "if (IB) do_this(); else /* iWARP */ do_that();" code to
> > use this in a portable way.
> Or a transport independent app can always use 2 WRs, read +
> inv-local-stag/fenced instead of read-inv-local-stag.
Except that fenced local invalidate is optional on IB ;)
But as I said I think we can assume that IB devices that support local
invalidate support fencing it.
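
A minimal sketch of the two options, using made-up types rather than the real verbs structures: a single "read with invalidate" WR where the device has one, or else an RDMA read chained to a fenced local invalidate. The opcode names, the fence flag, and the `has_read_with_inv` parameter are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and flag, for illustration only. */
enum wr_opcode { WR_RDMA_READ, WR_RDMA_READ_WITH_INV, WR_LOCAL_INV };
#define WR_FLAG_FENCE 0x1u

/* Hypothetical send work request, chained through 'next'. */
struct send_wr {
    struct send_wr *next;
    enum wr_opcode  opcode;
    unsigned int    flags;
    uint32_t        invalidate_key;  /* local key/STag to invalidate */
};

/*
 * Fill wrs[] with a chain that reads and then invalidates local_key.
 * Returns the number of WRs used (1 or 2).
 */
static int build_read_then_invalidate(struct send_wr wrs[2],
                                      uint32_t local_key,
                                      int has_read_with_inv)
{
    assert(wrs != NULL);

    if (has_read_with_inv) {
        /* One WR does both the read and the invalidate. */
        wrs[0] = (struct send_wr){ .next = NULL,
                                   .opcode = WR_RDMA_READ_WITH_INV,
                                   .flags = 0,
                                   .invalidate_key = local_key };
        return 1;
    }

    /* Two WRs: the fence holds the invalidate until the read completes. */
    wrs[0] = (struct send_wr){ .next = &wrs[1],
                               .opcode = WR_RDMA_READ,
                               .flags = 0,
                               .invalidate_key = 0 };
    wrs[1] = (struct send_wr){ .next = NULL,
                               .opcode = WR_LOCAL_INV,
                               .flags = WR_FLAG_FENCE,
                               .invalidate_key = local_key };
    return 2;
}
```

A transport-independent consumer would always pass 0 for `has_read_with_inv` and rely on the device supporting a fenced local invalidate.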
> > - Zero-based virtual addresses for memory regions. This is mandatory
> > for iWARP and optional for IB (and is not required even for BMME).
> > I think the simplest thing to do is just to have yet another
> > capability bit to say whether a device supports ZBVA or not; all
> > iWARP devices can set it.
> Currently, nobody is using this nor the block mode feature. I don't
> think we should bother supporting them unless someone has an app in
> mind that will utilize them.
I agree that block mode seems dubious. I believe that iSER on iWARP
requires ZBVA though.
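
For what it's worth, the capability-bit approach might look roughly like the following from a consumer's point of view. The flag names and the helper are hypothetical, and the only behavior assumed is that a zero-based region is addressed from offset 0 instead of from its virtual address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; not taken from any real header. */
#define DEV_CAP_LOCAL_INV (1u << 0)
#define DEV_CAP_ZBVA      (1u << 1)

static inline int dev_has_cap(uint32_t caps, uint32_t bit)
{
    return (caps & bit) != 0;
}

/*
 * Pick the base address to advertise for a memory region.
 * Returns 0 on success, or -1 if ZBVA was requested but the
 * device does not advertise it.
 */
static int mr_advertised_base(uint32_t caps, uint64_t va,
                              int want_zbva, uint64_t *base)
{
    assert(base != NULL);

    if (want_zbva) {
        if (!dev_has_cap(caps, DEV_CAP_ZBVA))
            return -1;
        *base = 0;      /* zero-based: addresses are offsets into the MR */
        return 0;
    }

    *base = va;         /* ordinary MR: addresses are virtual addresses */
    return 0;
}
```

Every iWARP device would set the ZBVA bit, so a consumer that needs zero-based regions only has to handle the failure case on IB devices that leave it clear.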
- R.