[openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes
Steve Wise
swise at opengridcomputing.com
Fri Feb 9 07:25:45 PST 2007
> I understand, I did not get that.
>
> But for example create_read_req_cqe builds it in software.
> It could build ib_wc instead.
>
Reads are handled in a slightly different manner. This is due to the
fact that the T3 HW can complete a read out of order. For example:
POST READ
POST WRITE
The post read trigger the HW to send an RDMA_READ_REQUEST. Immediately
after that the HW can (and will) send the RDMA_WRITE. Once the peer TCP
ACKs the WRITE, the HW will post a CQE for the WRITE. That completion
might happen before the peer sends back the READ_RESPONSE. Since the
RDMAC verbs spec sez WRs must be completed in order, the T3 driver has
to deal with this. (and its painful :)
In addition, I have to maintain other state about a read. 1) the
consumer wr_id. For non reads, the wr_id is actually reflected back by
the HW from the WQE to the CQE. For reads, this doesn't happen. 2) the
CQE for a read completion doesn't contain the original length. I need
to pull that from the associated original WQE.
So all this means the driver needs to construct a proper read cqe from
several parts. That's why it creates it locally on the stack.
BUT:
You're right though: All WQEs get copied out of the HWCQ and into an
on-stack variable in iwch_poll_cq_one(). Removing this, however,
requires rethinking all the READ logic which assumes the WQE is copied
out of the HWCQ. Can cannot make this change right now because of
stability concerns (it took me long enough to understand how to
correctly handle the read case as it stands :-)
> ...
>
> > > Having to wade through 3 driver-specific layers of abstractions just because I want to
> > > for example change API in ib_verbs.h and need to update all drivers will be
> > > very taxing. I understand your design calls for 2 layers, but at least the API exposed
> > > by code in drivers/net is fairly small, while cxio_wr.h declares 27 structures
> > > which seem to just duplicate ib_verbs.h.
> >
> > cxio_wr.h is hw format. You want me to change ib_verbs.h to make WRs
> > and CQEs align with Chelsio hardware?
>
> No, but it need not be part of interface. The reason I was confused is because
> you seem to create an extra copy e.g. for t3_cqe. cxio_poll_cq currently
> creates an intermediate copy of the completion on the stack, I think it could
> format ib_wc directly instead.
>
I'll log this as a performance optimization that we can do later.
Thanks for helping review this stuff!!
Steve.
More information about the general
mailing list