[openib-general] Re: [PATCH] [SRP] support for it_iu length negotiation

Fab Tillier ftillier at silverstorm.com
Tue Nov 1 20:22:14 PST 2005


> From: Kenneth L Jeffries [mailto:kenjeffries at austin.rr.com]
> Sent: Tuesday, November 01, 2005 4:45 PM
> 
> > My objections are the following (as I said in my previous mail):
> >  - I don't like allocating a 1 KB IU for every send IU, since most of
> >    that memory will probably never be used.
> >  - I'm not convinced that it's _ever_ a win to have the target do
> >    another RDMA to fetch the indirect buffer list.  You need to
> >    convince me that it's not better to simply tell the upper layers
> >    what the limit on s/g list length is to fit in the current IU size.
> 
> I also don't want to allocate 1KB IU's. If IU's were fixed size, I'd want
> (probably, depending on performance testing) a fixed size of 350 bytes
> (from Fab Tillier's 64KB i/o, 4KB pages, Windows) or possibly even
> the minimum DDBD (as Fab Tillier also says).  1KB IU's with thousands
> of RC's causes me a lot of wasted space heartburn.

Even 350 bytes is a burden - imagine a target that supports a queue depth of
1000 I/Os from a few dozen initiators.  Ideally, I'd like to see us use just
DDBDs and the 64-byte IU, along with registering the data buffers on a per-I/O
basis, either via FMR or regular MRs.
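
To put rough numbers on that burden, here is a minimal sketch of the
receive-buffer memory a target would have to set aside.  It assumes one
pre-posted IU buffer per queue slot per initiator, and it uses 36 initiators
as a stand-in for "a few dozen"; both are illustrative assumptions, not
measurements.

```python
def target_iu_memory(iu_bytes, queue_depth, initiators):
    """Bytes of IU receive buffers a target keeps posted.

    Assumes (hypothetically) one buffer per queue slot per initiator.
    """
    return iu_bytes * queue_depth * initiators


if __name__ == "__main__":
    QUEUE_DEPTH = 1000   # from the example above
    INITIATORS = 36      # assumed value for "a few dozen"
    for iu_bytes in (64, 350, 1024):
        total = target_iu_memory(iu_bytes, QUEUE_DEPTH, INITIATORS)
        print(f"{iu_bytes:5d}-byte IU: {total / 1e6:.1f} MB")
```

Under these assumptions the 64-byte IU costs about 2.3 MB, the 350-byte IU
about 12.6 MB, and the 1 KB IU about 36.9 MB.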

> [as an aside, it sure would be nice if we could do an SRP-3 (since SRP-2
> is dead) where multiple direct descriptors would be allowed. The only
> way to get multiple descriptors now is with indirect descriptors.]

That saves you 20 bytes - not a huge gain.

> I am pretty sure that someone doing a video server might want to do, say,
> 1MB i/o's. 1MB with 4KB pages means 256 descriptors and an iu of
> something over 4096 bytes. I definitely don't want to be told by the srp
> initiator that I need to use 4KB iu's. (So we agree there.)

For large I/O, registering the buffer and sending a DDBD with a single
descriptor might well provide the best performance.  If you look at the
traffic on the wire, having the target do multiple page-sized RDMA operations is
far less efficient than creating a region that is virtually contiguous (from
the target's point of view) and that a single RDMA operation can service.
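
The sizes being discussed can be sketched as follows.  The constants are my
reading of the SRP layout (a 48-byte fixed SRP_CMD part, 16-byte memory
descriptors, and an indirect header made of one table descriptor plus a 4-byte
total length) and should be treated as assumptions.  The sketch also assumes
one descriptor per page and a page-aligned buffer.

```python
SRP_CMD_HEADER = 48                 # assumed fixed part of SRP_CMD
DIRECT_DESC = 16                    # assumed: 8-byte address, 4-byte key, 4-byte length
INDIRECT_HEADER = DIRECT_DESC + 4   # assumed: table descriptor + total length


def descriptor_count(io_bytes, page_bytes):
    """Descriptors needed when every page gets its own entry (ceiling division)."""
    return -(-io_bytes // page_bytes)


def iu_size_indirect(io_bytes, page_bytes):
    """IU size when the per-page descriptor list is carried inside the IU."""
    n = descriptor_count(io_bytes, page_bytes)
    return SRP_CMD_HEADER + INDIRECT_HEADER + DIRECT_DESC * n


def iu_size_direct():
    """IU size with a single direct descriptor over a registered region."""
    return SRP_CMD_HEADER + DIRECT_DESC


if __name__ == "__main__":
    for io_bytes in (64 * 1024, 1024 * 1024):
        print(io_bytes,
              descriptor_count(io_bytes, 4096),
              iu_size_indirect(io_bytes, 4096),
              iu_size_direct())
```

With these assumptions a 1 MB I/O over 4 KB pages needs 256 descriptors and a
4164-byte IU, while the registered case stays at 64 bytes and gives the target
one region to cover with a single RDMA.  The 20-byte difference between the
indirect header and nothing at all is the saving mentioned earlier for
multiple direct descriptors.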

- Fab