[openib-general] Re: [PATCH] [SRP] support for it_iu length negotiation

Wed Nov 2 04:49:09 PST 2005

From: "Fab Tillier" <ftillier at silverstorm.com>
Sent: Tuesday, November 01, 2005 10:22 PM

>Even 350 bytes is a burden - imagine a target that supports a queue depth of
>1000 I/Os from a few dozen initiators.  Ideally, I'd like to see us use just
>DDBDs and the 64-byte IU, along with registering the data buffers on a per-I/O
>basis, either via FMR or regular MRs.

Wouldn't registering an MR per i/o kill performance? Right now, I believe,
the srp initiator registers all memory as one region.
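
A rough sketch of the buffer arithmetic behind the quoted concern. It assumes
(this is not stated in the thread) that the target posts one receive buffer of
the maximum iu size for every queue slot of every initiator, and it takes "a
few dozen" to mean 36. The function name is made up for illustration.

```python
def target_iu_buffer_bytes(iu_len, queue_depth, initiators):
    """Receive-buffer bytes if every queue slot holds one max-size IU."""
    return iu_len * queue_depth * initiators

QUEUE_DEPTH = 1000  # queue depth from the quoted message
INITIATORS = 36     # assumption: "a few dozen" initiators

# 350-byte IUs versus the 64-byte IU mentioned in the quoted message.
big = target_iu_buffer_bytes(350, QUEUE_DEPTH, INITIATORS)
small = target_iu_buffer_bytes(64, QUEUE_DEPTH, INITIATORS)
print(big, small)
```

Under those assumptions the 350-byte case needs roughly 12.6 MB of iu buffers
against roughly 2.3 MB for 64-byte iu's.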

>> [as an aside, it sure would be nice if we could do an SRP-3 (since SRP-2
>> is dead) where multiple direct descriptors would be allowed. The only
>> way to get multiple descriptors now is with indirect descriptors.]

>That saves you 20 bytes - not a huge gain.

Yes, but I wasn't clear. Allowing multiple direct descriptors would make it
reasonable for a target not to implement indirect descriptors at all. Presently, target
implementers may be tempted to implement indirect descriptors only partially,
processing the partial descriptor list but not the actual
indirect list. There is an argument that making iu's really big will
eliminate real indirect descriptors (that is, indirect descriptors beyond the
partial list delivered in the iu) and make a complete implementation (i.e., fetching
the rest of the list) of indirect descriptors unnecessary.
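
To make the partial-list point concrete, here is a minimal sketch of how many
descriptors ride in the iu and how many a target would still have to fetch
from the indirect table. The byte sizes (48-byte SRP_CMD fixed part, 20-byte
indirect header, 16-byte descriptor) are my reading of the SRP layout and
should be treated as assumptions; the function name and the example iu limits
are hypothetical.

```python
SRP_CMD_FIXED_LEN = 48    # assumed: SRP_CMD with no additional CDB or descriptors
INDIRECT_HEADER_LEN = 20  # assumed: 16-byte table descriptor + 4-byte total length
DESC_LEN = 16             # assumed: one memory descriptor

def split_descriptor_list(n_desc, max_iu_len):
    """Return (descriptors carried in the IU, descriptors left to fetch)."""
    room = max_iu_len - SRP_CMD_FIXED_LEN - INDIRECT_HEADER_LEN
    if room < 0:
        raise ValueError("IU too small for an indirect descriptor header")
    in_iu = min(n_desc, room // DESC_LEN)
    return in_iu, n_desc - in_iu

print(split_descriptor_list(256, 512))   # small IU: most of the list stays behind
print(split_descriptor_list(256, 8192))  # big IU: the whole list fits
```

A target that only processes the partial list handles the second case but not
the first, which is the temptation described above.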

>> I am pretty sure that someone doing a video server might want to do, say,
>> 1MB i/o's. 1MB with 4KB pages means 256 descriptors and an iu of
>> something over 4096 bytes. I definitely don't want to be told by the srp
>> initiator that I need to use 4KB iu's. (So we agree there.)
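
The arithmetic in the quoted paragraph can be checked with a short sketch. The
sizes used (48-byte SRP_CMD fixed part, 16-byte direct descriptor, 20 bytes of
indirect header) are assumptions taken from my reading of the SRP layout; they
do line up with the 64-byte iu and the 20-byte saving mentioned elsewhere in
this thread. Function names are illustrative only.

```python
SRP_CMD_FIXED_LEN = 48    # assumed: SRP_CMD with no additional CDB or descriptors
DIRECT_DESC_LEN = 16      # assumed: 8-byte address + 4-byte key + 4-byte length
INDIRECT_HEADER_LEN = 20  # assumed: table descriptor (16) + total length (4)

def descriptors_for(io_bytes, page_bytes):
    """One descriptor per page, rounding a partial page up."""
    return -(-io_bytes // page_bytes)

def direct_iu_len():
    """SRP_CMD carrying a single direct descriptor."""
    return SRP_CMD_FIXED_LEN + DIRECT_DESC_LEN

def indirect_iu_len(n_desc):
    """SRP_CMD carrying an indirect header plus n_desc descriptors."""
    return SRP_CMD_FIXED_LEN + INDIRECT_HEADER_LEN + n_desc * DIRECT_DESC_LEN

n = descriptors_for(1024 * 1024, 4096)
print(n, indirect_iu_len(n), direct_iu_len())
```

With these sizes a 1MB i/o on 4KB pages needs 256 descriptors and a
4164-byte iu, i.e. "something over 4096 bytes".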

>For large I/O, doing a registration of the buffer and sending a DDBD with a
>single descriptor might well provide the best performance.  If you look at the
>traffic on the wire, having the target do multiple page-sized RDMA operations is
>far less efficient than creating a virtual contiguous (to the target) region
>that a single RDMA operation can service.

Agreed. But I'm missing something (no doubt because I'm working on the embedded
target side, not the Linux side).  It looks like the srp initiator registers all of kernel
memory and does i/o from there. I'm not sure that an application can cause an 
arbitrarily large address-contiguous payload to appear on the wire. Probably I just
don't understand all of what is happening there.

Ken Jeffries