[openib-general] Location for iser-target code
Mike Christie
michaelc at cs.wisc.edu
Mon Apr 10 11:20:56 PDT 2006
Mike Christie wrote:
> Alexander Nezhinsky wrote:
>> Mike Christie wrote:
>>>>> tgt/stgt is probably two frameworks from your point of view. There
>>>>> is a kernel part for target LLDs to hook into. The kernel part is
>>>>> similar to scsi-ml; actually it builds on top of it, uses some of
>>>>> the scsi-ml functions, and provides shared code for tasks like
>>>>> creating scatter lists and mapping commands between the kernel and
>>>>> userspace. The target LLD basically handles lower level issues like
>>>>> DMAing the data, transport issues, etc, pretty much what a scsi-ml
>>>>> initiator driver does. For iscsi, the tgt LLD performs similar
>>>>> tasks as the initiator: it parses the iscsi PDUs or puts them on
>>>>> the interconnect and handles session and connection management
>>>>> (this would be done like open-iscsi though), but then passes the
>>>>> scsi command to tgt's kernel code.
>>>>>
>>>>> The other part of the framework is the userspace component. The tgt
>>>>> kernel component basically passes scsi commands and task management
>>>>> functions to a userspace daemon. The daemon contains the scsi state
>>>>> machine and executes the IO. When it is done it informs the kernel
>>>>> component, which in turn maps the data into the kernel, forms
>>>>> scatter lists, and then passes them to the target LLD to send out.
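To make that kernel-to-userspace flow concrete, the daemon side amounts
to a loop like the sketch below. Every struct and function name here is
made up for illustration; the real tgt messages and hooks differ.

/* Hypothetical sketch of the tgt daemon loop; none of these names are
 * the real tgt API. */
#include <stdint.h>

struct scsi_cmd_msg {                   /* command handed up by the kernel */
        uint64_t tag;                   /* lets the completion be matched */
        uint8_t  cdb[16];
        uint32_t data_len;
};

/* hypothetical transport/helper functions */
extern int recv_cmd_from_kernel(struct scsi_cmd_msg *msg);
extern int execute_scsi_cmd(struct scsi_cmd_msg *msg);  /* scsi state
                                                         * machine + the IO */
extern int send_done_to_kernel(uint64_t tag, int result);

static void daemon_loop(void)
{
        struct scsi_cmd_msg msg;

        for (;;) {
                if (recv_cmd_from_kernel(&msg) < 0)
                        break;
                /* run the scsi state machine and do the IO in userspace */
                int result = execute_scsi_cmd(&msg);
                /* the kernel side then maps the data, builds the scatter
                 * list, and hands it to the target LLD */
                send_done_to_kernel(msg.tag, result);
        }
}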
>> In the cited paper's abstract you wrote:
>> > In order to provide block I/O services, users have had to use
>> > modified kernel code, binary kernel modules, or specialized
>> > hardware. With Linux now having iSCSI, Fibre Channel, and RDMA
>> > initiator support, Linux target framework (tgt) aims to fill the
>> > gap in storage functionality by consolidating several storage
>> > target drivers...
>>
>> So I guess one of the added values (if not the main one) of
>> implementing the entire scsi command interface of tgt in userspace is
>> gaining easy access to block I/O drivers. But the block I/O subsystem
>> has a clear intra-kernel interface. If the kernel part of
>
> Which interface are you referring to? bio, REQ_PC or REQ_BLOCK_PC, or
> read/write so you can take advantage of the kernel cache?
>
>> tgt would anyway allocate memory and build the scatter-gather lists,
>> it could pass the commands along with the buffer descriptors down to
>> the storage stack, addressing either the appropriate block I/O driver
>> or scsi-ml itself. This extra code should be quite thin; it uses only
>> existing interfaces and makes no modifications to the existing kernel
>> code.
>
> Some of those options above require minor changes or you have to work
> around them in your own code.
>
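For concreteness, the in-kernel passthrough path those options refer to
looks roughly like this against a 2.6.16-era block layer. This is a
sketch, not tested code, and the INQUIRY CDB is just a placeholder:

/* Sketch: pushing a CDB through the block layer from inside the kernel,
 * roughly as the 2.6.16-era API looked. */
#include <linux/blkdev.h>
#include <linux/string.h>
#include <linux/errno.h>

static int send_inquiry(request_queue_t *q, struct gendisk *disk, void *buf)
{
        unsigned char cdb[6] = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY */
        struct request *rq = blk_get_request(q, READ, GFP_KERNEL);
        int err;

        rq->flags |= REQ_BLOCK_PC;        /* passthrough, not a fs request */
        memcpy(rq->cmd, cdb, sizeof(cdb));
        rq->cmd_len = sizeof(cdb);
        rq->timeout = 5 * HZ;
        blk_rq_map_kern(q, rq, buf, 96, GFP_KERNEL);  /* attach data buffer */
        blk_execute_rq(q, disk, rq, 0);   /* synchronous: waits for the IO */
        err = rq->errors ? -EIO : 0;
        blk_put_request(rq);
        return err;
}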
>> The user space code can do all the administration stuff, and
>> specifically choosing the right driver and passing to the kernel part
>> all necessary identification and configuration info about it.
>
>
> Actually we did this, and it was not acceptable to the scsi maintainer:
>
> For example we could send IO by:
>
> 1. we use the SG_IO kernel interface to do a passthrough type of
> interface.
>
> 2. read/write to the device from the kernel
>
> If you look at the different trees on that BerliOS site you will see
> different versions of this. And it ends up being the same amount of
> code. See below.
>
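For reference, option 1 driven from userspace is the familiar SG_IO
ioctl; a minimal INQUIRY through /dev/sg0 looks something like this
(error handling omitted):

/* Minimal SG_IO passthrough from userspace: INQUIRY to /dev/sg0. */
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
        unsigned char cdb[6] = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY */
        unsigned char buf[96], sense[32];
        struct sg_io_hdr hdr;
        int fd = open("/dev/sg0", O_RDWR);

        memset(&hdr, 0, sizeof(hdr));
        hdr.interface_id = 'S';
        hdr.dxfer_direction = SG_DXFER_FROM_DEV;  /* data in from device */
        hdr.cmdp = cdb;
        hdr.cmd_len = sizeof(cdb);
        hdr.dxferp = buf;
        hdr.dxfer_len = sizeof(buf);
        hdr.sbp = sense;
        hdr.mx_sb_len = sizeof(sense);
        hdr.timeout = 5000;                       /* milliseconds */
        return ioctl(fd, SG_IO, &hdr);            /* blocks until complete */
}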
>>
>> Are there other reasons for pushing SCSI commands from kernel to
>> userspace and performing them from there?
>>
>
> By pushing it to userspace you please the kernel reviewers, and there
> is not a major difference in performance (none that we have found
> yet). Also, when pushing it to userspace we use the same API that is
> used to execute SG_IO requests and push their data between the kernel
> and userspace, so it is not like we are creating something completely
> new; we are just hooking up some pieces. The major new part is the
> netlink interface, which is a couple hundred lines. Some of that
> interface is for management though.
>
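The netlink piece on the daemon side is just an ordinary netlink
socket. A rough sketch; the NETLINK_TGT protocol number below is a
placeholder, not the real value:

/* Sketch of the daemon's netlink setup. NETLINK_TGT is a placeholder
 * protocol number for illustration only. */
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#define NETLINK_TGT 19   /* placeholder */

int tgt_nl_open(void)
{
        struct sockaddr_nl addr;
        int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_TGT);

        memset(&addr, 0, sizeof(addr));
        addr.nl_family = AF_NETLINK;
        addr.nl_pid = getpid();   /* our address; the kernel sends here */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));
        return fd;   /* commands/completions flow as nlmsghdr-framed
                      * messages over sendmsg()/recvmsg() on this fd */
}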
My answer above is not really answering your question, though. Besides
the kernel reviewers, the problem with using REQ_PC or REQ_BLOCK_PC and
bios is that you cannot reuse the kernel's caching layer. By doing it in
userspace you can mmap the backing store, take advantage of the kernel's
caching code, and it is async. To do the same thing in the kernel you
have to either create a thread to do each read/write, hook into the
async read/write interface in the kernel (which may be nicer to do now,
but was not when we looked at it), or implement your own cache layer,
and I do not think that would be easy to merge.
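To illustrate the mmap point, a minimal sketch (no error handling): the
daemon maps the backing file and copies into the mapping, so the page
cache does the caching and the writeback happens asynchronously.

/* Sketch: serving a WRITE by copying into an mmap of the backing file.
 * The page cache buffers the data and flushes it in the background. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>

void serve_write(const char *path, off_t off, const void *data,
                 size_t len, size_t dev_size)
{
        int fd = open(path, O_RDWR);
        char *dev = mmap(NULL, dev_size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);

        memcpy(dev + off, data, len);  /* just dirties page cache pages */
        /* no write(2) and no thread per IO: the kernel writes the dirty
         * pages back asynchronously; msync(2) only if ordering matters */
        munmap(dev, dev_size);
        close(fd);
}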