[openib-general] Location for iser-target code
Mike Christie
michaelc at cs.wisc.edu
Mon Apr 10 11:20:56 PDT 2006
Mike Christie wrote:
> Alexander Nezhinsky wrote:
>> Mike Christie wrote:
>>>>> tgt/stgt is probably two frameworks from your point of view. There
>>>>> is a kernel part for target LLDs to hook into. The kernel part is
>>>>> similar to scsi-ml; actually it builds on top of it, uses some of
>>>>> the scsi-ml functions, and provides shared code for tasks like
>>>>> creating scatter lists and mapping commands between the kernel and
>>>>> userspace. The target LLD basically handles lower level issues like
>>>>> DMAing the data, transport issues, etc, pretty much what a scsi-ml
>>>>> initiator driver does. For iscsi, the tgt LLD performs similar
>>>>> tasks as the initiator: it parses the iscsi PDUs or puts them on
>>>>> the interconnect and handles session and connection management
>>>>> (this would be done like open-iscsi though), but then passes the
>>>>> scsi command to tgt's kernel code.
>>>>>
>>>>> The other part of the framework is the userspace component. The tgt
>>>>> kernel component basically passes scsi commands and task management
>>>>> functions to a userspace daemon. The daemon contains the scsi state
>>>>> machine and executes the IO. When it is done it informs the kernel
>>>>> component, which in turn maps the data into the kernel, forms
>>>>> scatter lists, and then passes them to the target LLD to send out.
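To make that kernel-to-userspace flow concrete, the daemon side amounts
to a loop like the sketch below. Every struct and function name here is
made up for illustration; the real tgt messages and hooks differ.

/* Hypothetical sketch of the tgt daemon loop; none of these names are
 * the real tgt API. */
#include <stdint.h>

struct scsi_cmd_msg {                   /* command handed up by the kernel */
        uint64_t tag;                   /* lets the completion be matched */
        uint8_t  cdb[16];
        uint32_t data_len;
};

/* hypothetical transport/helper functions */
extern int recv_cmd_from_kernel(struct scsi_cmd_msg *msg);
extern int execute_scsi_cmd(struct scsi_cmd_msg *msg);  /* scsi state
                                                         * machine + the IO */
extern int send_done_to_kernel(uint64_t tag, int result);

static void daemon_loop(void)
{
        struct scsi_cmd_msg msg;

        for (;;) {
                if (recv_cmd_from_kernel(&msg) < 0)
                        break;
                /* run the scsi state machine and do the IO in userspace */
                int result = execute_scsi_cmd(&msg);
                /* the kernel side then maps the data, builds the scatter
                 * list, and hands it to the target LLD */
                send_done_to_kernel(msg.tag, result);
        }
}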
>> In the cited paper's abstract you wrote:
>> > In order to provide block I/O services, users have had to use
>> > modified kernel code, binary kernel modules, or specialized
>> > hardware. With Linux now having iSCSI, Fibre Channel, and RDMA
>> > initiator support, Linux target framework (tgt) aims to fill the
>> > gap in storage functionality by consolidating several storage
>> > target drivers...
>>
>> So I guess one of the added values (if not the main one) of
>> implementing the entire scsi command interface of tgt in userspace is
>> gaining easy access to block I/O drivers. But the block I/O subsystem
>> has a clear intra-kernel interface. If the kernel part of
>
> Which interface are you referring to? bio, REQ_PC or REQ_BLOCK_PC, or
> read/write so you can take advantage of the kernel cache?
>
>> tgt would anyway allocate memory and build the scatter-gather lists,
>> it could pass the commands along with the buffer descriptors down to
>> the storage stack, addressing either the appropriate block I/O driver
>> or scsi-ml itself. This extra code should be quite thin; it uses only
>> existing interfaces and makes no modifications to the existing kernel
>> code.
>
> Some of those options above require minor changes or you have to work
> around them in your own code.
>
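For concreteness, the in-kernel passthrough path those options refer to
looks roughly like this against a 2.6.16-era block layer. This is a
sketch, not tested code, and the INQUIRY CDB is just a placeholder:

/* Sketch: pushing a CDB through the block layer from inside the kernel,
 * roughly as the 2.6.16-era API looked. */
#include <linux/blkdev.h>
#include <linux/string.h>
#include <linux/errno.h>

static int send_inquiry(request_queue_t *q, struct gendisk *disk, void *buf)
{
        unsigned char cdb[6] = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY */
        struct request *rq = blk_get_request(q, READ, GFP_KERNEL);
        int err;

        rq->flags |= REQ_BLOCK_PC;        /* passthrough, not a fs request */
        memcpy(rq->cmd, cdb, sizeof(cdb));
        rq->cmd_len = sizeof(cdb);
        rq->timeout = 5 * HZ;
        blk_rq_map_kern(q, rq, buf, 96, GFP_KERNEL);  /* attach data buffer */
        blk_execute_rq(q, disk, rq, 0);   /* synchronous: waits for the IO */
        err = rq->errors ? -EIO : 0;
        blk_put_request(rq);
        return err;
}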
>> The user space code can do all the administration stuff, and
>> specifically choosing the right driver and passing to the kernel part
>> all necessary identification and configuration info about it.
>
>
> Actually we did this, and it was not acceptable to the scsi maintainer:
>
> For example we could send IO by:
>
> 1. we use the SG_IO kernel interface to do a passthrough type of
> interface.
>
> 2. read/write to the device from the kernel
>
> If you look at the different trees on that BerliOS site you will see
> different versions of this. And it ends up being the same amount of
> code. See below.
>
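For reference, option 1 driven from userspace is the familiar SG_IO
ioctl; a minimal INQUIRY through /dev/sg0 looks something like this
(error handling omitted):

/* Minimal SG_IO passthrough from userspace: INQUIRY to /dev/sg0. */
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
        unsigned char cdb[6] = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY */
        unsigned char buf[96], sense[32];
        struct sg_io_hdr hdr;
        int fd = open("/dev/sg0", O_RDWR);

        memset(&hdr, 0, sizeof(hdr));
        hdr.interface_id = 'S';
        hdr.dxfer_direction = SG_DXFER_FROM_DEV;  /* data in from device */
        hdr.cmdp = cdb;
        hdr.cmd_len = sizeof(cdb);
        hdr.dxferp = buf;
        hdr.dxfer_len = sizeof(buf);
        hdr.sbp = sense;
        hdr.mx_sb_len = sizeof(sense);
        hdr.timeout = 5000;                       /* milliseconds */
        return ioctl(fd, SG_IO, &hdr);            /* blocks until complete */
}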
>>
>> Are there other reasons for pushing SCSI commands from kernel to
>> userspace and performing them from there?
>>
>
> By pushing it to userspace you please the kernel reviewers, and there
> is not a major difference in performance (none that we have found
> yet). Also, when pushing it to userspace we use the same API that is
> used to execute SG_IO requests and push their data between the kernel
> and userspace, so it is not like we are creating something completely
> new; we are just hooking up some pieces. The major new part is the
> netlink interface, which is a couple hundred lines. Some of that
> interface is for management though.
>
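The netlink piece on the daemon side is just an ordinary netlink
socket. A rough sketch; the NETLINK_TGT protocol number below is a
placeholder, not the real value:

/* Sketch of the daemon's netlink setup. NETLINK_TGT is a placeholder
 * protocol number for illustration only. */
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#define NETLINK_TGT 19   /* placeholder */

int tgt_nl_open(void)
{
        struct sockaddr_nl addr;
        int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_TGT);

        memset(&addr, 0, sizeof(addr));
        addr.nl_family = AF_NETLINK;
        addr.nl_pid = getpid();   /* our address; the kernel sends here */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));
        return fd;   /* commands/completions flow as nlmsghdr-framed
                      * messages over sendmsg()/recvmsg() on this fd */
}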
My answer above is not really answering your question, though. Besides
the kernel reviewers, the problem with using REQ_PC or REQ_BLOCK_PC and
bios is that you cannot reuse the kernel's caching layer. By doing it in
userspace you can mmap the backing store, take advantage of the kernel's
caching code, and it is async. To do the same thing in the kernel you
have to either create a thread to do each read/write, hook into the
async read/write interface in the kernel (which may be nicer to do now,
but was not when we looked at it), or implement your own cache layer,
and I do not think that would be easy to merge.
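To illustrate the mmap point, a minimal sketch (no error handling): the
daemon maps the backing file and copies into the mapping, so the page
cache does the caching and the writeback happens asynchronously.

/* Sketch: serving a WRITE by copying into an mmap of the backing file.
 * The page cache buffers the data and flushes it in the background. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>

void serve_write(const char *path, off_t off, const void *data,
                 size_t len, size_t dev_size)
{
        int fd = open(path, O_RDWR);
        char *dev = mmap(NULL, dev_size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);

        memcpy(dev + off, data, len);  /* just dirties page cache pages */
        /* no write(2) and no thread per IO: the kernel writes the dirty
         * pages back asynchronously; msync(2) only if ordering matters */
        munmap(dev, dev_size);
        close(fd);
}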