[openib-general] gen2/rnic-pi differences

Caitlin Bestler caitlinb at siliquent.com
Thu Jun 30 08:53:23 PDT 2005



This is a list of structural differences between OpenIB's
gen2 verbs and RNIC-PI that will remain even after gen2 is made
"transport neutral".

After these distinctions are understood, a decision should be made as to
whether the differences represent different objectives and should be
merely documented for the benefit of IHVs, or whether there should be a
migration to end that specific difference.

For example, it might be decided that a given RNIC-PI feature was
inherently related to other Operating Systems.

The response might be to merely note the difference, or to determine when
Linux could support such a feature.

Throughout this discussion the term "kVP" is used to reference the
model/provider specific code that executes in the kernel, while "uVP"
references model/provider specific code that executes in user space.



Memory Registration / Lookups

	gen2 translates virtual memory registrations
	to physical lists before the kVP is invoked.

	RNIC-PI expects the virtual memory registration
	request to be passed to the kVP untranslated. The
	kVP then makes a callback to obtain address
	translations and to pin memory. Mapping and
	pinning may be performed as separate steps,
	allowing mapping of Consumer pinned memory.

	gen2 does not currently support registration of
	shared memory regions.
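
	The two registration flows above can be sketched as follows.
	This is only an illustration under stated assumptions: every
	name here (gen2_style_register, rnicpi_style_register,
	struct os_callbacks, the fake translation) is hypothetical
	and is not taken from either interface.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_MAX_PAGES  16

/* Hypothetical physical page list handed to a provider. */
struct page_list {
    size_t   npages;
    uint64_t phys[DEMO_MAX_PAGES];
};

/* Stand-in for the OS translation service (fake, fixed offset). */
static uint64_t fake_virt_to_phys(uint64_t va)
{
    return (va & ~(DEMO_PAGE_SIZE - 1)) + 0x100000000ULL;
}

/* Translate [va, va+len) into one entry per page touched. */
static int build_page_list(uint64_t va, uint64_t len, struct page_list *pl)
{
    if (len == 0)
        return -1;
    uint64_t first = va >> DEMO_PAGE_SHIFT;
    uint64_t last  = (va + len - 1) >> DEMO_PAGE_SHIFT;
    size_t n = (size_t)(last - first + 1);
    if (n > DEMO_MAX_PAGES)
        return -1;
    for (size_t i = 0; i < n; i++)
        pl->phys[i] = fake_virt_to_phys(va + i * DEMO_PAGE_SIZE);
    pl->npages = n;
    return 0;
}

/* gen2 style: translation happens before the kVP is invoked,
 * so the provider only ever sees the physical list. */
static int kvp_reg_phys(const struct page_list *pl)
{
    return pl->npages > 0 ? 0 : -1;
}

static int gen2_style_register(uint64_t va, uint64_t len, struct page_list *pl)
{
    if (build_page_list(va, len, pl) != 0)
        return -1;
    return kvp_reg_phys(pl);
}

/* RNIC-PI style: the kVP receives the virtual range and calls
 * back for pinning and mapping as two separate steps. */
struct os_callbacks {
    int (*pin)(uint64_t va, uint64_t len);
    int (*translate)(uint64_t va, uint64_t len, struct page_list *out);
};

static int rnicpi_style_register(uint64_t va, uint64_t len, int consumer_pinned,
                                 const struct os_callbacks *cb,
                                 struct page_list *pl)
{
    /* Skip the pin step when the Consumer already pinned the memory. */
    if (!consumer_pinned && cb->pin(va, len) != 0)
        return -1;
    return cb->translate(va, len, pl);
}

/* Stub callbacks so the sketch can be exercised in user space. */
static int pin_calls;
static int stub_pin(uint64_t va, uint64_t len)
{
    (void)va; (void)len;
    pin_calls++;
    return 0;
}
static int stub_translate(uint64_t va, uint64_t len, struct page_list *out)
{
    return build_page_list(va, len, out);
}
```

	Both paths end with the same page list; the difference is
	only in which layer drives the translation, and in whether
	pinning can be skipped for Consumer pinned memory.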



Locking

	gen2 does not clearly state who is expected to be
	responsible for preventing concurrent data access.
	Fastpath operations must be callable
	from within an interrupt or while holding a spinlock.
	Slowpath operations are allowed to block.

	It is not clear if gen2 would allow suppression of
	locking when the caller has taken responsibility for
	serializing all object access. RNIC-PI leaves division
	of that responsibility to be worked out between the
	Consumer and the Access Layer (DAT/IT-API).

	RNIC-PI allows allocating slowpath operations to block,
	but does not allow non-allocating slowpath operations or
	fastpath operations to block or stall indefinitely. For
	the latter cases it must be legal for the caller to hold
	a spinlock over the call.

	By default, RNIC-PI places responsibility for serializing
	access to an object on the caller. RNIC-PI has tentatively
	decided to allow a second optional set of verbs where the
	verb layer will provide serializations.
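
	A minimal sketch of the "caller serializes by default, with
	an optional serializing variant" split, written as user-space
	C11 so it can be compiled anywhere. The queue layout and the
	function names are hypothetical; a kernel implementation
	would use its own spinlock primitives rather than atomic_flag.

```c
#include <stdatomic.h>

/* Hypothetical send queue with an embedded lock. */
struct sq {
    atomic_flag lock;
    unsigned head;
    unsigned depth;
    unsigned posted;
};

static void sq_init(struct sq *q, unsigned depth)
{
    atomic_flag_clear(&q->lock);
    q->head = 0;
    q->depth = depth;
    q->posted = 0;
}

/* Default flavour: the caller has already serialized access.
 * Never blocks; a full queue is reported as an error instead. */
static int post_send_unlocked(struct sq *q)
{
    if (q->posted == q->depth)
        return -1;
    q->head = (q->head + 1) % q->depth;
    q->posted++;
    return 0;
}

/* Optional flavour: the verb layer provides the serialization.
 * The critical section is short and bounded, so a caller may
 * still hold its own spinlock across this call. */
static int post_send_serialized(struct sq *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin */
    int rc = post_send_unlocked(q);
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
    return rc;
}
```

	A caller that already serializes all access to the queue
	would use the unlocked entry point and pay nothing for
	locking it does not need.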



User-Mode Handles

	gen2 never exports kernel pointers to user-mode, but rather
	registers all such pointers as handles using standard routines.
	All handles passed back in from user-mode are validated as
	a by-product of translation back to kernel mode pointers.

	RNIC-PI assigns that responsibility to the kAL (Kernel Access
	Layer) but only requires *validation* of handles. It does not
	explicitly address *translation* nor placing them in a central
	registry.
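
	To make the validation / translation distinction concrete,
	here is a small handle registry in which translating a
	handle back to a pointer validates it as a by-product. It is
	a sketch only: the table layout, the generation counter and
	all names are assumptions, not the routines gen2 actually
	uses.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8

/* One registry slot: the kernel object plus a generation count
 * that changes every time the slot is reused. */
struct slot {
    void    *obj;
    uint16_t gen;
};

static struct slot table[TABLE_SIZE];

/* Returns a handle of the form (generation << 16) | index.
 * 0 is never a valid handle because generations start at 1. */
static uint32_t handle_register(void *obj)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        if (table[i].obj == NULL) {
            table[i].obj = obj;
            if (++table[i].gen == 0)
                table[i].gen = 1; /* skip 0 on wrap-around */
            return ((uint32_t)table[i].gen << 16) | i;
        }
    }
    return 0; /* table full */
}

/* Translation doubles as validation: a forged, stale or
 * out-of-range handle yields NULL rather than a pointer. */
static void *handle_lookup(uint32_t h)
{
    uint32_t i = h & 0xffffu;
    uint16_t gen = (uint16_t)(h >> 16);

    if (i >= TABLE_SIZE || gen == 0)
        return NULL;
    if (table[i].obj == NULL || table[i].gen != gen)
        return NULL;
    return table[i].obj;
}

static void handle_release(uint32_t h)
{
    if (handle_lookup(h) != NULL)
        table[h & 0xffffu].obj = NULL;
}
```

	An access layer that only validates handles could keep the
	same checks but skip the central table, for example by
	checking a handle against a per-process list of objects.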



os_data / Identification of Consumer Objects

	gen2 provides minimal support for identification of RDMA 
	resources using consumer supplied handles. A user-supplied
	context is available in callbacks, but not in work completions.

	RNIC-PI provides a general "os_data" capability that allows
	each RNIC-PI object to have a consumer supplied alias that
	is used for all queries (including on other objects), callbacks
	and work completions.

	The RNIC-PI approach can eliminate the need for reverse indexes
	or per work request tracking data by the verbs consumer (such as
	the "DTO_COOKIE" in the reference DAPL implementation).

	This is of greatest concern when reaping a work completion in
	user mode, as there is no way to translate a qp_num to a
	QP object in user mode. It isn't that easy in kernel mode when
	dealing with multiple vendors, either.
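
	The cost being described can be sketched by reaping the same
	completion two ways. The structures below are illustrative
	assumptions (neither interface defines them in this form):
	one completion carries a consumer supplied alias, the other
	only a qp_num that the consumer must map back itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer-side endpoint object. */
struct consumer_ep {
    int      id;
    unsigned completions;
};

/* Completion that carries the consumer supplied alias. */
struct wc_with_alias {
    void    *os_data;
    uint64_t wr_id;
    int      status;
};

/* Completion that only identifies the QP by number. */
struct wc_with_qpn {
    uint32_t qp_num;
    uint64_t wr_id;
    int      status;
};

/* Reverse index the consumer has to maintain in the qp_num case. */
#define MAX_QPS 4
struct qpn_entry {
    uint32_t            qp_num;
    struct consumer_ep *ep;
};
static struct qpn_entry qpn_index[MAX_QPS];

static int index_add(uint32_t qp_num, struct consumer_ep *ep)
{
    for (size_t i = 0; i < MAX_QPS; i++) {
        if (qpn_index[i].ep == NULL) {
            qpn_index[i].qp_num = qp_num;
            qpn_index[i].ep = ep;
            return 0;
        }
    }
    return -1;
}

/* With an alias, the consumer object is available directly. */
static struct consumer_ep *reap_alias(const struct wc_with_alias *wc)
{
    struct consumer_ep *ep = wc->os_data;
    ep->completions++;
    return ep;
}

/* With only a qp_num, every completion costs a lookup. */
static struct consumer_ep *reap_qpn(const struct wc_with_qpn *wc)
{
    for (size_t i = 0; i < MAX_QPS; i++) {
        if (qpn_index[i].ep != NULL && qpn_index[i].qp_num == wc->qp_num) {
            qpn_index[i].ep->completions++;
            return qpn_index[i].ep;
        }
    }
    return NULL; /* unknown QP */
}
```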


Work Request Opaques / Local Solicited / Threshold Solicited

	RNIC-PI provides "Work Request Opaques" that allow the verbs
	consumer (especially DAPL/IT-API) to mark certain work requests
	with pass-through flags rather than using a parallel data
	structure such as the DTO_COOKIE.

	One of these flags allows a work request to be marked as 
	"local solicited", which will make it an urgent event (one
	justifying a completion notification callback) when it completes
	successfully (essentially setting the solicited bit locally).

	IHVs can support these bits a) not at all, b) as pass-thru
	or c) actually implement the Local Solicited semantics in
	hardware.
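
	The three support levels can be sketched as below. The flag
	values, structure layouts and function names are invented
	for illustration and do not come from the RNIC-PI
	definitions; only the successful-completion case described
	above is modelled.

```c
#include <stdint.h>

/* Hypothetical opaque bits carried on a work request. */
#define WR_OPAQUE_LOCAL_SOLICITED 0x01u
#define WR_OPAQUE_CONSUMER_MASK   0xf0u

struct work_request {
    uint64_t wr_id;
    uint32_t opaque;
};

struct completion {
    uint64_t wr_id;
    uint32_t opaque;
    int      status; /* 0 on success */
};

/* The three levels of IHV support listed above. */
enum ihv_support {
    IHV_NONE,              /* a) bits are dropped */
    IHV_PASS_THRU,         /* b) bits are echoed in the completion */
    IHV_HW_LOCAL_SOLICITED /* c) hardware acts on Local Solicited */
};

static struct completion make_completion(const struct work_request *wr,
                                         int status, enum ihv_support level)
{
    struct completion c = { wr->wr_id, 0, status };
    if (level != IHV_NONE)
        c.opaque = wr->opaque;
    return c;
}

/* Would the device itself treat this completion as urgent, that is,
 * worth a completion notification callback? */
static int is_urgent(const struct completion *c, enum ihv_support level)
{
    return level == IHV_HW_LOCAL_SOLICITED &&
           c->status == 0 &&
           (c->opaque & WR_OPAQUE_LOCAL_SOLICITED) != 0;
}
```

	In the pass-thru case the bits still arrive with the
	completion, so the consumer can test them itself and does
	not need a parallel per work request structure.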

	gen2 also provides an enhanced method for providing an earlier
	completion callback than provided for in the verbs, but it is
	more akin to the DAT/IT-API evd threshold feature. RNIC-PI
	defines no such feature, partially because callbacks always
	occur in the kernel.



kernel callbacks

	RNIC-PI only provides kernel callbacks. No callbacks are
	provided in user mode. It is assumed that the callback routine
	is part of the kAL (Kernel Access Layer) and that it will
	co-ordinate unblocking of EVD waiters with the uAL (User Access
	Layer). This allows optimized handling of many callback
	scenarios where the net effect is to kick a file descriptor,
	wake another thread or to take no immediate action. It also
	avoids uVPs having to add callback support, which they were
	not required to have under the RDMAC verbs.

	gen2 provides a standardized relay of callback notification
	events from kernel mode to user mode.



ihv_data / model specific data

	RNIC-PI defines an opaque pointer that can be used to
	communicate model-specific data between the uVP and kVP.

	gen2 allows vendors to add extra bytes both IN and OUT to each
	verb request / response communicated over the user context fd.

	The same information can be communicated effectively using 
	either approach.



user / kernel communications

	gen2 creates a file descriptor for each open RDMA Device
	instance (i.e., per device per client).

	RNIC-PI does not define how the sysCall is implemented, but
	implies that there is one per client (no matter how many
	devices).



Additional error information

	gen2 provides for vendor specific error information in work
	completions.

	RNIC-PI provides for additional OS-specific error reporting
	through the 'err_data' opaque in a variety of contexts. But it
	is OS-specific.





STag0 (or equivalent)

	RNIC-PI has each RNIC define a pre-existing
	"all physical memory" memory region (STag0
	for iWARP).

	gen2 provides a verb for the kVP to create
	such a memory region.

	Equivalent results can be achieved with
	either interface, and equivalent support
	from the kVP is required in either case.


Doorbells

	gen2 defines a standard method for mapping
	the doorbell.

	RNIC-PI presumes that this will be solved
	between the uVP and kVP and that no standard
	interface is required.


peek_cq

	gen2 defines a method to peek at a cq (see
	if there are more entries there without
	attempting to reap a completion).

	RNIC-PI does not define this, but it should
	be feasible for almost all implementations.
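
	As a rough illustration of why peeking should be feasible
	for most implementations, here is a completion queue kept as
	a simple ring, with a non-destructive peek next to the usual
	reap. The layout and the names cq_push, cq_poll and cq_peek
	are assumptions made for this sketch, not either interface's
	definitions.

```c
#include <stdint.h>

#define CQ_DEPTH 4

struct cqe {
    uint64_t wr_id;
};

/* Hypothetical completion queue kept as a ring buffer. */
struct cq {
    struct cqe ring[CQ_DEPTH];
    unsigned   head;  /* index of the oldest unreaped entry */
    unsigned   count; /* number of unreaped entries */
};

/* Producer side (stands in for the hardware): 0 on success,
 * -1 if the queue would overflow. */
static int cq_push(struct cq *cq, uint64_t wr_id)
{
    if (cq->count == CQ_DEPTH)
        return -1;
    cq->ring[(cq->head + cq->count) % CQ_DEPTH].wr_id = wr_id;
    cq->count++;
    return 0;
}

/* Reap one completion: returns 1 and fills *out, or 0 if empty. */
static int cq_poll(struct cq *cq, struct cqe *out)
{
    if (cq->count == 0)
        return 0;
    *out = cq->ring[cq->head];
    cq->head = (cq->head + 1) % CQ_DEPTH;
    cq->count--;
    return 1;
}

/* Peek: report how many entries are waiting without consuming any. */
static unsigned cq_peek(const struct cq *cq)
{
    return cq->count;
}
```

	For a ring like this one, peek only reads state that poll
	has to maintain anyway, so it adds no new bookkeeping.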




