[openib-general] [RFC] libibverbs completion event handling

Caitlin Bestler caitlinb at broadcom.com
Wed Sep 21 15:33:20 PDT 2005


I'm not sure I follow what a "completion channel" is.
My understanding is that work completions are stored in
user-accessible memory (typically a ring buffer). This 
enables fast-path reaping of work completions. The OS
has no involvement unless notifications are enabled.
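
As a rough illustration of the fast path described above, here is a minimal C sketch of a completion ring that the consumer reaps purely by reading memory. The entry layout, the "valid" flag, the ring size and every name (sketch_wc, sketch_cq, sketch_hw_complete, sketch_poll_cq) are assumptions made up for illustration. They are not any device's actual format, and a real implementation would also need memory barriers between the device and the consumer.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CQ_SIZE 8	/* hypothetical ring size */

/* Hypothetical work completion entry. */
struct sketch_wc {
	uint64_t wr_id;		/* consumer-supplied work request id */
	uint32_t status;	/* 0 means success in this sketch */
	uint32_t valid;		/* set by the producer once the entry is filled */
};

/* Hypothetical CQ living in user-accessible memory. */
struct sketch_cq {
	struct sketch_wc ring[SKETCH_CQ_SIZE];
	unsigned head;		/* next slot the producer (device) fills */
	unsigned tail;		/* next slot the consumer reaps */
};

/* Stand-in for the device writing a completion; returns -1 if the ring is full. */
int sketch_hw_complete(struct sketch_cq *cq, uint64_t wr_id, uint32_t status)
{
	struct sketch_wc *e = &cq->ring[cq->head % SKETCH_CQ_SIZE];

	if (e->valid)
		return -1;
	e->wr_id = wr_id;
	e->status = status;
	e->valid = 1;
	cq->head++;
	return 0;
}

/* Fast-path reap: touches only memory, makes no system call.
 * Copies up to max entries into out and returns how many were reaped. */
int sketch_poll_cq(struct sketch_cq *cq, int max, struct sketch_wc *out)
{
	int n = 0;

	while (n < max) {
		struct sketch_wc *e = &cq->ring[cq->tail % SKETCH_CQ_SIZE];

		if (!e->valid)
			break;
		out[n++] = *e;
		e->valid = 0;	/* hand the slot back to the producer */
		cq->tail++;
	}
	return n;
}
```

Only when sketch_poll_cq() returns 0 and the consumer wants to sleep would a notification, and therefore the OS, come into play.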

The "completion vector" is used to report completion
notifications. So is the completion vector a *single*
resource used by the driver/verbs to report completions,
where said notifications are then split into user
context dependent "completion channels"?

The RDMAC verbs did not define callbacks to userspace
at all. Instead it is assumed that the proxy for user
mode services will receive the callbacks, and how it
relays those notifications to userspace is outside
the scope of the verbs.

Both uDAPL and ITAPI define relays of notifications
to AEVDS/CNOs and/or file descriptors. Forwarding
a completion notification to userspace in order to
make a callback in userspace so that it can kick
an fd to wake up another thread doesn't make much
sense. The uDAPL/ITAPI/whatever proxy can perform
all of these functions without any device dependencies
and in a way that is fully optimal for the usermode
API that is being used. For kernel clients, I don't
see any need for anything beyond the already defined
callbacks direct from the device-dependent code.

Even in the typical case where the usermode application
does an evd_wait() on the DAT or ITAPI endpoint, the
DAT/ITAPI proxy will be able to determine which thread
should be woken and could even do so optimally. It
also allows the proxy to implement Access Layer features
such as EVD thresholding without device-specific support.  
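
To show what I mean by thresholding in the proxy, here is a minimal C sketch of the bookkeeping. It assumes a single waiter per EVD and leaves out locking and the actual sleep/wake mechanism. The names (sketch_evd, sketch_evd_wait_begin, sketch_evd_notify) are hypothetical and do not come from DAT or ITAPI.

```c
#include <assert.h>

/* Hypothetical per-EVD state kept by the proxy. */
struct sketch_evd {
	unsigned pending;	/* notifications queued, not yet consumed */
	unsigned threshold;	/* count the waiter asked for; 0 = nobody waiting */
	unsigned wakeups;	/* how many times the waiter has been woken */
};

/* Called when a thread starts waiting for at least `threshold` events.
 * Returns 1 if enough events are already queued (no need to sleep),
 * 0 if the caller should block until woken. */
int sketch_evd_wait_begin(struct sketch_evd *evd, unsigned threshold)
{
	if (evd->pending >= threshold)
		return 1;
	evd->threshold = threshold;
	return 0;
}

/* Called by the proxy for each completion notification from the device.
 * Returns 1 if this notification reached the threshold and the waiter
 * should be woken, 0 otherwise. */
int sketch_evd_notify(struct sketch_evd *evd)
{
	evd->pending++;
	if (evd->threshold && evd->pending >= evd->threshold) {
		evd->threshold = 0;	/* the waiter is no longer blocked */
		evd->wakeups++;		/* a real proxy would wake the thread here */
		return 1;
	}
	return 0;
}
```

Nothing here depends on the device: the device-dependent code only has to deliver the callback, and the proxy decides whether a wakeup is due.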

> -----Original Message-----
> From: openib-general-bounces at openib.org 
> [mailto:openib-general-bounces at openib.org] On Behalf Of Roland Dreier
> Sent: Wednesday, September 21, 2005 12:22 PM
> To: openib-general at openib.org
> Subject: [openib-general] [RFC] libibverbs completion event handling
> 
> While thinking about how to handle some of the issues raised 
> by Al Viro in <http://lkml.org/lkml/2005/9/16/146>, I 
> realized that our verbs interface could be improved to make 
> delivery of completion events more flexible.  For example, 
> Arlin's request for using one FD for each CQ can be 
> accommodated quite nicely.
> 
> The basic idea is to create new objects that I call 
> "completion vectors" and "completion channels."  Completion 
> vectors refer to the interrupt generated when a completion 
> event occurs.  With the current drivers, there will always be 
> a single completion vector, but once we have full MSI-X 
> support, multiple completion vectors will be possible.
> Orthogonal to this is the notion of a completion channel.  
> This is a FD used for delivering completion events to userspace.
> 
> Completion vectors are handled by the kernel, and userspace 
> cannot change the number of vectors that are available.  On the 
> other hand, completion channels are created at the request of 
> a userspace process, and userspace can create as many 
> channels as it wants.
> 
> Every userspace CQ has a completion vector and a completion channel.
> Multiple CQs can share the same completion vector and/or the 
> same completion channel.  CQs with different completion 
> vectors can still share a completion channel, and vice versa.
> 
> The exact API would be something like the below.  Thoughts?
> 
> Thanks,
>   Roland
> 
> struct ibv_comp_channel {
> 	int			fd;
> };
> 
> /**
>  * ibv_create_comp_channel - Create a completion event channel
>  */
> extern struct ibv_comp_channel *ibv_create_comp_channel(struct ibv_context *context);
> 
> /**
>  * ibv_destroy_comp_channel - Destroy a completion event channel
>  */
> extern int ibv_destroy_comp_channel(struct ibv_comp_channel *channel);
> 
> /**
>  * ibv_create_cq - Create a completion queue
>  * @context - Context CQ will be attached to
>  * @cqe - Minimum number of entries required for CQ
>  * @cq_context - Consumer-supplied context returned for completion events
>  * @channel - Completion channel where completion events will be queued.
>  *     May be NULL if completion events will not be used.
>  * @comp_vector - Completion vector used to signal completion events.
>  *     Must be >= 0 and < context->num_comp_vectors.
>  */
> extern struct ibv_cq *ibv_create_cq(struct ibv_context *context, int cqe,
> 				    void *cq_context,
> 				    struct ibv_comp_channel *channel,
> 				    int comp_vector);
> 
> /**
>  * ibv_get_cq_event - Read next CQ event
>  * @channel: Channel to get next event from.
>  * @cq: Used to return pointer to CQ.
>  * @cq_context: Used to return consumer-supplied CQ context.
>  *
>  * All completion events returned by ibv_get_cq_event() must
>  * eventually be acknowledged with ibv_ack_cq_events().
>  */
> extern int ibv_get_cq_event(struct ibv_comp_channel *channel,
> 			    struct ibv_cq **cq, void **cq_context);
> 
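
For reference, the channel model in the quoted proposal (an FD that several CQs may share, with each event handing back the CQ and its context) can be imitated in plain C with a pipe. This is only a sketch under stated assumptions: the pipe stands in for the kernel-provided FD, and the model_* names are hypothetical. It is not the libibverbs implementation.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical channel: a pipe stands in for the kernel-provided FD. */
struct model_channel {
	int rfd;	/* end userspace would read or poll */
	int wfd;	/* end the "kernel" side writes to */
};

/* Hypothetical CQ: records which channel and vector it was created with. */
struct model_cq {
	struct model_channel *channel;	/* may be NULL: no completion events */
	void *cq_context;		/* consumer-supplied context */
	int comp_vector;		/* independent of the channel */
};

int model_channel_open(struct model_channel *ch)
{
	int fds[2];

	if (pipe(fds) < 0)
		return -1;
	ch->rfd = fds[0];
	ch->wfd = fds[1];
	return 0;
}

/* Stand-in for the kernel queuing a completion event for this CQ. */
int model_signal(struct model_cq *cq)
{
	if (!cq->channel)
		return 0;	/* CQ created without a channel */
	if (write(cq->channel->wfd, &cq, sizeof cq) != (ssize_t)sizeof cq)
		return -1;
	return 0;
}

/* Same shape as the proposed ibv_get_cq_event(): blocks on the channel
 * FD and returns the CQ that fired together with its context. */
int model_get_cq_event(struct model_channel *ch,
		       struct model_cq **cq, void **cq_context)
{
	struct model_cq *ev;

	if (read(ch->rfd, &ev, sizeof ev) != (ssize_t)sizeof ev)
		return -1;
	*cq = ev;
	*cq_context = ev->cq_context;
	return 0;
}
```

Two CQs created with different comp_vector values can point at the same model_channel, and the reader tells them apart by the returned pointer and context, which is the sharing the proposal describes.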



