[openib-general] [PATCH 03/18] [RFC] Provider Registration and Methods

Tue Mar 7 13:39:26 PST 2006

On Tue, 2006-03-07 at 12:56 -0800, Roland Dreier wrote:
>  > +static int iwch_process_mad(struct ib_device *ibdev,
>  > +			    int mad_flags,
>  > +			    u8 port_num,
>  > +			    struct ib_wc *in_wc,
>  > +			    struct ib_grh *in_grh,
>  > +			    struct ib_mad *in_mad, struct ib_mad *out_mad)
>  > +{
>  > +	PDBG("%s:%s:%u\n", __FILE__, __FUNCTION__, __LINE__);
>  > +	return -ENOSYS;
>  > +}
> 
> I'd rather fix the core so that this function (and all the other
> -ENOSYS stubs) never get called, rather than forcing low-level drivers
> to provide stubs.  How hard is that to do?
> 

Well, we could add an ib_process_mad() call that does the ENOSYS check,
and then change mad.c and sysfs.c to call ib_process_mad() instead of
the direct provider call.  
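
Roughly what I have in mind, as a userspace sketch rather than a real
patch.  The structs are mock stand-ins (not the ib_verbs.h definitions),
the in_wc/in_grh arguments are trimmed for brevity, and
demo_process_mad() is a made-up provider used only for illustration:

```c
#include <errno.h>
#include <stddef.h>

/* Mock stand-ins for the real structures; only what the sketch needs. */
struct ib_mad { unsigned char data[256]; };

struct ib_device {
	/* Left NULL by a provider with no MAD support (e.g. an iWARP device). */
	int (*process_mad)(struct ib_device *dev, int mad_flags,
			   unsigned char port_num,
			   struct ib_mad *in_mad, struct ib_mad *out_mad);
};

/*
 * Proposed core wrapper: mad.c and sysfs.c would call this instead of
 * dereferencing dev->process_mad directly, so providers need no stub.
 */
static int ib_process_mad(struct ib_device *dev, int mad_flags,
			  unsigned char port_num,
			  struct ib_mad *in_mad, struct ib_mad *out_mad)
{
	if (!dev->process_mad)
		return -ENOSYS;
	return dev->process_mad(dev, mad_flags, port_num, in_mad, out_mad);
}

/* Hypothetical provider method, here only to exercise the non-NULL path. */
static int demo_process_mad(struct ib_device *dev, int mad_flags,
			    unsigned char port_num,
			    struct ib_mad *in_mad, struct ib_mad *out_mad)
{
	(void)dev; (void)mad_flags; (void)port_num; (void)in_mad; (void)out_mad;
	return 0;
}
```

With that in place the iwch_process_mad() stub could simply be dropped
and the method left NULL.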

The attach_mcast() stub can be removed because it's only called by
ib_attach_mcast() and that does the NULL check...

ib_create_ah() doesn't check for a NULL provider func... but could.

and ib_modify_port() doesn't check either.   
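
For the pointer-returning verbs the check would hand back an encoded
error instead of an int.  A userspace sketch, assuming mock structs and
home-grown imitations of the kernel's ERR_PTR helpers (none of this is
the actual verbs.c code):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitations of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top 4095 address values are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-4095;
}

/* Mock stand-ins for the real structures. */
struct ib_device;
struct ib_ah_attr { int dummy; };
struct ib_pd { struct ib_device *device; };
struct ib_ah { struct ib_pd *pd; };

struct ib_device {
	/* Left NULL by a provider that has no address handles. */
	struct ib_ah *(*create_ah)(struct ib_pd *pd, struct ib_ah_attr *attr);
};

/* Core entry point with the missing NULL check added. */
static struct ib_ah *ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *attr)
{
	struct ib_ah *ah;

	if (!pd->device->create_ah)
		return ERR_PTR(-ENOSYS);

	ah = pd->device->create_ah(pd, attr);
	if (!IS_ERR(ah))
		ah->pd = pd;
	return ah;
}
```

ib_modify_port() returns an int, so it would get the same plain
"if (!method) return -ENOSYS;" test.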

Some of this ties into the transport-specific method discussion because
mad and ah methods are IB-specific.  And the mcast methods are for now
IB-specific. But regardless of that discussion, we could do the above
changes.

Also, in core/device.c there's a table of mandatory verbs, and it
includes create_ah and destroy_ah.  So these would need to become
non-mandatory as well.
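
To illustrate the idea, here is a userspace sketch of an offsetof-based
mandatory-verb table with create_ah and destroy_ah left out.  The struct
is a cut-down mock in which every verb shares one function-pointer type,
and the exact list of verbs shown is illustrative, not a copy of
device.c:

```c
#include <errno.h>
#include <stddef.h>

typedef void (*verb_fn)(void);	/* mock: every verb has the same type here */

struct ib_device {
	verb_fn query_device;
	verb_fn create_cq;
	verb_fn create_ah;	/* would become optional */
	verb_fn destroy_ah;	/* would become optional */
};

#define IB_MANDATORY_FUNC(x) { offsetof(struct ib_device, x), #x }

static const struct {
	size_t offset;
	const char *name;
} mandatory_table[] = {
	IB_MANDATORY_FUNC(query_device),
	IB_MANDATORY_FUNC(create_cq),
	/* create_ah and destroy_ah deliberately not listed */
};

/*
 * Returns 0 if every mandatory verb is set, -EINVAL otherwise.
 * If 'missing' is non-NULL it receives the name of the first absent verb.
 */
static int ib_device_check_mandatory(const struct ib_device *dev,
				     const char **missing)
{
	size_t i;

	for (i = 0; i < sizeof(mandatory_table) / sizeof(mandatory_table[0]); i++) {
		const verb_fn *slot = (const verb_fn *)
			((const char *)dev + mandatory_table[i].offset);

		if (!*slot) {
			if (missing)
				*missing = mandatory_table[i].name;
			return -EINVAL;
		}
	}
	return 0;
}

/* Placeholder verb so a mock device can populate its methods. */
static void dummy_verb(void) { }
```

A device that leaves create_ah and destroy_ah NULL then registers
cleanly, as long as the core wrappers do the NULL check before calling
them.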

If folks agree to the above approach, then I'll make a patch for the
core changes and post it.

>  > +	/*
>  > + 	 * Attempt to make the CQ big enough to handle the T3
>  > +	 * additional CQE possibilities:  
>  > +	 * 	TERMINATE, 
>  > +	 * 	2 CQES for each RDMA READ operation,
>  > +	 *	incoming RDMA READ REQUEST FAILUREs
>  > + 	 * We can make the CQ big enough to handle these for
>  > +	 * a single QP.  But problems can arise if the CQ is shared...
>  > +	 */
> 
> Is there a plan for how to handle this?  It seems really bad that a
> consumer could create a CQ that it thinks is big enough to handle all
> the outstanding work requests that it might post, but then still have
> the CQ overrun because of internal implementation details.
> 
>  - R.

Yup.  I need to address this, but I don't quite know how yet.  I'm
discussing this now with the Chelsio folks to see what we can do.

Steve.