[Openib-windows] RE: Error handling on Winsock Direct

Thu Oct 27 09:44:27 PDT 2005

Hi Tzachi,

> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> Sent: Thursday, October 27, 2005 9:01 AM
> 
> While searching for errors in Winsock Direct, we found that the error
> handling of ib_poll_cq is not done correctly.
> This happens in the function ib_cq_comp().
> 
> As you can see, after calling ib_poll_cq() this function has a case
> statement that only handles the case of IB_INVALID_CQ_HANDLE. Besides
> printing, it doesn't handle any other errors.

There isn't much need to handle other errors explicitly - in case of any other
errors, the done_wc_list pointer should be NULL and things should fall through
just fine.  I also think that the code has changed enough that the
IB_INVALID_CQ_HANDLE case will never be hit, and I will look at eliminating it
some day.
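
To illustrate the fall-through, here is a rough sketch, not the actual
ib_cq_comp() code.  All names in it (sketch_status_t, sketch_wc_t,
sketch_poll_cq, sketch_cq_comp) are made up for the example, and the mock poll
function simply assumes that a failed poll leaves the done list untouched.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the IBAL types; not the real definitions. */
typedef enum { SKETCH_SUCCESS, SKETCH_NOT_FOUND, SKETCH_ERROR } sketch_status_t;

typedef struct sketch_wc {
    struct sketch_wc *p_next;   /* next completed work completion, or NULL */
} sketch_wc_t;

/* Mock poll: hands back the completed list only on success.  On any other
 * status it leaves *pp_done untouched (the assumption stated above). */
static sketch_status_t sketch_poll_cq(sketch_status_t result,
                                      sketch_wc_t *completed,
                                      sketch_wc_t **pp_done)
{
    if (result == SKETCH_SUCCESS)
        *pp_done = completed;
    return result;
}

/* Returns the number of completions processed.  The poll status is
 * deliberately ignored: a NULL done list means the loop body never runs. */
static int sketch_cq_comp(sketch_status_t poll_result, sketch_wc_t *completed)
{
    sketch_wc_t *p_done = NULL;
    int n = 0;

    (void)sketch_poll_cq(poll_result, completed, &p_done);

    for (; p_done != NULL; p_done = p_done->p_next)
        n++;

    return n;
}
```

In this sketch an empty CQ and a failed poll both produce a return value of 0,
which is harmless for the processing loop itself.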

> As a matter of fact the problem is harder, since the function ib_cq_comp
> doesn't have a way to return an error. Please note that returning 0 is not
> enough because 0 is a legal result if nothing was found. As a result, there
> is a need to change all the callers of this function, so that the error will
> be propagated to their callers.

What will the callers do with the propagated value?  Would every error on the
CQ be accompanied by an asynchronous affiliated error notification for that CQ?
What about for the QP?
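
For reference, the kind of signature change being proposed would look roughly
like the following.  This is only a sketch under my own assumptions: the names
(demo_status_t, demo_wc_t, demo_cq_comp) are hypothetical and are not existing
functions or types in the code base.

```c
#include <stddef.h>

/* Hypothetical status codes and work-completion type for illustration only. */
typedef enum { DEMO_OK, DEMO_POLL_FAILED } demo_status_t;

typedef struct demo_wc {
    struct demo_wc *p_next;
} demo_wc_t;

/* Hypothetical variant of the completion routine: the status is the return
 * value and the completion count goes through an out-parameter, so a count
 * of 0 can no longer be confused with a failed poll. */
static demo_status_t demo_cq_comp(demo_status_t poll_status,
                                  demo_wc_t *p_done,
                                  int *p_count)
{
    *p_count = 0;

    if (poll_status != DEMO_OK)
        return poll_status;         /* propagate the failure to the caller */

    for (; p_done != NULL; p_done = p_done->p_next)
        (*p_count)++;

    return DEMO_OK;
}
```

Every caller would then have to check the status before using the count, which
is why the question of what the callers do with it matters.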

> This also brings a new issue: Error printing in release: I believe that we
> should change our printing method so that it will also be possible to print
> in release mode. Otherwise, if a test is failing there is no way to know
> where the failure is.

Output to the debugger must be limited to debug releases only.  However, we can
and should add logging of errors to the application event log so that there's a
trace.  Using the event log mechanism also eliminates the need to have a debug
message viewer attached at the time of the error.

- Fab