[ofw] ConnectX functionality is completely broken

Sean Hefty sean.hefty at intel.com
Fri Aug 1 10:32:58 PDT 2008


To recap, patch 1435 improved the event notification mechanism for cq, qp and
srq objects.

I found one problem in the patch, which repeats for all three objects and
for both drivers: the new event handlers get the old (and wrong) context values.

The new (and correct) context values are not used. As a result, IBAL callbacks are
called with the wrong handle parameter, which ends in a crash.

 

I tested the patch below on mthca, and it's working fine for me.  If anyone
knows how to force the CQ, QP, or SRQ async events with an existing test, let me
know and I will run it.  I do have one comment below:

 

 

Index: hw/mthca/kernel/hca_verbs.c
===================================================================
@@ -906,12 +904,8 @@
   goto err_create_srq;
  }
 
- // fill the object
- srq_p = (struct mthca_srq *)ib_srq_p;
- srq_p->srq_context = (void*)srq_context;
- 
  // return the result
- if (ph_srq) *ph_srq = (ib_srq_handle_t)srq_p;
+ if (ph_srq) *ph_srq = (ib_srq_handle_t)ib_srq_p;



ph_srq isn't really optional here.  If one isn't provided, we end up leaking
memory.  Personally, I would just remove the if check, but it could also be
moved to the top of the function with a failure return if the output parameter
is not provided.

 

Similar checks are provided in _create_qp() and mthca_create_cq().
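
To illustrate the second option (check the output parameter at the top and fail
early), here is a minimal standalone sketch. It is not the mthca code: every
name prefixed with sketch_ is hypothetical, the status codes are stand-ins for
the real ib_api_status_t values, and malloc/free stand in for the driver's
allocation path. The point is only that nothing is allocated unless the caller
gave us somewhere to return the handle.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the real mthca definitions. */
typedef struct sketch_srq { int max_wr; } sketch_srq_t;
typedef sketch_srq_t *sketch_srq_handle_t;

enum sketch_status {
    SKETCH_SUCCESS = 0,
    SKETCH_INVALID_PARAMETER,
    SKETCH_NO_MEMORY
};

/* Counts live objects so a caller can observe that nothing leaks. */
static int sketch_live_objects = 0;

static enum sketch_status sketch_create_srq(int max_wr, sketch_srq_handle_t *ph_srq)
{
    sketch_srq_t *srq_p;

    /* Reject a missing output parameter before allocating anything. */
    if (ph_srq == NULL)
        return SKETCH_INVALID_PARAMETER;

    srq_p = malloc(sizeof(*srq_p));
    if (srq_p == NULL)
        return SKETCH_NO_MEMORY;

    srq_p->max_wr = max_wr;
    sketch_live_objects++;

    /* Return the result unconditionally; ph_srq is known to be valid here. */
    *ph_srq = srq_p;
    return SKETCH_SUCCESS;
}

static void sketch_destroy_srq(sketch_srq_handle_t h_srq)
{
    if (h_srq == NULL)
        return;
    free(h_srq);
    sketch_live_objects--;
}
```

With the check at the top, a NULL ph_srq fails before any resource exists, so
there is no object left without an owner. Removing the check entirely, as I
suggested first, is simpler but turns a bad caller into a NULL dereference.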

 

- Sean
