[ofw][patch][ibal] make IBBUS start up synchronous.

Smith, Stan stan.smith at intel.com
Wed Mar 25 09:23:11 PDT 2009


Hello Leonid,
  A small question. If the ADD_CA event does indeed signal completion of HCA startup, why not have fdo_start() wait on an HCA guid specific event (time-out enabled escape hatch) where the ADD_CA event callback would then trigger the HCA guid specific event thus completing fdo_start()?
If the event wait times out, then fdo_start() returns an error.

Bottom-line: over the years I have found waiting in spin loops to be problematic. How does one understand 3 seconds is too long or too short?

What am I not understanding about this synchronization problem?

thanks,

stan.



________________________________
From: ofw-bounces at lists.openfabrics.org [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Leonid Keller
Sent: Wednesday, March 25, 2009 7:02 AM
To: ofw at lists.openfabrics.org
Subject: [ofw][patch][ibal] make IBBUS start up synchronous.

IBAL start up has by design two stages.
  - initialization and preparing for getting IBAL PnP events (synchronous on fdo_start);
  - completing the start up upon receiving ADD_CA event (asynchronous, after fdo_start).

It means, that IBBUS driver notifies the completion of the driver start operation before it really has finished it.
It causes races in quick HCA disable/enable sequences.

This patch makes IBBUS start up synchronous by adding a wait in the first stage till the end of the second stage.


Index: core/al/kernel/al_pnp.c
===================================================================
--- core/al/kernel/al_pnp.c (revision 2055)
+++ core/al/kernel/al_pnp.c (working copy)
@@ -1152,6 +1152,7 @@
  cl_status_t   cl_status;
  al_pnp_ca_event_t *p_event_rec;
  ib_ca_attr_t  *p_old_ca_attr;
+ uint32_t   ref_cnt, prev_cnt, i;

  AL_ENTER( AL_DBG_PNP );

@@ -1232,9 +1233,18 @@
  }

  /* Queue the event to the async processing manager. */
- ref_al_obj( &gp_pnp->obj );
+ ref_cnt = ref_al_obj( &gp_pnp->obj );
  cl_async_proc_queue( gp_async_pnp_mgr, &p_event_rec->async_item );

+ /* wait up to 3 seconds for the end of event propagation
+ It is needed for enabling quick HCA disable/enable scenarios. */
+ prev_cnt = ref_cnt - 1;
+ for ( i = 0; i < 30; ++i ) {
+  if (ref_cnt <= prev_cnt)
+   break;
+  cl_thread_suspend(100);
+ }
+
  AL_EXIT( AL_DBG_PNP );
  return IB_SUCCESS;
 }
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090325/c63e625c/attachment.html>


More information about the ofw mailing list