[ofw][patch][ibal] make IBBUS start up synchronous.
Leonid Keller
leonid at mellanox.co.il
Wed Mar 25 10:04:18 PDT 2009
And what timeout would you set ? 3 seconds ?
How does one understand 3 seconds is too long or too short? :)
The expected time of PnP handling 300-400 msec, so i think, 3 seconds
has to be enough.
One can change the fix, returning error, when after 3 seconds reference
counter was not decreased.
You are right, one could reach the same goal using events.
For example, add an event to CI_CA object, which is a part of the event
structure, and post the event at the end of the callback
(__pnp_process_add_ca).
________________________________
From: Smith, Stan [mailto:stan.smith at intel.com]
Sent: Wednesday, March 25, 2009 6:23 PM
To: Leonid Keller; ofw at lists.openfabrics.org
Subject: RE: [ofw][patch][ibal] make IBBUS start up synchronous.
Hello Leonid,
A small question. If the ADD_CA event does indeed signal
completion of HCA startup, why not have fdo_start() wait on an HCA guid
specific event (time-out enabled escape hatch) where the ADD_CA event
callback would then trigger the HCA guid specific event thus completing
fdo_start()?
If the event wait times out, then fdo_start() returns an error.
Bottom-line: over the years I have found waiting in spin loops
to be problematic. How does one understand 3 seconds is too long or too
short?
What am I not understanding about this synchronization problem?
thanks,
stan.
________________________________
From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Leonid Keller
Sent: Wednesday, March 25, 2009 7:02 AM
To: ofw at lists.openfabrics.org
Subject: [ofw][patch][ibal] make IBBUS start up synchronous.
IBAL start up has by design two stages.
- initialization and preparing for getting IBAL PnP events
(synchronous on fdo_start);
- completing the start up upon receiving ADD_CA event
(asynchronous, after fdo_start).
It means, that IBBUS driver notifies the completion of the
driver start operation before it really has finished it.
It causes races in quick HCA disable/enable sequences.
This patch makes IBBUS start up synchronous by adding a wait in
the first stage till the end of the second stage.
Index: core/al/kernel/al_pnp.c
===================================================================
--- core/al/kernel/al_pnp.c (revision 2055)
+++ core/al/kernel/al_pnp.c (working copy)
@@ -1152,6 +1152,7 @@
cl_status_t cl_status;
al_pnp_ca_event_t *p_event_rec;
ib_ca_attr_t *p_old_ca_attr;
+ uint32_t ref_cnt, prev_cnt, i;
AL_ENTER( AL_DBG_PNP );
@@ -1232,9 +1233,18 @@
}
/* Queue the event to the async processing manager. */
- ref_al_obj( &gp_pnp->obj );
+ ref_cnt = ref_al_obj( &gp_pnp->obj );
cl_async_proc_queue( gp_async_pnp_mgr,
&p_event_rec->async_item );
+ /* wait up to 3 seconds for the end of event propagation
+ It is needed for enabling quick HCA disable/enable scenarios.
*/
+ prev_cnt = ref_cnt - 1;
+ for ( i = 0; i < 30; ++i ) {
+ if (ref_cnt <= prev_cnt)
+ break;
+ cl_thread_suspend(100);
+ }
+
AL_EXIT( AL_DBG_PNP );
return IB_SUCCESS;
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090325/0de9a3ba/attachment.html>
More information about the ofw
mailing list