[openib-general] osm unreliable unless -d1 (resolved)
Hal Rosenstock
halr at voltaire.com
Wed Mar 8 08:54:48 PST 2006
On Mon, 2006-03-06 at 18:44, Jean-Christophe Hugly wrote:
> Well, thanks guy !
>
> It looks like the spinlocks patch did the trick. The influence of the
> broken ref count was reaching further than expected, I guess.
The VL15 interface keeps counters that are updated with atomics, and one
of them (qp0_mads_outstanding) is used to signal the state manager:
osm_vl15intf.c::__osm_vl15_poller
  if ( p_madw->resp_expected == TRUE )
  {
    outstanding = cl_atomic_dec( &p_vl->p_stats->qp0_mads_outstanding );
    osm_log( p_vl->p_log, OSM_LOG_DEBUG,
             "__osm_vl15_poller: "
             "%u QP0 MADs outstanding\n",
             p_vl->p_stats->qp0_mads_outstanding );
    if( outstanding == 0 )
    {
      ...
If that signal didn't occur because the outstanding MADs were miscounted,
that might explain what you are seeing.
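
To make the failure mode concrete, here is a minimal standalone sketch. It is
not OpenSM code: stats_t, complete_mad and count_signals are hypothetical
names, and C11 atomic_fetch_sub stands in for cl_atomic_dec on the assumption
that cl_atomic_dec returns the decremented value, as the test against 0 above
suggests. The point is only that the signal depends on the counter reaching
exactly zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the stats block; not the real OpenSM struct. */
typedef struct {
    atomic_int qp0_mads_outstanding;
} stats_t;

/* One MAD completes. Returns 1 if this completion brought the count to
   zero, which is the point where the state manager would be signalled. */
static int complete_mad(stats_t *s)
{
    /* atomic_fetch_sub returns the OLD value, so subtract 1 to get the
       new value (assumed to match what cl_atomic_dec returns). */
    int outstanding = atomic_fetch_sub(&s->qp0_mads_outstanding, 1) - 1;
    return outstanding == 0;
}

/* Complete `sent` MADs against a counter that starts at `counted`.
   Returns how many times the "outstanding == 0" signal would fire. */
static int count_signals(int sent, int counted)
{
    stats_t s;
    int signals = 0;

    assert(sent >= 0 && counted >= 0);
    atomic_init(&s.qp0_mads_outstanding, counted);

    for (int i = 0; i < sent; i++)
        signals += complete_mad(&s);

    return signals;
}
```

With a correct count, count_signals(3, 3) returns 1: the third completion
hits zero and the signal fires once. If the counter is one too high,
count_signals(3, 4) returns 0: every MAD has completed but the count stops at
1, so the state manager is never signalled.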
-- Hal