[openib-general] osm unreliable unless -d1 (resolved)

Hal Rosenstock halr at voltaire.com
Wed Mar 8 08:54:48 PST 2006


On Mon, 2006-03-06 at 18:44, Jean-Christophe Hugly wrote: 
> Well, thanks guys!
> 
> It looks like the spinlocks patch did the trick. The influence of the
> broken ref count was reaching further than expected, I guess. 

There are VL15 counters that use atomics, and one of them (outstanding)
is used to signal the state manager:

osm_vl15intf.c::__osm_vl15_poller

        if ( p_madw->resp_expected == TRUE )
        {
          outstanding = cl_atomic_dec( &p_vl->p_stats->qp0_mads_outstanding );

          osm_log( p_vl->p_log, OSM_LOG_DEBUG,
                   "__osm_vl15_poller: "
                   "%u QP0 MADs outstanding\n",
                   p_vl->p_stats->qp0_mads_outstanding );

          if( outstanding == 0 )
          {
            ...

If that signal didn't occur because outstanding was miscounted, that
might explain what you are seeing.
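
To make the failure mode concrete, here is a minimal standalone sketch (not
OpenSM code) of a counter that signals only on the transition to zero. It
uses C11 atomics rather than complib, and every name in it (demo_stats_t,
demo_mad_sent, demo_mad_completed, demo_signal_fires) is hypothetical. It
assumes, as the excerpt above suggests, that the decrement yields the new
value of the counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the QP0 statistics; not the OpenSM struct. */
typedef struct {
    atomic_uint mads_outstanding; /* MADs sent that still expect a response */
    bool idle_signaled;           /* stands in for waking the state manager */
} demo_stats_t;

/* Sender side: count one more outstanding MAD. */
static void demo_mad_sent(demo_stats_t *s)
{
    atomic_fetch_add(&s->mads_outstanding, 1u);
}

/* Completion side: decrement and signal only when the count reaches zero.
 * atomic_fetch_sub returns the value before the subtraction, so the new
 * value is that result minus one. */
static unsigned demo_mad_completed(demo_stats_t *s)
{
    unsigned outstanding = atomic_fetch_sub(&s->mads_outstanding, 1u) - 1u;

    if (outstanding == 0)
        s->idle_signaled = true;
    return outstanding;
}

/* Returns true if the idle signal fires after 'sent' increments followed
 * by 'completed' decrements. */
static bool demo_signal_fires(unsigned sent, unsigned completed)
{
    demo_stats_t s;
    unsigned i;

    atomic_init(&s.mads_outstanding, 0u);
    s.idle_signaled = false;

    for (i = 0; i < sent; i++)
        demo_mad_sent(&s);
    for (i = 0; i < completed; i++)
        demo_mad_completed(&s);

    return s.idle_signaled;
}
```

With balanced counts, demo_signal_fires(3, 3) reports that the signal fired.
With one stray increment, demo_signal_fires(4, 3) leaves the counter at 1, so
the zero transition never happens and the waiter is never woken.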

-- Hal



