[openib-general] Re: [PATCH] RFC: opensm: serialize osm_state_mgr_process()

Sasha Khapyorsky sashak at voltaire.com
Mon May 22 10:56:37 PDT 2006


On 14:44 Mon 22 May     , Eitan Zahavi wrote:
> Hi Sasha,
> 
> > >
> > > The idea to use pthread in OpenSM code is totally wrong.
> > > Please stop doing this. We want this code to be shared with Windows
> and
> > > this breaks it.
> > 
> > The problem is that complib does not provide needed primitives, mostly
> > in synchronization and thread management areas (like
> pthread_cond_wait()
> > and friends, pthread_cancel() and friends).
> [EZ] The problem is that any code you write in OpenSM should be portable
> to existing complib such that it would be portable to windows.
> If some thread interface is missing (I am surprised we could get so far
> without that primitive) - please add to complib and also make sure
> Windows native threads can support that primitive.

This will not help to Windows anyway (just additional works on our
side). complib on linux has pthread as backend, any new complib wrapper
will require a works on windows side.

> > The option to extend complib also does not look very helpful for
> Windows
> > too - complib uses pthreads as backend anyway.
> [EZ] No the windows complib does not use pthread at all.

linux's does.

> > Is using of pthread library with Windows may solve sharing issue?
> [EZ] It might if you are willing to go through this un-needed exercise
> ...
> We have a complib in windows that is well tested and functional today. 
> I do not see why you need to re-write it.
> > 
> > Other option I may think about is to use pthread wrapper (in the same
> > way as it is done today with complib).
> [EZ] This does not solve the need to implement the same functionality in
> windows.
> And since I do not want to re-implement complib for windows - I prefer
> sticking with complib wrappers.

And why new complib wrapper will be better (or simpler) than new pthread
wrapper? (pthread at least is much more standard and generally popular).

> > 
> > >
> > > Also please provide a clear RFC for what this patch is trying to do.
> > 
> > Basically this serializes execution of osm_state_mgr_process(), so
> > instead of to be directly called via dispatcher's callback (with
> possible
> > waiting for the lock), state_manager will be signaled (and wakeuped if
> > necessary), as result we don't need big state_lock anymore, mutex is
> > needed only to protect signal_mask (opensm's osm_signal_t, not *nix
> > signals) update.
> [EZ] I do not see what problem you are trying to solve? 

from original post:

This serializes execution of osm_state_mgr_process() and removes the big
state_lock. This should reduce "locked state" time and prevent potential
dispatcher blocking.

> What is wrong with the current implementation?

In addititon to above:
There is impossible to know reliably is there pended signals or
transactions, etc.
In general the current implementation looks 

> Are you fixing a bug?

Mostly this patch was born yet in period of deadlocking (on the big
state_lock due to broken signal processing) and endless blocking (due to
bad broken mad counters). And we discussed several needed improvements.
I've just pended this stuff until nearly end of 1.0 cycle. Other obvious
thing we will need is to resolve cl_atomic() workaround which was done
then.

> The dispatching mechanism used allows passing the state manager signals
> which must be processed in order. The state_lock guarantees this order.

This is done as complib's cl_spinlock() and finally via pthread_mutex.
Does pthread_mutex_lock() guarantees the order?

Sasha



More information about the general mailing list