[Openib-windows] RE: Complib changes to support OpenSM in OpenIB

Fab Tillier ftillier at silverstorm.com
Tue Aug 23 13:59:33 PDT 2005


> From: Eitan Zahavi [mailto:eitan at mellanox.co.il]
> Sent: Tuesday, August 23, 2005 1:15 AM
> 
> Hi Fab,
> 
> Please the my responses regarding OpenSM changes inline:

Sure, see below, and thank you for sending in plain text - it makes it much
easier for me to reply.

> Fab Tillier wrote:
> >
> >>-----Original Message-----
> >>From: Yossi Leybovich [mailto:sleybo at mellanox.co.il]
> >>
> >>1. cl_timers:
> >>        a.
> >>        we tried to call cl_timer_start from within the timer call back, and
> >>the thread enter to deadlock . We notice that cl_timer_star call   to
> >>cl_timer_stop which call to DeleteTimerQueueTimer with INVALI_HANDLE which
> >>cause it to wait to all the callbacks to finish.
> >
> >>        Solution to this problem is to add cl_timer_strat_cb function that
> >>wont call to timer stop and be able to call
> >>      from within the callback
> >
> >
> > Let's try to find a better solution.  Are you always resetting the event
> > from the callback?  Are you really looking for a periodic timer here,
> > rather than a one-shot timer?  If so, it would make sense to expose
> > periodic timers, rather than forcing the user to set a one-shot timer
> > repeatedly.
>
> In our (OpenSM) use the callback has to decide if a new event has to be
> scheduled or not. The time to the next event might be variable too. So
> making it a "repeating" timer (generating a series of events) is not
> really the required behavior.

Ok, that's what I wanted to find out.

> I wonder what is so wrong with adding an option for the timer or even a
> different callback such that it will not prevent us from obtaining the
> required functionality.

If the abstraction starts to expose so many OS specific behaviors and
workarounds, the user is better off interfacing with the OS directly rather than
using an abstraction that only provides pass through to the OS and obscures what
is actually going on.  So I'm trying to understand what you are doing so that
the abstraction is more than a quick hack to get things to work properly.

I think we can do something a little smarter in the implementation without
adding a new API - if we record what thread context we are in before invoking
the user callback (with GetCurrentThreadId).  We can then check to see if we are
in the same thread when we invoke start, and not invoke stop at all.  I'm at IDF
until Thursday, but I'll try to get something coded up so you can try it.  If
you code it yourselves before I get back in the office, please send the patch so
I don't duplicate the effort.

> >>2. cl_is_debug function - I send you patch for that , currently Linux
> >>version of OpenSM use it so we need it in our complib as well
> >>I don't think its too risky
> >
> > Runtime checks for linkage errors don't belong, especially in Windows
> > where we have distinct binary names for debug and release versions of
> > the same component.
> > If this is needed for Linux, put it in a #ifdef LINUX or something similar.
> > Long term, I suggest someone fix the Linux side of things to prevent such
> > linkage errors.
>
> I agree that we should fix the LINUX side. But as things tend to take
> long time to change in the windows area so they take time in the LINUX
> area. So for now we simply will make the windows main.c little more
> different from the LINUX main.c by removing the section that checks the
>     "debug"-ness of the used libraries.

Things do move slower than I wish - I wasn't saying fix the Linux side now,
rather just put a #ifdef for the check to only have it in Linux builds, and add
an item to the todo list for Linux.

> >>3. cl_qlock_pool.c - as Eitan said we need it in OpenSM for know and he will
> >>change the code to work with external locks
> >
> > As I said before, I'd like to see this file stay in OpenSM.  If OpenSM moves
> > away from using it, it can then be removed.  I don't want to add it to
> > complib
> > because I don't want any other ULPs to use it.  Further, the implementation
> > should be entirely inline functions since there's very little code there.
>
> Moving files around means changing "include" directives.
> It is not only a matter of the directory the file is part of.

You'll be moving files anyway, as well as updating include directives.  I don't
think qlockpool has much value, and don't want to encourage its use.  You
currently have a complib subdirectory to osm.  Keep that directory and put the
qlockpool file in there.  The same applies to the cl_event_wheel thing.

> Regarding the usage of inline functions - I do not see how it matters.
> It is just un-needed more work for me. Is there a simple rule you use
> for what has to be inline and what not?

If all the functions are just wrappers that make a very few things (like lock,
get, unlock), it's simpler to just put it into a header as an inline function.
This avoids the need to add a source file to the makefile, and lets things build
quickly and easily.

> If the code was all implemented in the header file will you agree to
> place it into the complib include directory?

I'd rather not, but would consider it if it was a header, though with the
following caveat: the functions should not be CL_INLINE but rather just static
inline.  Making them CL_INLINE will bloat the size of the complib DLLs.  I don't
want to see other ULPs use it, so I don't see a reason to bloat complib for a
single ULP.  Ideally, I'd like to see it stay in OSM, and eventually be
eliminated.

> That is exactly the case. The only exception is ib_copy_ca_attr which is
> a prototype that is actually never used and implemented.

You mean it isn't used or implemented in OSM, right?

Also, there was an issue about catching signals, and you were going to look at
using the SetConsoleCtrlHandler function, or even better eliminating that
functionality since the underlying access layers should handle the app exiting.
What is the status of this?

Thanks,

- Fab




More information about the ofw mailing list