[ofa-general] Multi-threaded diags (Was: Re: [PATCH 4/5] infiniband-diags/libibnetdisc: Introduce a context object.)

Fri Sep 18 15:22:22 PDT 2009

On Thu, 27 Aug 2009 12:20:56 -0600
Jason Gunthorpe <jgunthorpe at obsidianresearch.com> wrote:

> On Thu, Aug 27, 2009 at 09:48:10AM -0700, Ira Weiny wrote:
> 
> > > FSM multiplexing the recv path usually gives much better performance,
> > > something like net discovery is quite easy..
> > 
> > Using the original algorithm and data structures lent itself to
> > threading.  Now that I am neck deep in all this I have thought that
> > rewriting it all might be easier.
> 
> Yah. mayhaps..
> 
> > > main loop:
> > >  fill tx queue from next list
> > >  receive replies and correlate with next list
> 
> > This would still need additional code (or additional synchronization in the
> > API to libibnetdisc) if you wanted a user app to be multi-threaded.  Someone
> > has to be in charge of receiving all replies on that ibmad_port object and
> > handing them to the proper owner.  Of course one could open multiple
> > ibmad_port objects but how is the app writer to know to do that?  Digging
> > through the code to find out that libibnetdisc is consuming all the replies?
> 
> What is the use case here? I thought the app would be something like:
> 
> main()
> {
>   foo = libibnetdisc_setup();
>   libibnetdisc_discover_all(foo,res);
>   // Do interesting things with res.
> } 

That is the current use case.  However I can see use cases where discover is
called periodically to get a new snapshot of the fabric.  Also since the
discover can scan parts of the fabric ("libibnetdisc_discover_part") and
return a fabric which represents pieces of the whole I could see "fabric"
operations such as merge, update, and replace.

> 
> Where the goal is to have libibnetdisc_discover_all complete
> expediently.
> 
> As long as the context 'foo' is re-entrant in all ways with all other
> libraries and contexts I think useful threaded apps can be created.

Yes absolutely.  However, my current issue is with making ibmad_port thread
safe so that libibnetdisc_discover can be multithreaded.  I have been able to
do so but the amount of code it took seems unreasonable to force upon any
users of libibmad.

> 
> > This is what got me on this in the first place.  smp_query_via
> > (_do_madrpc) is not thread safe. 
> 
> Sure, the entire library is not thread safe around the ibmad_port
> context. But who cares? If the caller to libibnetdisc wants to thread
> that way they need to open another context.

Yes, they can, but how do they know they need to do this?  Furthermore, how
many contexts are required?  The bottom line is that I wanted multiple
outstanding queries, and I am not going to open a context for each query.  The
code required to track and sort Transaction IDs should be provided by libibmad
or a layer at that level.  It should not be required of every user process or
user lib.  Furthermore, my prototype code does not support redirect yet, and
adding that will make the code even more difficult.  Why make every user
suffer this problem?
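
To make the Transaction ID bookkeeping concrete, here is a minimal sketch of
the kind of table every caller would otherwise have to write.  All names
(tid_table, tid_register, tid_dispatch, count_reply) are hypothetical and do
not exist in libibmad; the sketch does no locking and does not touch umad at
all, it only shows matching a reply's TID back to whoever issued the query.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OUTSTANDING 64

/* Hypothetical callback type: invoked when the reply for 'tid' arrives. */
typedef void (*reply_cb)(uint64_t tid, const void *mad, void *arg);

struct pending {
    int in_use;
    uint64_t tid;
    reply_cb cb;
    void *arg;
};

struct tid_table {
    uint64_t next_tid;              /* 0 is reserved to mean "table full" */
    struct pending slot[MAX_OUTSTANDING];
};

void tid_table_init(struct tid_table *t)
{
    size_t i;

    t->next_tid = 1;
    for (i = 0; i < MAX_OUTSTANDING; i++)
        t->slot[i].in_use = 0;
}

/* Reserve a TID for a query about to be sent.  Returns 0 if no slot is free. */
uint64_t tid_register(struct tid_table *t, reply_cb cb, void *arg)
{
    size_t i;

    assert(cb != NULL);
    for (i = 0; i < MAX_OUTSTANDING; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].in_use = 1;
            t->slot[i].tid = t->next_tid++;
            t->slot[i].cb = cb;
            t->slot[i].arg = arg;
            return t->slot[i].tid;
        }
    }
    return 0;
}

/* Hand a received reply to its owner.  Returns 1 if the TID was pending,
 * 0 if it is unknown (late, duplicate, or someone else's reply). */
int tid_dispatch(struct tid_table *t, uint64_t tid, const void *mad)
{
    size_t i;

    for (i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->slot[i].in_use && t->slot[i].tid == tid) {
            struct pending p = t->slot[i];

            t->slot[i].in_use = 0;  /* free the slot before calling out */
            p.cb(tid, mad, p.arg);
            return 1;
        }
    }
    return 0;
}

/* Example callback: counts replies in the int that 'arg' points to. */
void count_reply(uint64_t tid, const void *mad, void *arg)
{
    (void)tid;
    (void)mad;
    (*(int *)arg)++;
}
```

Even this toy version leaves out timeouts, retries, redirect, and thread
safety, which is where most of the real code goes.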

> 
> > Also, I feel that someone down the road might fall into the same
> > trap that I did thinking that smp_query_via is thread safe and I
> > would like to fix that.
> 
> Well.. How can it be threaded? umad_send/umad_recv are inherently
> single threaded APIs. You have to layer a TID based threading dispatch
> mechanism on top of it. Much better to let the kernel do that and open
> multiple umad fds.

I am a bit confused.  Do you mean to open multiple umad fds such that the
kernel will do the TID based dispatch for you?  Or are you suggesting a
different kernel umad implementation?

>  
> > > each entry:
> > >  add to next list additional ports
> > > 
> > > Repeat until dead.
> > > 
> > > Where a 'next list' would be a set of actions along the lines of
> > > 'query node' or 'query port' the action on a 'query node' completion
> > > is to generate 'query port' next list items for all the ports, and on
> > > 'query port' completion is to generate 'query node' items for all
> > > enabled ports..
> > > 
> > > libumad is nonblocking, parallel, etc...
> > 
> > Yes, and libibmad layers on top of it an easier interface to issue common
> > queries.  Why should we ask the user to re-implement that code?
> 
> Well, the very best way to do this is to have a FSM engine API at the
> core of the MAD library:
>   mad_ctx->callback = done_this;
>   mad_post(mad,mad_ctx)
> 
> done_this(reply):
>   ...

Which way do you propose to do this: have a thread call "done_this", or have
the user call an event loop?
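
For the event-loop option, a rough single-threaded sketch of what I have in
mind is below.  The names mad_engine, mad_engine_init, and mad_poll are
hypothetical, and mad_post/mad_ctx/done_this are borrowed from the pseudocode
above rather than from any existing libibmad call.  Replies are simulated by
completing queued requests in order and echoing the request value back;
nothing here reads from umad.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_DEPTH 16

/* Hypothetical per-request context carrying the completion callback. */
struct mad_ctx {
    void (*callback)(struct mad_ctx *ctx, int reply);
    void *arg;      /* caller's private data */
    int request;    /* stand-in for the MAD being sent */
};

/* Hypothetical engine: a fixed-size ring of posted requests. */
struct mad_engine {
    struct mad_ctx *queue[QUEUE_DEPTH];
    size_t head;
    size_t count;
};

void mad_engine_init(struct mad_engine *e)
{
    e->head = 0;
    e->count = 0;
}

/* Queue a request.  Returns 0 on success, -1 if the queue is full. */
int mad_post(struct mad_engine *e, struct mad_ctx *ctx)
{
    assert(ctx->callback != NULL);
    if (e->count == QUEUE_DEPTH)
        return -1;
    e->queue[(e->head + e->count) % QUEUE_DEPTH] = ctx;
    e->count++;
    return 0;
}

/* One turn of the event loop: complete at most one request and run its
 * callback in the caller's thread.  Returns 1 if a callback ran, 0 if idle. */
int mad_poll(struct mad_engine *e)
{
    struct mad_ctx *ctx;

    if (e->count == 0)
        return 0;
    ctx = e->queue[e->head];
    e->head = (e->head + 1) % QUEUE_DEPTH;
    e->count--;
    ctx->callback(ctx, ctx->request);   /* simulated reply: echo the request */
    return 1;
}

/* Example callback: store the reply in the int that 'arg' points to. */
void done_this(struct mad_ctx *ctx, int reply)
{
    *(int *)ctx->arg = reply;
}
```

The user would then write "while (mad_poll(&e)) ;" after posting.  With the
thread option, the library would run that loop itself and the callbacks would
need their own synchronization.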

> 
> > For example, mad_rpc now handles redirection.  My implementation
> > does not yet.  So now I have to handle that on my own as well...
> > :-(
> 
> To be honest, I don't like the libibmad/libibumad APIs one bit - I'm
> not surprised they don't work for you..
> 
> Frankly, we really need a usable MAD library with sane APIs, and very
> high level APIs on top of that. You cannot make an IB application
> without doing SA queries at a minimum and the current process is
> HORRID.
> 
> I see nothing of value in libibmad and libibumad to support that :|

I see some things of value in libibmad.  However, I have been reluctant to use
it in the past and I agree it needs fixing.  I don't want to reinvent the
wheel but perhaps that is what needs to be done...

Ira

-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov