[Openframeworkwg] source code for proof of concept framework available

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Mon Dec 9 13:48:39 PST 2013


On Mon, Dec 09, 2013 at 09:16:30PM +0000, Hefty, Sean wrote:

> > - I always hated that sockets exposed the network byte order
> >   to apps, IMHO that should be avoided unless it is on a high speed
> >   path ..
> 
> I'm indifferent myself, but I do care that the interface clearly
> conveys which order the data is used (for immediate data, rkeys,
> etc.)

It is a tricky bit, but, e.g., rkeys are integers, so they should always be
in host order in the API calls; immediate data is accessed as a
uint32, not a void *, so it should be in host order too, etc.

Byte swaps for that kind of stuff are just more places where people can
make an error and never detect it until they test on a BE/LE mix.

I like the rule that if a library presents something as a uintxx_t then it
is host order. If it needs to present something as network order then
it is binary data and it is a void *.
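
A minimal sketch of that rule, assuming a made-up helper pair
(demo_pack_rkey and demo_unpack_rkey are hypothetical names, not part of
any existing API): the integer argument is always host order, the
network-order form only ever appears behind a void *, and the library
is the only place that swaps.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the rkey is an integer, so the caller passes it
 * in host order. The wire form is opaque bytes behind a void *. */
static void demo_pack_rkey(void *wire, uint32_t rkey_host)
{
    uint32_t be = htonl(rkey_host);   /* the only byte swap, library side */

    assert(wire != NULL);
    memcpy(wire, &be, sizeof(be));    /* memcpy avoids alignment assumptions */
}

/* Hypothetical helper: reads 4 network-order bytes, returns host order. */
static uint32_t demo_unpack_rkey(const void *wire)
{
    uint32_t be;

    assert(wire != NULL);
    memcpy(&be, wire, sizeof(be));
    return ntohl(be);
}
```

The app never sees a uint32_t holding network-order data, so there is
nothing for it to forget to swap.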

> > - fe_errno.h ... either have sane library
> >   specific unique error constants and xlate the kernel syscall errnos
> >   to the correct context specific error code,
> >   or stick with POSIX errno codes everywhere..
> 
> My main issue with errno values are that practically everything gets
> mapped to EINVAL, and ends up being useless for debugging.  So, I
> went with an 'extended' errno scheme.  I use errno values where
> reasonable, but allow for additional values beyond those defined in
> errno.h.  I also added places for vendor error codes.

Yes, that is what I guessed you were doing. I personally would xlate.

So call the kernel's ibv_foo_bar and then if it fails translate the
errno into something sane and context specific..

And yes, the kernel side has problems here lacking well defined
errnos, but that can be incrementally fixed up, IMHO.
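
A rough sketch of what that translation could look like for one call.
Everything here is an assumption for illustration: the demo_reg_* names
are hypothetical, and the errno-to-code mapping is made up rather than
taken from what the kernel actually returns for any verbs call.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical context-specific error codes for a single call. */
enum demo_reg_err {
    DEMO_REG_OK = 0,
    DEMO_REG_NO_MEMORY,
    DEMO_REG_BAD_ARGUMENT,
    DEMO_REG_BAD_RANGE,
    DEMO_REG_NO_PERMISSION,
    DEMO_REG_UNKNOWN        /* keep last: anything not yet specified */
};

/* Translate the errno left by a failed syscall into the code that
 * makes sense for this particular call. The mapping is an assumption. */
static enum demo_reg_err demo_reg_xlate(int sys_errno)
{
    switch (sys_errno) {
    case 0:      return DEMO_REG_OK;
    case ENOMEM: return DEMO_REG_NO_MEMORY;
    case EINVAL: return DEMO_REG_BAD_ARGUMENT;
    case EFAULT: return DEMO_REG_BAD_RANGE;
    case EPERM:
    case EACCES: return DEMO_REG_NO_PERMISSION;
    default:     return DEMO_REG_UNKNOWN;
    }
}

/* Human-readable text for each code, in enum order. */
static const char *demo_reg_strerror(enum demo_reg_err e)
{
    static const char *const msg[] = {
        "success",
        "not enough memory to register the region",
        "invalid argument",
        "address range is not mapped",
        "permission denied",
        "unspecified error",
    };

    assert((unsigned)e <= DEMO_REG_UNKNOWN);
    return msg[e];
}
```

The default case is where the incremental fix-up would happen: each
time a kernel errno turns out to matter, it gets its own case.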

Or said another way - either the error code is only to be passed to
fi_strerror() and may as well be opaque to the application, or it is well
defined and every single call has a well defined list of errors it is
allowed to return and what they mean - consider that POSIX does this for
every single POSIX system call.

A reasonable path might be to have them be opaque and then define and
specify as a need is discovered.

> > - fi_atomic and fi_arch can probably use gcc intrinsics - introducing the
> >   multi-threaded memory model in C++11 caused gcc to gain a full set
> >   of memory barrier and atomic builtins for C code.
> 
> fi_arch was simply copied from libibverbs -- I haven't done anything
> with it.  It should be removed or moved internal for provider use
> only. 

Right

> fi_atomic is a placeholder for defining atomic operations
> over the network.

K
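
For reference, a small sketch of the gcc builtins meant in the quoted
point about intrinsics, assuming gcc 4.7 or later (or a compatible
clang). These are local CPU atomics and barriers, not the network
atomics fi_atomic is a placeholder for, and the demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Atomic add; returns the value the counter held before the add. */
static uint32_t demo_fetch_add(uint32_t *ctr, uint32_t v)
{
    return __atomic_fetch_add(ctr, v, __ATOMIC_RELAXED);
}

/* Standalone full barrier, the kind of thing fi_arch style macros provide. */
static void demo_full_barrier(void)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* Producer: write the data, then set the flag with release ordering so
 * the data write cannot be reordered after the flag write. */
static void demo_publish(uint32_t *flag, uint32_t *data, uint32_t val)
{
    *data = val;
    __atomic_store_n(flag, 1, __ATOMIC_RELEASE);
}

/* Consumer: returns 1 and copies the data if the flag is set, else 0.
 * The acquire load pairs with the release store in demo_publish(). */
static int demo_consume(uint32_t *flag, const uint32_t *data, uint32_t *out)
{
    assert(out != NULL);
    if (!__atomic_load_n(flag, __ATOMIC_ACQUIRE))
        return 0;
    *out = *data;
    return 1;
}
```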

> > The downside of having 'ops' pointers neatly organized into
> > functionally group'd structures is now you pay a double dereference
> > cost at every call site, verbs had only a single dereference.
> 
> I agree.
> 
> The trade-off is that the ops pointers can reference static
> structures, which should always be a cache hit.  Verbs places the
> ops in dynamically allocated memory.  So, we reduce the memory
> footprint, and can leave ops NULL where entire sets of operations
> are not supported.

If you keep them as static structures you are going to have even more
overhead:

function(obj):
  if (includes_function(obj->size, function))  // 1 deref and if
    struct ops *ops = obj->ops;                // 1 deref
    if (ops->function != NULL)                 // 1 deref and if
        ops->function(obj,..);

If you make them dynamic and are very clever then the library itself
can guarantee that every array entry is callable, and the above is
just

function(obj):
  return obj->function(obj,..);

This would require telling the library what version of the API the
application expects so it can allocate dummy 'return ENOSYS' slots for
unsupported entries, which is doable with the right inlines.

The standardized calling convention of 'return int' makes that
possible.
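
A sketch of how that could look, under stated assumptions: the demo_*
names, the two-entry op list, and passing the application's expected
slot count to the create call are all hypothetical, standing in for
whatever versioning scheme the real library would use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_obj;
typedef int (*demo_fn)(struct demo_obj *obj, int arg);

/* Hypothetical op indexes known to this version of the API. */
enum { DEMO_OP_SEND, DEMO_OP_RECV, DEMO_OP_COUNT };

struct demo_obj {
    int sent;
    size_t nops;
    demo_fn ops[];      /* one slot per op the app was compiled against */
};

/* Dummy slot: same 'return int' convention as every real op. */
static int demo_enosys(struct demo_obj *obj, int arg)
{
    (void)obj;
    (void)arg;
    return ENOSYS;
}

/* Example provider op. */
static int demo_send_impl(struct demo_obj *obj, int arg)
{
    obj->sent += arg;
    return 0;
}

/* Library side: app_nops is what the app expects, prov/prov_nops is
 * what the provider implements. Every slot ends up callable. */
static struct demo_obj *demo_obj_create(size_t app_nops,
                                        const demo_fn *prov, size_t prov_nops)
{
    struct demo_obj *obj;
    size_t i;

    assert(prov != NULL || prov_nops == 0);
    obj = malloc(sizeof(*obj) + app_nops * sizeof(demo_fn));
    if (!obj)
        return NULL;
    obj->sent = 0;
    obj->nops = app_nops;
    for (i = 0; i < app_nops; i++)
        obj->ops[i] = (i < prov_nops && prov[i]) ? prov[i] : demo_enosys;
    return obj;
}

/* App-facing inlines: one deref, no size check, no NULL check. */
static inline int demo_send(struct demo_obj *obj, int arg)
{
    return obj->ops[DEMO_OP_SEND](obj, arg);
}

static inline int demo_recv(struct demo_obj *obj, int arg)
{
    return obj->ops[DEMO_OP_RECV](obj, arg);
}
```

The cost moves to object creation: the table is filled once per object
instead of being checked at every call site.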

> I guess an app could always declare their own function pointer and
> use it to avoid the double deref.

If you can make it so every function pointer is callable then this is
fairly doable in an app:

   auto fn = obj->function;
   while(1)
       fn(obj,...);

But if the inline has other stuff going on then it doesn't work so
well..

And the above isn't going to be optimizable along the lines of what
Christoph was talking about.

> > Also, the complete loss of static type safety by having 'fid_t' be the
> > only argument type seems like a big negative to me..
> 
> I have a note to look at changing this, including possibly removing
> the fid_t data type completely, but that's a significant change, so
> I've been putting it off.

Gets harder the longer you wait :)

Jason


