[ofiwg] Libfabric and defensive programming

Fri Jun 8 11:22:12 PDT 2018

> Why does there appear to be little to no defensive programming in
> libfabric?  Was this a conscious decision?

Yes, this is intentional, to keep the software code paths short.  We don't try to trap coding errors with runtime checks.

> For example, I am not seeing any checks for NULL object pointers in
> the libfabric interface functions.  They just dereference the object
> to get the provider function pointer.  If NULL does get passed in,
> the process crashes.  Because these are inline functions, they do
> not show up in a traceback which makes things very confusing to
> debug for anyone not fully familiar with libfabric source code.
> 
> Would it not be better to at least check for NULL object parameters
> in these “front-end” functions and return -FI_EINVAL?

I strongly prefer using assert() over NULL checks, particularly for fast-path operations.  Inserting a NULL check on every message transfer is just unneeded overhead.  The app can't realistically handle such a failure anyway, since it's a coding error, not a runtime error.
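
To make the trade-off concrete, here is a minimal sketch of the two styles.  The demo_* names are hypothetical stand-ins, not the real fi_* types or calls, and plain -EINVAL stands in for a libfabric error code.  It only illustrates the general pattern of an inline front-end dispatching through a provider function pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a libfabric-style object; not the real fi_* types. */
struct demo_ep;

struct demo_ops {
    int (*send)(struct demo_ep *ep, const void *buf, size_t len);
};

struct demo_ep {
    const struct demo_ops *ops;
};

/* Fast-path style: assert() documents the contract in debug builds and
 * compiles away entirely when NDEBUG is defined. */
static inline int demo_send(struct demo_ep *ep, const void *buf, size_t len)
{
    assert(ep != NULL && ep->ops != NULL);
    return ep->ops->send(ep, buf, len);
}

/* Defensive style: the branches run on every call, in every build. */
static inline int demo_send_checked(struct demo_ep *ep, const void *buf,
                                    size_t len)
{
    if (ep == NULL || ep->ops == NULL || ep->ops->send == NULL)
        return -EINVAL;
    return ep->ops->send(ep, buf, len);
}

/* Trivial provider callback, used only to exercise the wrappers. */
int demo_provider_send(struct demo_ep *ep, const void *buf, size_t len)
{
    (void)ep;
    (void)buf;
    return (int)len;
}
```

In a debug build the assert() version aborts at the call site with a file and line number, which also addresses the confusing-traceback complaint.  In a release build it costs nothing.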

However, now that we have the hook provider, if someone wanted to implement a 'defensive coding' hook to validate all parameters, I'd be fine with that.  But this isn't overhead that I would want added to all providers.
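
As a rough illustration of what such a hook could look like, here is a sketch of a validating layer that wraps another endpoint and forwards calls only after checking the arguments.  Everything here (demo_hook_ep, demo_hook_init, and so on) is hypothetical and is not the actual hook provider interface.  It assumes only the same function-pointer dispatch described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real fi_* types. */
struct demo_ep;

struct demo_ops {
    int (*send)(struct demo_ep *ep, const void *buf, size_t len);
};

struct demo_ep {
    const struct demo_ops *ops;
};

/* Unchecked inline front-end, as on the normal fast path. */
static inline int demo_send(struct demo_ep *ep, const void *buf, size_t len)
{
    assert(ep != NULL);
    return ep->ops->send(ep, buf, len);
}

/* Hook endpoint: looks like an endpoint to the app, forwards to a target. */
struct demo_hook_ep {
    struct demo_ep ep;        /* must be first so the cast below is valid */
    struct demo_ep *target;   /* the wrapped provider endpoint */
};

/* Validate parameters, then forward to the wrapped endpoint. */
static int demo_hook_send(struct demo_ep *ep, const void *buf, size_t len)
{
    struct demo_hook_ep *hook = (struct demo_hook_ep *)ep;

    if (hook->target == NULL || hook->target->ops == NULL ||
        hook->target->ops->send == NULL)
        return -EINVAL;
    if (buf == NULL && len != 0)
        return -EINVAL;
    return demo_send(hook->target, buf, len);
}

static const struct demo_ops demo_hook_ops = { .send = demo_hook_send };

/* Layer the validating hook over an existing endpoint. */
static inline void demo_hook_init(struct demo_hook_ep *hook,
                                  struct demo_ep *target)
{
    hook->ep.ops = &demo_hook_ops;
    hook->target = target;
}

/* Trivial provider callback, used only to exercise the hook. */
int demo_provider_send(struct demo_ep *ep, const void *buf, size_t len)
{
    (void)ep;
    (void)buf;
    return (int)len;
}
```

The checks cost something only when the hook is layered in, so the normal path through a provider is unchanged.  One limitation of this sketch: a NULL endpoint passed by the app never reaches the hook, because the inline front-end has to dereference it to find the hook in the first place.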

- Sean