[openib-general] Re: [PATCH] kDAPL: convert the ep list to linux native

Tue Jun 14 19:09:16 PDT 2005

On 6/14/05, Christoph Hellwig <hch at lst.de> wrote:
> On Tue, Jun 14, 2005 at 05:20:39PM -0400, James Lentini wrote:
> > >Purely a style issue.
> >
> > Is it a personal style issue or a Linux kernel style issue?
> 
> It's a personal interpretation of the Linux style rule "don't make
> things complicated if they can be done simple and less confusing".
> 

These are attributes that are required for all sub-types that comply
with a parent type. The original code makes the every so slight
mistake of including the IA in that list of types, which has the
horrendous defect of having an IA pointer in the IA itself. 

It would be guaranteed to drive a Java programmer insane,
but most C programmers would figure "How many IA objects
are there anyway. what's the harm in having one meaningless
field in a type that there are only a handful of instances of?
I'll declare one fewer type and "waste" the pointer space,
I'll probably make it up in code space."

Are you implying that implementing the same requirement
multiple time as though it were not a common requirement
somehow makes the code "simpler"?

*ALL* objects created under an IA are destroyed when
the IA is destroyed. If that is going to be implemented by
a linked list through the header then that field truly is
common.

*ALL* DAT objects are required to be thread-safe, if the
library advertises itself as thread-safe. How is placing
the same locking mechanism in each type making the
code "simpler"?

Having a common subheader makes it easier to have
utility code for purposes like lock/unlock. That is why
such common headers are *frequently* used to *simplify*
code and increases it's readability. It helps the reader
know whether they are seeing something unique to
this specific type or something that applies to all DAT
objects.

> Look at the structure, there's:
> 
>   struct dapl_ia *owner_ia;
>
> this insn't actually usefull in all users of dapl_common, and it's easy
> enough to use without it.  No polymorphism argument either because you
> could just pass ->owner_ia to whatever function expects it.
>

Fetching the hca handle is a step common to almost all verb layer
actions taken on all objects. Is there some reason why this information
should be fetched differently for each type?

Actually the common struct should really have the hca handle itself
rather than the ia pointer, since the hca handle can be safely cached
and doing so would improve the efficiency of the code by reducing
one more unneeded memory fetch. The number of memory fetches
is *the* metric of efficiency for RDMA.

>   spinlock_t lock;
> 
> now in every subsystem I looked at putting a lock into every single
> object was a mistake.  You end up with a something as slow and
> undebuggable as Solaris if you do that.  Note sure if it's true for the
> current kDAPL codebase, but in the end there shouldn't be a lock needed
> in every structure.
>

There actually is a solution to that in DAT. It allows the DAT Consumer
to hold the lock rather than individual DAT objects. It is an option that
I would actually expect *most* kernel clients to prefer.

That's the meaning of the "thread safety" flag, it documents *who*
is responsible for making access to each DAT object thread safe:
the DAT Provider or the DAT Consumer. 

Doing an unnecessary spinlock, particularly in an SMP kernel, is
a major destroyer of efficiency. That is why DAT was designed
so that the Consumer could load a "thread optimized" library
(where it was responsible for serializing access to each object)
or a "thread safe" library where the DAT Provider would do so.

But even when the latter is needed the impact of a spinlock
is lessened when the scope of the lock is no wider than needed.
Fastpath operations such as dat_ep_post_send should only 
lock its own send queue. Locking some other object, like the
IA, would cause unneeded blockages. With the lock being
on the scale of the EP, 4 different processors can be concurrently
posting to four different EPs without interference. If you "safe"
that lock by moving it to the IA you'll limit the system to doing
one post at a time no matter how many processor are available.

>   unsigned long flags;              /* saved lock flag values */
> 
> now this one is completely wrong.  the irqflags must always be restored
> in the same function they're used in, and the canoncical way to do that
> is to put them onto the stack, as everyone in the kernel does.  putting
> them in some object on the heap is a total wate of memory and encourages
> using the locks at different levels which is wrong.
> ____

That one I agree with.