[ofa-general] Further 2.6.23 merge plans...

Wed Jul 18 16:40:38 PDT 2007

On Wed, Jul 18, 2007 at 03:53:36PM -0700, Sean Hefty wrote:

> There are a couple of benefits.  The number of PR queries is reduced
> from O(n^2) to O(n).  The queries can also be done once up front,
> even started at different times if needed, rather than all at once
> at job startup.  The jobs are also able to make progress even if the
> SA dies or is unreachable.

Do you mean each node changes from O(local_cpus*nodes) -> O(nodes)?
Globally, from a cold cache start, you should still be O(n^2)?
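
To make the counting concrete, here is a rough sketch. It assumes an
all-to-all job in which every process resolves one path record to each
remote node, and a per-node cache that is shared by all local processes.
The function names are made up for illustration and are not part of any
patch.

```python
def pr_queries_per_node(nodes, local_cpus, cached):
    """PR queries one node sends to the SA (assumed all-to-all job)."""
    remote = nodes - 1
    # With a shared per-node cache, each remote node is looked up once.
    # Without it, every local process repeats the same lookups.
    return remote if cached else local_cpus * remote


def pr_queries_global(nodes, local_cpus, cached):
    """Total PR queries the SA sees from a cold start."""
    return nodes * pr_queries_per_node(nodes, local_cpus, cached)


if __name__ == "__main__":
    for cached in (False, True):
        print(cached,
              pr_queries_per_node(4, 8, cached),
              pr_queries_global(4, 8, cached))
```

Under those assumptions the per-node count loses the local_cpus factor,
but the global total is still nodes * (nodes - 1).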

> >I'm trying to say, I think a simple kernel cache itself is fine, but
> >there should be only 1 cache (get rid of ipoib) and it should have a
> >really good interface to userspace so that the really hard problems
> >can be solved through user space code.
> 
> I don't disagree, but (for now anyway) I believe that the natural
> interface for communicating with an SA related agent is a MAD
> interface based on the SA management class for the reasons I
> mentioned earlier.  But this is really talking about extensions to
> the local SA patch, rather than addressing anything fundamentally
> wrong with the current patch set.

OK - that's fine then. When you get around to doing the user space side
I'll argue for netlink :) Having written both netlink user space code
and MAD code, I can say netlink is way better!

The only other thing I'd say is that for the cache to be on by default
(i.e. included by default in distro kernels), it really needs a short
default lifetime for cached entries, as a workaround for the lack of a
coherence protocol.
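
Roughly what I have in mind, as a user space sketch rather than kernel
code: each entry carries an expiry time and a lookup treats an expired
entry as a miss, so stale data disappears without any invalidation
messages. The class name, the key format and the 5 second lifetime are
all assumptions for illustration.

```python
import time


class TTLCache:
    """Cache whose entries expire a fixed time after insertion."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (value, expiry time)

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        item = self._entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            # Too old: drop it so the caller re-queries the SA.
            del self._entries[key]
            return None
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=5.0)
    cache.put(("src", "dst"), "path record")
    print(cache.get(("src", "dst")))
```

A short lifetime bounds how long a stale path can be handed out, at the
cost of re-querying entries that were still valid.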

Jason