[openib-general] gen2 dev branch

Yaron Haviv yaronh at voltaire.com
Thu Jul 29 08:44:57 PDT 2004


On Thursday, July 29, 2004 5:31 PM, Roland Dreier wrote:
>     Yaron> The rather simplistic QP1 approach suggested by Roland
>     Yaron> cannot work for others who did implement and built on top
>     Yaron> of functionality such as RMPP and Redirect, not to mention
>     Yaron> its poor scalability that cannot service the deployments we
>     Yaron> are used to.
> 
> Why can't it work?  I think the fact that Topspin's drivers work on
> 500+ node fabrics with such a simple-minded MAD layer is a strong
> argument against other layers, which look over-engineered and
> inflexible to me. 

As far as I know, you are not working with multipathing, with multiple
HCAs per server, or with many CPUs per node in configurations with many
nodes. That requires multiple SA consumers, RMPP for GetTable and
MultiPathQuery, and very high SA performance. I also don't know what
you implemented for Traps and Report support, or how you use it in such
large configurations.
I don't see where the over-engineering is in our/Todd's suggestion.
There are only a few params in the API; we just suggest putting more
things in a common place, while you are suggesting scattering them all
over, which takes us backwards.
I do know it is different from your implementation; maybe that is the
reason for the strong resistance :)
The gsi.h file was already changed to accommodate your genuine memory
management concerns. We appreciate any productive suggestions, and some
of the work we are doing now is adapting the implementation to it.

(A few numbers: our SM manages configurations with thousands of ports;
we have systems with 512 CPUs in a single system and dozens of HCAs in
each; and we are staging a cluster with over 10,000 CPUs, if you read
the news. So we may have learned one or two things about IB scalability
and efficiency the hard way. It's not all about how the code looks.)
 
>     Yaron> I didn't here from Roland any reason why the proposed gsi
>     Yaron> doesn't answer his ULP's functional requirements, I believe
>     Yaron> it does, and he can benefit from the
>     Yaron> added value in future as well.
> 
> I'm very concerned about the lack of flexibility in the proposed API.
> For example, only allowing one manager to register per GSI class
> wouldn't work well for a subnet manager that wants to handle
> multicast queries in one thread and path record requests in another
> thread.   

First, that's not how OpenSM works, or is planned to work in the near
future, and it is more important to make the simple use cases more
efficient by eliminating additional filtering and extra copies. But as I
mentioned before, the attribute field can be added if you really need it
for your ULPs (it does not change the gsi model, it just adds a param).
I'm open to hearing other real use cases you feel are not met by our
proposal, and why you think it is complex.
I think the TID-based demux for clients is much simpler than the
multi-filter, multi-copy approach you suggest.
The Redirect code can be put under #if 0 initially if you are concerned
about having it there.
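
To make the TID-based demux point concrete, here is a minimal sketch of
the idea, not the actual gsi code. It assumes the 64-bit MAD
TransactionID is split so that the upper 32 bits carry a client id and
the lower 32 bits are left to the client. That split, the table size,
and every name below (register_client, make_tid, dispatch_response,
count_recv) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLIENTS 16 /* hypothetical table size */

/* Hypothetical per-client receive callback. */
typedef void (*mad_recv_fn)(uint64_t tid, void *context);

struct mad_client {
    mad_recv_fn recv;
    void *context;
};

static struct mad_client clients[MAX_CLIENTS];

/* Claim the first free slot; returns the client id, or -1 if full. */
static int register_client(mad_recv_fn recv, void *context)
{
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].recv == NULL) {
            clients[i].recv = recv;
            clients[i].context = context;
            return i;
        }
    }
    return -1;
}

/* Build a TID: client id in the upper 32 bits (an assumption of this
 * sketch), client-chosen sequence number in the lower 32 bits. */
static uint64_t make_tid(int client_id, uint32_t seq)
{
    assert(client_id >= 0 && client_id < MAX_CLIENTS);
    return ((uint64_t)(uint32_t)client_id << 32) | seq;
}

/* Route a response with one shift and one table lookup; no per-client
 * filter chain and no copy of the MAD per candidate consumer.
 * Returns 0 if delivered, -1 if no such client is registered. */
static int dispatch_response(uint64_t tid)
{
    uint32_t id = (uint32_t)(tid >> 32);

    if (id >= MAX_CLIENTS || clients[id].recv == NULL)
        return -1;
    clients[id].recv(tid, clients[id].context);
    return 0;
}

/* Example callback: counts responses delivered to one client. */
static void count_recv(uint64_t tid, void *context)
{
    (void)tid;
    (*(int *)context)++;
}
```

A client would call register_client() once, stamp each outgoing request
with make_tid(id, seq), and get its responses through the callback.
Unsolicited MADs (requests, Traps) carry no client id in the TID, so
they would still need the per-class registration discussed above.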

Also, can you describe how we can test our cleaned-up gsi version with
mthca and the new core? Will you put it in the tree as Sean suggested?
Is there a .h file with additional access APIs beyond verbs that you can
point us to?

Yaron  