[ofw] FYI - WinOF RC4 build status - waiting on patch review & commit.

Smith, Stan stan.smith at intel.com
Tue Oct 21 15:48:27 PDT 2008


Tzachi Dar wrote:
> Since this patch is changing a very sensitive area in IPOIB there are
> three things that I would like to ask:
>
> 1) Taking into consideration that even without this patch there is a
> cluster with 2000 nodes, how important is this patch?

Any method by which we can reduce transaction pressure on the SA is good for large MPI job startup. OFED testing on large clusters demonstrated that SA transaction rates were a limiting factor in MPI job startup times at large node counts. ARP reply processing times followed closely behind SA transaction times in terms of cost.

I'm not familiar with the 2000-node system you speak of, or with what was actually accomplished on the 2000 nodes. An MPI job, I suspect?

Would a patch such as this reduce MPI startup time by some factor (at least 2000 fewer SA query operations)?
Reducing startup time by minutes is very good; reducing it by a few seconds is interesting but not so important.
I don't know if this patch would have that kind of effect on a large system; you or the Voltaire patch developers would know better.


>
> 2) Will it be possible to check this in only to the trunk and not to
> the branch?

Certainly possible. I see the question as: where does one believe the highest degree of testing will occur, the release branch or the mainline?

>
> 3) How much testing did Voltaire do with this patch?

Since the patch uses already-acquired local MAD information instead of an SA query, and that MAD information has been used before, there is some degree of confidence in the MAD data.
What's the possibility the data has gone bad?

What problems are there in getting the data to the correct consumers?

In my limited understanding, it should be fairly easy to determine whether the patch is working or not; yes?

I do not know the extent of the testing that was applied.
Perhaps Voltaire developers can enlighten us?

The WinOF release members are ready to start digesting RC4; the point being that if/when you feel the patch is good to go, others can assist in testing.


From a WinOF 2.0 release schedule point of view:

1) What problems are resolved by including this patch?
2) Do those problems merit further delay in the WinOF 2.0 release?
3) How long will it take to verify patch correctness?

Your questions and the WinOF release schedule impact questions can all be discussed in the upcoming WWG meeting on Wednesday.

Thank you for the good questions.
Looking forward to a lively discussion.

Stan.



>
> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: ofw-bounces at lists.openfabrics.org
>> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
>> Sent: Tuesday, October 21, 2008 6:56 PM
>> To: ofw at lists.openfabrics.org
>> Subject: [ofw] FYI - WinOF RC4 build status - waiting on
>> patch review &commit.
>>
>>
>> Waiting for review & commit of 'Using ib_local_mad instead of
>> SM query' patch.
>>
>> Stan.
>> _______________________________________________
>> ofw mailing list
>> ofw at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
