[ofw] OpenSM with HPC

Smith, Stan stan.smith at intel.com
Fri Oct 3 08:55:54 PDT 2008


Gentlemen,
  There are really two bigger issues at play here:

1) OpenSM stability such that WinOF 2.0 can be released.

2) Scaling OpenSM/IPoIB & friends, such that > 32 HPC nodes function correctly.

Before the recent IPoIB and OpenSM patches, #1 was good enough for small numbers of nodes; that changed once we started down the path of fixing #2. WinOF 1.1 successfully tested 17 nodes with MPI stress tests (68 MPI ranks); WinOF 2.0 rc2 comes nowhere close.

As a community we need to decide the importance of scaling w.r.t. releasing WinOF 2.0.
We have been cycling around the scaling issue for over 4 weeks now.
Correctly addressing the scaling issues of OpenSM/IPoIB is a much larger task, measured in multiple months, and is not suitable for meeting WinOF 2.0 release goals.

First, I would strongly suggest we focus on the stability of OpenSM/IPoIB for <= 32 nodes, with a registry entry that controls whether IPoIB queries OpenSM for path records (the behavior before the "avoid the SM" patch) or not (the behavior with the "avoid the SM" patch). The default registry setting would be to query (the pre-patch behavior).
We had discussed a registry entry control, but somehow it was never implemented.
Would those involved with patching IPoIB please address this?
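
To make the proposal concrete, here is a rough sketch of the decision such a registry entry would drive. It is only an illustration: the value name "QuerySmForPathRecords", the helper names, and the dict standing in for the registry are all hypothetical and not taken from the WinOF sources.

```python
# Hypothetical sketch: every name here is illustrative, not taken from WinOF.
QUERY_SM_DEFAULT = 1  # default: query OpenSM for path records (pre-patch behavior)


def should_query_sm(registry):
    """Return True if IPoIB should ask OpenSM for path records.

    `registry` is a plain dict standing in for the driver's registry
    parameters; "QuerySmForPathRecords" is an assumed value name.
    """
    value = registry.get("QuerySmForPathRecords", QUERY_SM_DEFAULT)
    return int(value) != 0


def resolve_path(registry, query_sm, build_locally):
    """Pick the path-record source according to the registry setting."""
    if should_query_sm(registry):
        return query_sm()       # behavior before the "avoid the SM" patch
    return build_locally()      # behavior with the "avoid the SM" patch
```

With no entry present the sketch falls back to querying the SM, which matches the proposed default; setting the value to 0 selects the "avoid the SM" path instead.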

Second, we verify stability of WinOF 2.0 (rc3) and release WinOF 2.0.

Third, we figure out who is going to take charge of scaling OpenSM/IPoIB and then discuss how this is to be accomplished.

How say you?

Stan.





Yevgeny Kliteynik wrote:
> Hi Anatoly,
>
> Missed Hal's response when I wrote mine.
> As Hal says, the Windows SM version is old. IMHO, fixing problems like
> the one you've reported is not even an option, because it will involve
> serious changes to existing features or implementing additional ones,
> which was already done in the Linux OpenSM.
> So I can only reiterate what Hal has already said - OpenSM in Windows
> needs to be updated to the latest and greatest.
>
> -- Yevgeny
>
>
> Yevgeny Kliteynik wrote:
>> Hi Anatoly,
>>
>> I need more details:
>>
>> Anatoly Greenblatt wrote:
>>> Hi,
>>>
>>> Our client reported problems running over 192 concurrent jobs with
>>> OpenSM.
>>
>> What kind of cluster does your client have?
>> How many hosts? How many switches?
>>
>> What do these jobs do?
>> Are these MPI jobs? Do they use/create multicast groups? Something
>> else? How many processes does each job have?
>>
>>> The jobs are executed several times. After a while the memory usage
>>> of OpenSM goes to ~30MB, cpu usage to 100% and eventually the node
>>> freezes and needs to be reset.
>>
>> Is the problem reproducible?
>> Can you send me the SM log?
>>
>> -- Yevgeny
>>
>>>
>>>
>>> Configuration:
>>>
>>> WinOF rev 1596 (~rc1)
>>>
>>> ConnectX HCA
>>>
>>> Windows 2008 x64 with HPC pack rc2
>>>
>>> NetworkDirect is installed
>>>
>>> OpenSM is running as a service on the head node.
>>>
>>> About a hundred nodes are used (maybe more, I don't have the exact
>>> number yet)
>>>
>>>
>>>
>>> Has anyone any thoughts about this?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Anatoly.
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> ofw mailing list
>>> ofw at lists.openfabrics.org
>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw



