[ofw] OpenSM with HPC

Anatoly Greenblatt anatolyg at voltaire.com
Thu Oct 2 04:52:11 PDT 2008


Hi,

Our client reported problems running over 192 concurrent jobs with
OpenSM. The jobs are executed several times. After a while, the memory
usage of OpenSM grows to ~30MB, CPU usage reaches 100%, and eventually
the node freezes and needs to be reset.

Configuration:

WinOF rev 1596 (~RC1)

ConnectX HCA

Windows 2008 x64 with HPC Pack RC2

NetworkDirect is installed

OpenSM is running as a service on the head node.

About a hundred nodes are used (maybe more; I don't have the exact
number yet)

Does anyone have any thoughts on this?

Thanks,

Anatoly.