[Users] OpenSM high cpu usage on ESXi

Raphaël SCHITZ raphael at schitz.net
Fri Apr 5 17:24:08 PDT 2013


Hi,

To start practicing InfiniBand in my personal home lab, I managed to compile OpenSM for ESXi, so that I can avoid buying an expensive switch and wire two HP ML110 servers back-to-back with Mellanox ConnectX cards. The trick is to compile the binary on CentOS 3.9 i386, which makes it usable on ESXi, but I had to change some device paths in the OpenSM source files (/sys/class/infiniband to /proc/infiniband, and /dev/infiniband to /dev).

It's working, but I have two issues that might be related.

First, the CPU usage of two of the OpenSM processes is too high (almost 100% each), which makes me think of a busy loop or something similar.
Second, I get a constant, massive flow of this error in opensm.log: [2F66AB90] 0x01 -> umad_receiver: ERR 5404: recv error on MAD sized umad (Resource temporarily unavailable)
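
"Resource temporarily unavailable" is the standard text for the EAGAIN errno, which a read returns when it would otherwise have to block. The sketch below is only an illustration of why the two symptoms could be linked, not a diagnosis: it assumes the receive path is effectively non-blocking on this ESXi port, and it uses an ordinary pipe as a stand-in for the umad device. The helper names count_eagain and wait_then_read are hypothetical and do not come from OpenSM or libibumad.

```python
import errno
import os
import select


def count_eagain(fd, attempts):
    """Retry a non-blocking read back-to-back; count the EAGAIN failures."""
    failures = 0
    for _ in range(attempts):
        try:
            os.read(fd, 256)
        except BlockingIOError as exc:
            # EAGAIN is the errno behind "Resource temporarily unavailable".
            if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                failures += 1
    return failures


def wait_then_read(fd, timeout_s):
    """Sleep in select() until fd is readable, then read; None on timeout."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    if not readable:
        return None
    return os.read(fd, 256)


if __name__ == "__main__":
    # Assumption: a pipe stands in for the umad device; it is NOT the real one.
    r, w = os.pipe()
    os.set_blocking(r, False)
    print("immediate EAGAIN failures:", count_eagain(r, 1000))
    print("after waiting with no data:", wait_then_read(r, 0.1))
    os.write(w, b"mad")
    print("after data arrives:", wait_then_read(r, 1.0))
    print("errno text:", os.strerror(errno.EAGAIN))
```

If the receive loop behaves like count_eagain, every failed read returns at once, so the thread would both spin at full CPU and log one error per iteration. A loop shaped like wait_then_read sleeps until data is available instead.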

Could someone help me understand and solve this?

Thanks
RS