[ewg] Huge size for opensm process (5.7GB)

Doug Ledford dledford at redhat.com
Mon Sep 30 11:26:24 PDT 2013


On 09/26/13 15:28, Sandeep Dhavale wrote:
> Hello,
> 
> I have a setup where 2 nodes are connected back to back. So my subnet is
> basically 2 nodes.
> 
> This is a RHEL 6.3 setup, and the installed opensm package is:
> 
> 
> [root@intel-eva1 ~]# rpm -qi opensm
> Name        : opensm                       Relocations: (not relocatable)
> Version     : 3.3.13                            Vendor: Red Hat, Inc.
> Release     : 1.el6                         Build Date: Tue 28 Feb 2012 07:53:06 PM PST
> Install Date: Thu 04 Jul 2013 01:44:25 AM PDT      Build Host: x86-003.build.bos.redhat.com
> Group       : System Environment/Daemons    Source RPM: opensm-3.3.13-1.el6.src.rpm
> Size        : 1317469                          License: GPLv2 or BSD
> Signature   : RSA/8, Wed 30 May 2012 11:14:47 AM PDT, Key ID 199e2f91fd431d51
> Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
> URL         : http://www.openfabrics.org/
> Summary     : OpenIB InfiniBand Subnet Manager and management utilities
> 
>  
> 
> The opensm process has grown to 5.7GB of virtual memory. I do not think
> this is normal for a two-node subnet.
> 
>  
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  DATA COMMAND
> 15306 root      20   0 5851m 2568  692 S  0.0  0.0   0:06.04 5.7g opensm
> 
> [root@intel-eva1 ~]# ps -ef | grep  opensm
> root     15306     1  0 01:04 ?        00:00:06 /usr/sbin/opensm -B -F /etc/rdma/opensm.conf.[0-9]*
> 
>  
> 
> One thing to notice is that /var/log/opensm.log is flooded with messages
> like the ones below every 10 seconds:
> 
> Sep 26 12:21:58 384194 [E2BA2700] 0x01 -> osm_prtn_make_partitions: Partition configuration /etc/rdma/partitions.conf is not accessible (No such file or directory)
> Sep 26 12:21:58 385098 [E2BA2700] 0x02 -> SUBNET UP
> Sep 26 12:22:08 384306 [E2BA2700] 0x01 -> osm_prtn_make_partitions: Partition configuration /etc/rdma/partitions.conf is not accessible (No such file or directory)
> Sep 26 12:22:08 385189 [E2BA2700] 0x02 -> SUBNET UP
> 
>  
> 
> Can anybody shed some light on what might be wrong? Does opensm keep
> the log in memory as well? I haven’t set a limit on the size of the log
> in /etc/rdma/opensm.conf.

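You're hitting a bug in the opensm startup script in that release.  Your
ps output shows opensm was started with -F set to the literal pattern
/etc/rdma/opensm.conf.[0-9]*: no file matched the glob, so bash left the
pattern unexpanded and opensm is trying to run with a non-existent config
file.  This was fixed by adding shopt -s nullglob to the startup script.
I would update to the later opensm from RHEL 6.4, where this is no longer
an issue.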
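
To illustrate the glob behaviour behind this, here is a minimal sketch.
It is not the actual RHEL init script: the temporary directory and the
variable names are made up for the demonstration, and it only shows how
bash treats an unmatched pattern with and without nullglob.

```bash
#!/bin/bash
# Demonstration only; this is NOT the RHEL opensm init script.
# An empty temporary directory guarantees the pattern matches nothing.
dir=$(mktemp -d)

# Default bash behaviour: an unmatched glob is kept as the literal pattern.
shopt -u nullglob
without=( "$dir"/opensm.conf.[0-9]* )

# With nullglob set, an unmatched glob expands to nothing at all.
shopt -s nullglob
with=( "$dir"/opensm.conf.[0-9]* )
shopt -u nullglob

echo "without nullglob: ${#without[@]} word(s): ${without[*]}"
echo "with nullglob:    ${#with[@]} word(s)"

rmdir "$dir"
```

Without nullglob the array holds one word, the unexpanded pattern itself,
which is the kind of string that ended up after -F in the ps output.  With
nullglob the array is empty, so a script can tell that there are no
numbered config files and avoid passing a bogus path.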


-- 
Doug Ledford <dledford at redhat.com>
              GPG KeyID: 0E572FDD
	      http://people.redhat.com/dledford



