[ofa-general] Re: running opensm 3.0.3 on 4000+ node system
Hal Rosenstock
hrosenstock at xsigo.com
Wed Apr 9 11:37:58 PDT 2008
On Wed, 2008-04-09 at 12:26 -0600, Maestas, Christopher Daniel wrote:
> I'm trying to run opensm on a 4000+ node system,
Which version? Do you mean 3.0.3 (or 3.0.13)?
> and seem to be having difficulties in keeping the opensm around.
> When I attach to the process w/ strace it does:
> ---
> # strace -p 5921
> Process 5921 attached - interrupt to quit
> restart_syscall(<... resuming interrupted call ...>) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> ...
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, NULL) = 0
> nanosleep({10, 0}, <unfinished ...>
> +++ killed by SIGSEGV +++
> ---
>
> I have ofed 1.1 and 1.2 drivers loaded on the system. I've done this in the past using opensm 3.0.0 svn tag 10188 from ofed 1.0 clients and had no issues before. Here's how opensm is running:
> ---
> 6079 pts/0 Sl 0:08 /usr/sbin/opensm -d 3 -maxsmps 0 -s 300 -t 1000 -f /var/log/osm.log -V -g 0
> ---
>
> I have lots of data in the osm.log as you can imagine ... I don't know offhand what I should be looking at/for.
What's towards the end of the log?
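
One rough way to pull that out, sketched as two small shell helpers. The function names and line counts are made up for illustration; the default path is just the -f argument from the command line quoted above, and the "ERR" pattern assumes error entries in the log carry that tag, which may need adjusting for this opensm version.

```shell
#!/bin/sh
# Hypothetical helpers for looking at the end of an OpenSM log.
# Default log path is the -f argument from the quoted command line.

# Print the last N lines of the log (the entries written just
# before the process died).
osm_log_tail() {
    log="${1:-/var/log/osm.log}"
    lines="${2:-200}"
    tail -n "$lines" "$log"
}

# Print only the error entries among the last N lines.
# Assumption: error entries contain the string "ERR".
# "|| true" keeps the exit status at 0 when nothing matches.
osm_log_errors() {
    log="${1:-/var/log/osm.log}"
    lines="${2:-2000}"
    tail -n "$lines" "$log" | grep 'ERR' || true
}
```

With -V the log is very verbose, so something like "osm_log_errors /var/log/osm.log 5000" is probably easier to read than the raw tail.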
-- Hal
> Thanks,
> -cdm
>