[openib-general] OpenSM realloc error
Owen Stampflee
ostampflee at terrasoftsolutions.com
Thu Feb 16 13:27:47 PST 2006
So, here is the back trace with no code modifications...
0x00000080b9719db0 in .__GI_raise () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x00000080b9719db0 in .__GI_raise () from /lib64/tls/libc.so.6
#1 0x00000080b971b89c in .__GI_abort () from /lib64/tls/libc.so.6
#2 0x00000080b974e860 in .__libc_message () from /lib64/tls/libc.so.6
#3 0x00000080b97580bc in ._int_realloc () from /lib64/tls/libc.so.6
#4 0x00000080b9759528 in .__realloc () from /lib64/tls/libc.so.6
#5 0x00000080b975942c in .__realloc () from /lib64/tls/libc.so.6
#6 0x00000080b974cd30 in ._IO_mem_finish () from /lib64/tls/libc.so.6
#7 0x00000080b97426b8 in ._IO_new_fclose () from /lib64/tls/libc.so.6
#8 0x00000080b97b795c in .__GI_vsyslog () from /lib64/tls/libc.so.6
#9 0x00000080b97b7ddc in .__GI_syslog () from /lib64/tls/libc.so.6
#10 0x00000080a362be90 in .cl_log_event () from /usr/lib64/libosmcomp.so.1
#11 0x00000080a35f5700 in .osm_log () from /usr/lib64/libopensm.so.1
#12 0x000000001001316c in ?? ()
#13 0x00000000100059b4 in ?? ()
#14 0x00000080b970411c in .generic_start_main () from /lib64/tls/libc.so.6
#15 0x00000080b97042a4 in .__libc_start_main () from /lib64/tls/libc.so.6
#16 0x0000000000000000 in ?? ()
(gdb)
Commenting out the cl_log_event call in osm_log results in this backtrace:
(gdb) bt
#0 0x00000080b9719db0 in .__GI_raise () from /lib64/tls/libc.so.6
#1 0x00000080b971b89c in .__GI_abort () from /lib64/tls/libc.so.6
#2 0x00000080b974e860 in .__libc_message () from /lib64/tls/libc.so.6
#3 0x00000080b9756db0 in ._int_malloc () from /lib64/tls/libc.so.6
#4 0x00000080b9758b50 in .__GI___libc_malloc () from /lib64/tls/libc.so.6
#5 0x00000400000607bc in __cl_malloc_priv (size=0) at cl_memory_osd.c:62
#6 0x00000400000604d4 in __cl_zalloc_ntrk (size=0) at cl_memory.c:416
#7 0x00000400000629f4 in cl_ptr_vector_set_capacity (p_vector=0x100788d0, new_capacity=6349) at cl_ptr_vector.c:216
#8 0x0000040000062acc in cl_ptr_vector_set_size (p_vector=0x0, size=16) at cl_ptr_vector.c:270
#9 0x0000040000062c08 in cl_ptr_vector_init (p_vector=0x100788d0, min_size=6349, grow_size=16) at cl_ptr_vector.c:93
#10 0x000004000005bb00 in cl_disp_init (p_disp=0x100788a0, thread_count=0, name=0x100464c0 "opensm") at cl_dispatcher.c:214
#11 0x00000000100133f8 in ?? ()
#12 0x00000000100059b4 in ?? ()
#13 0x00000080b970411c in .generic_start_main () from /lib64/tls/libc.so.6
#14 0x00000080b97042a4 in .__libc_start_main () from /lib64/tls/libc.so.6
#15 0x0000000000000000 in ?? ()
So now I've compiled it in 32-bit mode (had to fix my chroot) and
everything runs, but I get the following messages:
Feb 16 13:59:28 006732 [0000] -> OpenSM Rev:openib-1.1.0
Feb 16 13:59:28 008210 [F7E8D020] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
Feb 16 13:59:28 008292 [F7E8D020] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
Feb 16 13:59:28 015894 [F7E8D020] -> osm_vendor_get_all_port_attr: assign CA mthca0 port 1 guid (0x2c90109764831) as the default port
Feb 16 13:59:28 015977 [F7E8D020] -> osm_vendor_bind: Binding to port 0x2c90109764831.
Feb 16 13:59:28 021293 [F7E8D020] -> osm_vendor_bind: Binding to port 0x2c90109764831.
Feb 16 13:59:28 021692 [F568C4E0] -> umad_receiver: ERR 5413: Failed to obtain request madw for received MAD(method=0x81 attr=0x11) -- dropping
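For reference, the method and attribute codes in that last error line can be decoded with a small lookup table. The sketch below is illustrative only: it assumes the dropped MAD belongs to the subnet management class (the log line does not state the class), the function names are made up for this example and are not part of OpenSM, and only a handful of the codes defined by the InfiniBand specification are listed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A few management method codes from the MAD common header. */
static const char *mad_method_name(uint8_t method)
{
    switch (method) {
    case 0x01: return "Get";
    case 0x02: return "Set";
    case 0x81: return "GetResp";
    default:   return "unknown";
    }
}

/* A few subnet management attribute IDs (assumes an SMP). */
static const char *smp_attr_name(uint16_t attr)
{
    switch (attr) {
    case 0x0010: return "NodeDescription";
    case 0x0011: return "NodeInfo";
    case 0x0012: return "SwitchInfo";
    case 0x0014: return "GUIDInfo";
    case 0x0015: return "PortInfo";
    case 0x0020: return "SMInfo";
    default:     return "unknown";
    }
}

/* Hypothetical helper: format "Method(Attribute)" into buf. */
static void describe_mad(char *buf, size_t len, uint8_t method, uint16_t attr)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s(%s)", mad_method_name(method), smp_attr_name(attr));
}
```

Under that assumption, calling describe_mad() with method=0x81 and attr=0x11 yields GetResp(NodeInfo), i.e. a NodeInfo response arrived that the receiver could not match to an outstanding request.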
Other info:
[root@m2 ~]# ibstat
CA 'mthca0'
CA type: MT23108
Number of ports: 2
Firmware version: 3.3.2
Hardware version: a1
Node GUID: 0x0002c90109764830
System image GUID: 0x0002c90109764833
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x00510a68
Port GUID: 0x0002c90109764831
Port 2:
State: Down
Physical state: Polling
Rate: 2
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x00510a68
Port GUID: 0x0002c90109764832
[root@m2 ~]# ibstatus
Infiniband device 'mthca0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c901:0976:4831
base lid: 0x0
sm lid: 0x0
state: 2: INIT
phys state: 5: LinkUp
rate: 10 Gb/sec (4X)
Infiniband device 'mthca0' port 2 status:
default gid: fe80:0000:0000:0000:0002:c901:0976:4832
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 2.5 Gb/sec (1X)
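The port 1 values above (state INIT, physical state LinkUp, base lid and sm lid both 0) are what a port looks like while the link is up but no subnet manager has configured it yet. A minimal sketch of that check follows, working on the "state:" value as ibstatus prints it. The enum and function names are hypothetical, chosen for this example only; the numeric codes are the four logical port states (1 DOWN, 2 INIT, 3 ARMED, 4 ACTIVE).

```c
#include <assert.h>
#include <stdio.h>

/* Logical port states, numbered as ibstatus prints them. */
enum port_state {
    PORT_STATE_UNKNOWN = 0,
    PORT_STATE_DOWN    = 1,
    PORT_STATE_INIT    = 2,
    PORT_STATE_ARMED   = 3,
    PORT_STATE_ACTIVE  = 4
};

/* Parse the value of a "state:" line such as "2: INIT". */
static enum port_state parse_port_state(const char *value)
{
    int code = 0;

    assert(value != NULL);
    if (sscanf(value, "%d:", &code) != 1 || code < 1 || code > 4)
        return PORT_STATE_UNKNOWN;
    return (enum port_state)code;
}

/* True when the port is in INIT and has neither a LID of its own
 * nor a known SM LID, i.e. it has not been configured by an SM. */
static int waiting_for_sm(enum port_state state,
                          unsigned base_lid, unsigned sm_lid)
{
    return state == PORT_STATE_INIT && base_lid == 0 && sm_lid == 0;
}
```

With the values shown for port 1 ("2: INIT", base lid 0x0, sm lid 0x0), waiting_for_sm() returns true; for port 2 ("1: DOWN") it returns false because the link itself is not up.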
My archives suggest a firmware upgrade, but 3.3.3 isn't available from
SBS as far as I can tell, and my contact no longer works there, so I'm
going to have to find the new person to talk to about getting newer
firmware, unless of course another vendor's firmware will work on this
card.
Cheers,
Owen