[openib-general] OpenSM died a horrible death

Hal Rosenstock halr at voltaire.com
Wed Jan 5 17:03:07 PST 2005


On Wed, 2005-01-05 at 17:21, Tom Duffy wrote: 
> I noticed this on the console of opensm
> 
> *** glibc detected *** double free or corruption (!prev): 0x0000002a95901eb0 ***
> OpenSM[4593]: *** exception handler: died with signal 6
> exception handler: entering tight loop ... pid 4593

It looks like it died shortly after the following error:
[1104994208:000136814][43005960] -> __osm_pr_rcv_get_end_points: No source port with GUID = 0x0000000000000000

To state the obvious, OpenSM must have a (non-obvious) bug (a double
free of some memory) when this occurs. I will try to recreate this.
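
For illustration only, here is a minimal sketch of how this kind of double free typically arises and how it is usually guarded against. It is not taken from the OpenSM source: the type req_ctx_t and the functions req_ctx_release and req_ctx_process are hypothetical names. The assumption is that an error path (such as a failed port lookup) frees a buffer early and a common cleanup routine then frees it again. Setting the pointer to NULL after free() makes the second release harmless.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request context; NOT an OpenSM type. */
typedef struct {
    char *scratch;    /* heap buffer owned by the context */
    int   free_count; /* counts real releases, for illustration only */
} req_ctx_t;

/* Release the buffer at most once. Clearing the pointer after free()
 * turns any later call into a no-op instead of a double free. */
static void req_ctx_release(req_ctx_t *ctx)
{
    assert(ctx != NULL);
    if (ctx->scratch != NULL) {
        free(ctx->scratch);
        ctx->scratch = NULL;
        ctx->free_count++;
    }
}

/* Hypothetical processing step: returns 0 on success, -1 on failure.
 * lookup_ok stands in for "a source port was found". */
static int req_ctx_process(req_ctx_t *ctx, int lookup_ok)
{
    assert(ctx != NULL);
    ctx->scratch = malloc(64);
    if (ctx->scratch == NULL)
        return -1;
    memset(ctx->scratch, 0, 64);

    if (!lookup_ok) {
        /* Error path frees early; the caller's cleanup will also call
         * req_ctx_release(), which is safe because of the NULL reset. */
        req_ctx_release(ctx);
        return -1;
    }
    return 0;
}
```

With this pattern, a caller that always runs req_ctx_release() during cleanup releases the buffer exactly once, whether or not the error path was taken.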

Do you know what was going on on the subnet at the time? Did an end node
SA client request a PathRecord with an SGID of 0 but with the component
mask bit for SGID turned on?
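
To make the suspected request concrete, here is a rough sketch of the condition I am asking about: the component mask marks SGID as valid, yet the SGID itself is all zeros. The struct hyp_path_rec_query_t, the constant HYP_PR_COMPMASK_SGID (including its bit position) and both helper functions are hypothetical stand-ins, not the real OpenSM or SA definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit marking the SGID field as valid in the component
 * mask. The bit position is an assumption made for this sketch. */
#define HYP_PR_COMPMASK_SGID (UINT64_C(1) << 3)

/* Hypothetical, simplified view of a PathRecord query. */
typedef struct {
    uint64_t comp_mask; /* which fields the client says are valid */
    uint8_t  sgid[16];  /* source GID, 128 bits */
} hyp_path_rec_query_t;

/* Returns 1 if every byte of the GID is zero, 0 otherwise. */
static int hyp_gid_is_zero(const uint8_t gid[16])
{
    static const uint8_t zero[16] = { 0 };
    return memcmp(gid, zero, sizeof(zero)) == 0;
}

/* Returns 1 when the query flags SGID as valid but supplies an
 * all-zero SGID, which is the case suspected here. */
static int hyp_query_has_zero_sgid(const hyp_path_rec_query_t *q)
{
    assert(q != NULL);
    return (q->comp_mask & HYP_PR_COMPMASK_SGID) != 0 &&
           hyp_gid_is_zero(q->sgid);
}
```

A check of this shape at the top of the request handler would let such a query be rejected before any lookup or cleanup code runs.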

Also, out of curiosity, what options was OpenSM started with?

-- Hal



