[openib-general] OpenSM died a horrible death

Hal Rosenstock halr at voltaire.com
Thu Jan 6 07:27:34 PST 2005


On Thu, 2005-01-06 at 10:03, shaharf wrote:
> Hi Tom,
> 
>  Can you send me the original mail concerning the SM horrible death?
> It was either corrupted in our Exchange or very large (over 5 MB). If
> you want to send very large log files, please send them tarred and
> zipped. If it is only a local Exchange problem (praise Bill), then
> please just resend it.

I am enclosing a gzip'd form of his email which should make it through.

> Anyhow, I missed the exact context in which it happened. From the
> email below I got the impression that it occurred after a get path
> record with dest=null. I didn’t find any relevant assert in the code,
> and I also issued a synthetic path record with dest gid = 0 and it
> works (returns status 500).

I think it is an SGID of 0 in the PathRecord request.

The error message in the log came from osm/opensm/osm_sa_path_record.c:

static ib_net16_t
__osm_pr_rcv_get_end_points(
...
    if( *pp_src_port == (osm_port_t*)cl_qmap_end(
          &p_rcv->p_subn->port_guid_tbl ) )
    {
      /*
        This 'error' is the client's fault (bad gid) so
        don't enter it as an error in our own log.
        Return an error response to the client.
      */
      osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
               "__osm_pr_rcv_get_end_points: "
               "No source port with GUID = 0x%016" PRIx64 ".\n",
               cl_ntoh64( p_pr->sgid.unicast.interface_id) );
  
      sa_status = IB_SA_MAD_STATUS_INVALID_GID;
      goto Exit;
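
To illustrate the check above outside of OpenSM, here is a minimal sketch,
assuming a flat array in place of port_guid_tbl and NULL in place of the
cl_qmap_end() sentinel. Everything prefixed sketch_ or SKETCH_ and the GUID
values are hypothetical and are not OpenSM names. It only shows why a source
GID whose interface ID is 0 would fail the lookup and produce an invalid-GID
status, provided no port in the table has GUID 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; not the OpenSM constants. */
enum sketch_status { SKETCH_OK = 0, SKETCH_INVALID_GID = 1 };

/* Hypothetical stand-in for the subnet's port GUID table. */
struct sketch_port_table {
    const uint64_t *guids;
    size_t count;
};

/* Return a pointer to the matching GUID entry, or NULL if none matches.
   (The real code compares against cl_qmap_end() instead of NULL.) */
const uint64_t *sketch_find_port(const struct sketch_port_table *tbl,
                                 uint64_t guid)
{
    for (size_t i = 0; i < tbl->count; i++) {
        if (tbl->guids[i] == guid)
            return &tbl->guids[i];
    }
    return NULL;
}

/* Map a failed source-port lookup to an invalid-GID status. */
enum sketch_status sketch_check_source(const struct sketch_port_table *tbl,
                                       uint64_t sgid_interface_id)
{
    if (sketch_find_port(tbl, sgid_interface_id) == NULL)
        return SKETCH_INVALID_GID;
    return SKETCH_OK;
}

/* Usage: an interface ID of 0 is not in the table, a known GUID is. */
void sketch_self_check(void)
{
    static const uint64_t guids[] = {
        0x0002c90200000001ULL, /* made-up port GUIDs */
        0x0002c90200000002ULL
    };
    const struct sketch_port_table tbl = {
        guids, sizeof(guids) / sizeof(guids[0])
    };

    assert(sketch_check_source(&tbl, 0) == SKETCH_INVALID_GID);
    assert(sketch_check_source(&tbl, guids[1]) == SKETCH_OK);
}
```

In this sketch the failed lookup is treated as the client's error and only
yields a status; it does not model whatever later caused the crash.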


-- Hal


-------------- next part --------------
A non-text attachment was scrubbed...
Name: opensm_pathrec_crash_tduffy.gz
Type: application/x-gzip
Size: 145019 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050106/0b7ad5e8/attachment.bin>

