[openib-general] gen2 opensm

Roland Fehrenbacher Roland.Fehrenbacher at transtec.de
Wed Apr 6 07:44:05 PDT 2005


> $ /usr/local/ib/bin/ibstatus
> Infiniband device 'mthca0' port 1 status:
>         default gid:     fe80:0000:0000:0000:0002:c902:0000:771d
>         base lid:        0x0
>         sm lid:          0x0
>         state:           2: INIT
>         phys state:      5: LinkUp
>         rate:            10 Gb/sec (4X)
> 
> Infiniband device 'mthca0' port 2 status:
>         default gid:     fe80:0000:0000:0000:0002:c902:0000:771e
>         base lid:        0x0
>         sm lid:          0x0
>         state:           1: DOWN
>         phys state:      2: Polling
>         rate:            2.5 Gb/sec (1X)

    Hal> That's strange that you can get the port GIDs via ibstatus
    Hal> but not via ibstat.

    Hal> The one thing different I see is that the NodeGUID is very
    Hal> different from the PortGUIDs. Not sure if this messes things
    Hal> up.

Somehow the tools don't seem to pick up the correct information, even
though it is there in sysfs:

$ cat /sys/class/infiniband/mthca0/node_guid
0002:c902:0000:771c

$ cat /sys/class/infiniband/mthca0/sys_image_guid
0002:c902:0000:771f

How can this happen?
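
As a cross-check that doesn't depend on ibstat, here is a minimal Python
sketch (not part of the OpenIB tools; the helper names are made up for
illustration). It takes the default GIDs printed by ibstatus above,
extracts the low 64 bits, which hold the port GUID, and compares them
with the node_guid read from sysfs. The values are hardcoded from this
mail.

```python
# Sketch only: compare port GUIDs (taken from the default GIDs that
# ibstatus printed) with the node GUID from sysfs. All helper names here
# are hypothetical; the values are copied from this mail.

def parse_hex_groups(text):
    """Convert a colon-separated hex string (GID or GUID) to an integer."""
    return int(text.strip().replace(":", ""), 16)


def port_guid_from_gid(gid):
    """Return the low 64 bits of a 128-bit GID, i.e. the port GUID."""
    return parse_hex_groups(gid) & 0xFFFFFFFFFFFFFFFF


def format_guid(value):
    """Format a 64-bit integer in the xxxx:xxxx:xxxx:xxxx style sysfs uses."""
    hexstr = f"{value:016x}"
    return ":".join(hexstr[i:i + 4] for i in range(0, 16, 4))


# Value from /sys/class/infiniband/mthca0/node_guid
node_guid = parse_hex_groups("0002:c902:0000:771c")

# Default GIDs as reported by ibstatus for ports 1 and 2
port_gids = {
    1: "fe80:0000:0000:0000:0002:c902:0000:771d",
    2: "fe80:0000:0000:0000:0002:c902:0000:771e",
}

for port, gid in sorted(port_gids.items()):
    guid = port_guid_from_gid(gid)
    offset = guid - node_guid
    print(f"port {port}: guid {format_guid(guid)} (node guid {offset:+d})")
```

With the values above, the port GUIDs come out as node GUID +1 and +2,
so at least the sysfs and ibstatus data are consistent with each other.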

> > Running opensm then doesn't activate the ports:
> > 
> > Apr 05 19:18:25 [4000] -> OpenSM Rev:openib-1.0.0 .......

    Hal> I see a bug in this message. I will fix it. Please sync
    Hal> OpenSM to at least version 2111 and rerun.

> I will recompile tomorrow, and try a firmware upgrade.

The error log with the recompiled opensm is now:

Apr 06 14:39:14 [4000] -> OpenSM Rev:openib-1.0.0
Apr 06 14:39:14 [4000] -> osm_opensm_init: Forcing single threaded dispatcher.
Apr 06 14:39:14 [4000] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0x0000000030f2ffff,0x0000000000000000
Apr 06 14:39:14 [4000] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0x0000000030f2ffff,0x0000000000000000
Apr 06 14:39:14 [4000] -> osm_vendor_get_all_port_attr: assign CA mthca0 port 1 guid (0x65babaa) as the default port.
Apr 06 14:39:14 [4000] -> osm_vendor_bind: Binding to port 0x225dabaa.
Apr 06 14:39:14 [4000] -> osm_vendor_bind: Binding to port 0x8000000.
Apr 06 14:39:14 [2400A] -> umad_receiver: Failed to obtain request madw for received MAD(method=81 attr=11) -- dropping.

I haven't been able to do the firmware update yet, because I haven't
gotten Mellanox mst to compile with kernel 2.6.11. Do you have another
suggestion for how I could do the upgrade?

Thanks,

Roland



