[openib-general] gen2 opensm
Hal Rosenstock
halr at voltaire.com
Tue Apr 5 10:42:16 PDT 2005
On Tue, 2005-04-05 at 13:26, Roland Fehrenbacher wrote:
> Hi,
>
> I have tried the kernel 2.6.11 drivers on an x86-64 machine with a
> MT23108 card. The driver loads ok after
> $ modprobe ib_mthca; modprobe ib_umad
>
> Since I use devfs, I have to manually create
>
> $ mknod /dev/infiniband/umad0 c 231 0
> $ mknod /dev/infiniband/umad1 c 231 1
> $ mknod /dev/infiniband/issm0 c 231 64
> $ mknod /dev/infiniband/issm1 c 231 65
What are the permissions on those? Are they crw?
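Something like the following would show it. This is only a sketch: the helper name check_umad_node is made up for illustration, and it uses nothing beyond standard shell tests and ls. It checks that a node is a character device and that the current user can read and write it.

```shell
# Print the ls -l line for a device node and fail unless it is a
# character device that the current user can both read and write.
# check_umad_node is a hypothetical helper name, not part of any IB package.
check_umad_node() {
    node="$1"
    if [ ! -c "$node" ]; then
        echo "$node: not a character device" >&2
        return 1
    fi
    # Shows the mode bits, owner, and major/minor numbers.
    ls -l "$node"
    if [ ! -r "$node" ] || [ ! -w "$node" ]; then
        echo "$node: not readable and writable by $(id -un)" >&2
        return 1
    fi
    return 0
}

# Example (run as the user that will start opensm):
#   check_umad_node /dev/infiniband/umad0
```

The ls -l line should start with "c" and show major 231 with the minor you passed to mknod.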
> I get
>
> $ /usr/local/ib/bin/ibstat
> CA 'mthca0'
> CA type: MT23108
> Number of ports: 2
> Firmware version: 3.2.0
> Hardware version: a1
> Node GUID: 0x000000008815bcaa
> System image GUID: 0x000000008815bcaa
> Port 1:
> State: Initializing
> Physical state: LinkUp
> Rate: 10
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x00500a68
> Port GUID: 0x0000000000000000
> Port 2:
> State: Down
> Physical state: Polling
> Rate: 2
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x00500a68
> Port GUID: 0x0000000000000000
>
> which already looks strange (GUID 0 ???).
It looks like the port GUIDs are not set in NVRAM.
> Running opensm then doesn't activate the ports:
>
> Apr 05 19:18:25 [4000] -> OpenSM Rev:openib-1.0.0
> Apr 05 19:18:25 [4000] -> osm_opensm_init: Forcing single threaded dispatcher.
> Apr 05 19:18:25 [4000] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0x0000000030f2ffff,0x0000000000000000
> Apr 05 19:18:25 [4000] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0x0000000030f2ffff,0x0000000000000000
> Apr 05 19:18:25 [4000] -> osm_vendor_get_all_port_attr: assign CA 0x7fffffffd010ort 1 guid (0x65babaa) as the default port.
I see a bug in this message. I will fix it. Please sync OpenSM to at
least version 2111 and rerun.
> Apr 05 19:18:25 [4000] -> osm_vendor_bind: Binding to port 0x225dabaa.
> Apr 05 19:18:25 [4000] -> osm_vendor_bind: Binding to port 0x8000000.
Two binds. This looks wrong to me.
> Apr 05 19:18:25 [2400A] -> umad_receiver: Failed to obtain request madw for received MAD(method=81 attr=11) -- dropping.
The vendor layer couldn't find the request matching a response which
came in. This is pretty fishy but probably related to the port issue.
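For reading that log line: assuming the method and attr values are printed in hex, method 0x81 is a GetResp (the 0x80 bit marks a response) and attribute 0x11 is NodeInfo. A small sketch of a decoder follows; decode_smp is a made-up helper name and it only knows a handful of common subnet management values.

```shell
# Decode the hex method and attribute IDs from a umad_receiver log line.
# decode_smp is a hypothetical helper; only a few common values are listed.
decode_smp() {
    method=$(( 0x$1 ))
    attr=$(( 0x$2 ))

    case $method in
        1)   mname="Get" ;;
        2)   mname="Set" ;;
        129) mname="GetResp" ;;      # 0x81
        *)   mname="unknown" ;;
    esac

    # Bit 0x80 of the method field marks a response.
    if [ $(( method & 0x80 )) -ne 0 ]; then
        resp="yes"
    else
        resp="no"
    fi

    case $attr in
        16) aname="NodeDescription" ;;  # 0x10
        17) aname="NodeInfo" ;;         # 0x11
        21) aname="PortInfo" ;;         # 0x15
        32) aname="SMInfo" ;;           # 0x20
        *)  aname="unknown" ;;
    esac

    echo "method=$mname attr=$aname response=$resp"
}

# Example:
#   decode_smp 81 11
```

So the dropped MAD would be a NodeInfo response for which no outstanding request was found.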
> What could have gone wrong?
I would start with setting the port GUIDs for this HCA and see if the
problem persists.
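To confirm afterwards that the GUIDs took, a filter along these lines could be run over the ibstat output. It is a sketch that assumes the output layout shown above; find_zero_port_guids is a made-up name, and it does not set anything, it only reports ports whose Port GUID is still all zeros.

```shell
# Read ibstat output on stdin and print every port whose Port GUID is
# all zeros. Returns 1 if any such port was found, 0 otherwise.
# find_zero_port_guids is a hypothetical helper name.
find_zero_port_guids() {
    awk '
        # "CA <name>" starts a new adapter; skip the "CA type:" line.
        $1 == "CA" && $2 != "type:"        { ca = $2 }
        # "Port N:" starts a new port section.
        $1 == "Port" && $2 ~ /^[0-9]+:$/   { port = $2; sub(/:$/, "", port) }
        # "Port GUID: 0x..." carries the value to check.
        $1 == "Port" && $2 == "GUID:"      {
            if ($3 ~ /^0x0+$/) {
                print ca " port " port ": " $3
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Example:
#   /usr/local/ib/bin/ibstat | find_zero_port_guids
```

If it prints nothing and returns 0, the port GUIDs are set and it is worth rerunning opensm.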
-- Hal
>
> Roland
>