[openib-general] Problem with 2.4.24 and gen1
Ken MacInnis
kcm at psc.edu
Mon Nov 1 09:40:15 PST 2004
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
ACPI wasn't in the kernel to begin with. Appending 'noapic disableapic'
did get the Tavor code to load. :) Thanks for the hint!
However, OpenSM is still misbehaving:
- -------------------------------------------------
OpenSM Rev:B1-rc1
Command Line Arguments:
~ Log File: /tmp/osm.log
- -------------------------------------------------
Error from osm_opensm_init (1)
Error from osm_opensm_bind (0x2A)
[1099330621:000868906][4000] -> OpenSM Rev:B1-rc1
[1099330621:000868958][4000] -> osm_opensm_init: Forcing single threaded dispatcher.
[1099330621:000869383][4000] -> osm_report_notice: Received Generic Notice type:3 num:66 from LID:0x0000 GUID:0xfe80000000000000,0x0000000000000000
[1099330621:000869402][4000] -> osm_report_notice: Received Generic Notice type:3 num:66 from LID:0x0000 GUID:0xfe80000000000000,0x0000000000000000
[1099330621:000869445][4000] -> __osm_vendor_get_ca_ids: ERR 3D09: No available channel adapters.
[1099330621:000869456][4000] -> osm_vendor_get_all_port_attr: ERR 3D13: Fail to get CA Ids .
[1099330621:000869484][4000] -> __osm_vendor_get_ca_ids: ERR 3D11: : Bad parameter in calling: EVAPI_list_hcas.
[1099330621:000869493][4000] -> osm_vendor_get_guid_ca_and_port: ERR 3D16: Fail to get CA Ids .
[1099330621:000869503][4000] -> osm_vendor_bind: ERR 5005: Fail to find port number of port guid:0x0000000000000000
[1099330621:000869515][4000] -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind() failed.
[1099330621:000869526][4000] -> osm_sm_bind: ERR 2E10: SM MAD Controller bind() failed (IB_ERROR).
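For anyone sifting through a longer osm.log, here is a minimal sketch for pulling out the ERR entries in order. It is not part of OpenSM; the name extract_errors, the regex, and the assumption that each log record sits on a single unwrapped line are all illustrative choices. The sample lines are copied from the log above.

```python
import re

# Assumed record shape: "[sec:usec][tid] -> function_name: ERR XXXX: message"
ERR_LINE = re.compile(
    r"\]\s*->\s*(?P<func>\w+):\s*ERR\s+(?P<code>[0-9A-Fa-f]{4}):\s*(?P<msg>.*)"
)

# Three records taken from the log in this message, one per line.
SAMPLE_LOG = """\
[1099330621:000869445][4000] -> __osm_vendor_get_ca_ids: ERR 3D09: No available channel adapters.
[1099330621:000869503][4000] -> osm_vendor_bind: ERR 5005: Fail to find port number of port guid:0x0000000000000000
[1099330621:000869526][4000] -> osm_sm_bind: ERR 2E10: SM MAD Controller bind() failed (IB_ERROR).
"""


def extract_errors(log_text):
    """Return (function, code, message) tuples for every ERR record, in log order."""
    errors = []
    for line in log_text.splitlines():
        match = ERR_LINE.search(line)
        if match:
            errors.append(
                (match.group("func"), match.group("code"), match.group("msg").strip())
            )
    return errors


if __name__ == "__main__":
    for func, code, msg in extract_errors(SAMPLE_LOG):
        print(f"{code}  {func}: {msg}")
```

Listing the errors in order makes it easier to see which one came first; the later bind failures in a log like this may simply follow from the earliest entry.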
Any ideas on this? I made very sure that userland and opensm were in
sync with the kernel bits I'm using. The 0s in the LID and GUID are
concerning me.
I may end up trying the newer OpenIB stack for fun (ha), and see if that
works better.
Ken
Tziporet Koren wrote:
| Hi,
|
| The problem is that the driver does not get the interrupt for the command
| completion,
| and thus you get the error: "Command not completed after timeout".
|
| It is related to the OS & system you are using. What is the distribution
| you are using? We once saw such problems with older versions of SuSE.
|
| Try adding append="acpi=off" to your lilo configuration, or also add
| disableapic to the same append line.
| -----Original Message-----
| From: Ken MacInnis [mailto:kcm at psc.edu]
| Sent: Sunday, October 31, 2004 8:20 PM
| To: openib-general at openib.org
| Subject: [openib-general] Problem with 2.4.24 and gen1
| I've got a fairly modified kernel here I'm trying to get an OpenIB stack
| running on. It's a vanilla 2.4.24 kernel with Lustre and other patches
| in it, but I'm seeing this when I modprobe ib_tavor:
|
| Oct 31 13:13:05 samwise kernel: THH(1): cmdif.c[1190]: Command not
| completed after timeout: cmd=TAV
- --
Ken MacInnis - Systems Engineer, PSC - http://www.psc.edu/~kcm/
kcm at psc dot edu - +1 412 268 9833 (w) - +1 412 268 5832 (f)
Pittsburgh Supercomputing Center - 4400 Fifth Ave - Pittsburgh, PA 15213
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
iD8DBQFBhnT/nT0C17PQhv4RAicqAJ9hRiudNE1Bfof+BDrG09XfA5jD/wCcDH/D
UT/E1V7i0yO6pPPOx9oobNQ=
=R5wl
-----END PGP SIGNATURE-----