[openib-general] Some More Operational Issues with OpenSM 1.1.0
Hal Rosenstock
halr at voltaire.com
Tue Sep 13 07:16:56 PDT 2005
Hi,
Here are some additional operational issues with OpenSM 1.1.0:
1. The following warning now appears when OpenSM is started up:
opensm: /usr/local/lib/libopensm.so.1: no version information available (required by opensm)
2. I am not sure what the LID manager doesn't like about the old settings
(from OpenSM 1.1.0):
Sep 13 09:34:59 330140 [B7F144A0] -> __osm_lid_mgr_validate_db: [
Sep 13 09:34:59 330260 [B7F144A0] -> __osm_lid_mgr_validate_db: ERR 0312: Ilegal LID range [0x4:0x0] for guid:0x0008f10403961355.
Sep 13 09:34:59 330289 [B7F144A0] -> osm_db_delete: [
Sep 13 09:34:59 330313 [B7F144A0] -> osm_db_delete: ]
Sep 13 09:34:59 330337 [B7F144A0] -> __osm_lid_mgr_validate_db: ERR 0312: Ilegal LID range [0x3:0x0] for guid:0x0008f10403960559.
Sep 13 09:34:59 330360 [B7F144A0] -> osm_db_delete: [
Sep 13 09:34:59 330379 [B7F144A0] -> osm_db_delete: ]
Sep 13 09:34:59 330402 [B7F144A0] -> __osm_lid_mgr_validate_db: ERR 0312: Ilegal LID range [0x5:0x0] for guid:0x005442ba00003080.
Sep 13 09:34:59 330424 [B7F144A0] -> osm_db_delete: [
Sep 13 09:34:59 330443 [B7F144A0] -> osm_db_delete: ]
Sep 13 09:34:59 330466 [B7F144A0] -> __osm_lid_mgr_validate_db: ERR 0312: Ilegal LID range [0x7:0x0] for guid:0x0008f1040396055a.
Sep 13 09:34:59 330535 [B7F144A0] -> osm_db_delete: [
Sep 13 09:34:59 330556 [B7F144A0] -> osm_db_delete: ]
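All four ERR 0312 lines above report a range whose upper bound is 0x0, which is both below the lower bound and outside the unicast LID space (0x0001 to 0xBFFF). The following is a minimal sketch of that kind of sanity check, written only to illustrate why these entries would be rejected. It is not OpenSM's actual validation code, and the function name lid_range_problem is hypothetical.

```python
# Hypothetical helper, NOT taken from OpenSM: explains why a persisted
# [min:max] LID range such as [0x4:0x0] would be considered invalid.
UCAST_LID_MIN = 0x0001  # lowest unicast LID
UCAST_LID_MAX = 0xBFFF  # highest unicast LID


def lid_range_problem(min_lid, max_lid):
    """Return a short reason string if the range looks invalid, else None."""
    if not (UCAST_LID_MIN <= min_lid <= UCAST_LID_MAX):
        return "min LID outside unicast range"
    if not (UCAST_LID_MIN <= max_lid <= UCAST_LID_MAX):
        return "max LID outside unicast range"
    if min_lid > max_lid:
        return "min LID greater than max LID"
    return None


# GUIDs and ranges copied from the ERR 0312 lines above.
logged = {
    0x0008F10403961355: (0x4, 0x0),
    0x0008F10403960559: (0x3, 0x0),
    0x005442BA00003080: (0x5, 0x0),
    0x0008F1040396055A: (0x7, 0x0),
}

for guid, (lo, hi) in logged.items():
    print(f"0x{guid:016x}: [0x{lo:x}:0x{hi:x}] -> {lid_range_problem(lo, hi)}")
```

With LMC 0 one would expect each port to carry a single LID, i.e. a range such as [0x4:0x4], which this sketch accepts.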
3. LinearFDBTop is being detected as corrupted. This is bad.
Sep 13 09:34:59 732496 [B7713C40] -> osm_si_rcv_process: [
Sep 13 09:34:59 732514 [B7713C40] -> osm_si_rcv_process: Switch GUID = 0x0008f10400410015, TID = 0x1273.
Sep 13 09:34:59 732535 [B7713C40] -> osm_si_rcv_process: ERR 3610:
Bad LinearFDBTop value = 0xC000 on switch 0x8f10400410015.
Forcing correction to 0x0.
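For context, 0xC000 is the first multicast LID, so it cannot be the top of a linear (unicast) forwarding table. The sketch below assumes the check behind ERR 3610 is a comparison against the highest unicast LID (0xBFFF). That is an assumption based on the log text, not a quotation of the OpenSM source, and sanitize_lin_fdb_top is a hypothetical name.

```python
# Assumed rule (not quoted from OpenSM): LinearFDBTop must not exceed the
# highest unicast LID. Values from 0xC000 upward belong to multicast.
UCAST_LID_MAX = 0xBFFF


def sanitize_lin_fdb_top(lin_top):
    """Return (value_to_use, was_bad) for a reported LinearFDBTop."""
    if lin_top > UCAST_LID_MAX:
        # Mirror the "Forcing correction to 0x0" behaviour seen in the log.
        return 0x0, True
    return lin_top, False


value, was_bad = sanitize_lin_fdb_top(0xC000)  # value reported by the switch
print(f"use 0x{value:X}, bad={was_bad}")
```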
4. The SM's Set of PortInfo is being rejected with status 7. Not sure why that
would be. Also, in this case (and probably in other similar ones), OpenSM
continues as if things succeeded. Is that right?
Sep 13 09:35:00 326832 [B6F13BC0] -> SMP dump:
base_ver................0x1
mgmt_class..............0x81
class_ver...............0x1
method..................0x2 (SubnSet)
D bit...................0x0
status..................0x0
hop_ptr.................0x0
hop_count...............0x1
trans_id................0x12c9
attr_id.................0x15 (PortInfo)
resv....................0x0
attr_mod................0xA
m_key...................0x0000000000000000
dr_slid.................0xFFFF
dr_dlid.................0xFFFF
Initial path: [0][1]
Return path: [0][0]
Reserved: [0][0][0][0][0][0][0]
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 0C 03 03 02
14 02 00 11 40 40 00 08 08 04 F2 40 00 00 00 00
00 00 00 00 00 88 00 00 00 00 00 00 00 00 00 00
Sep 13 09:35:00 326970 [B6F13BC0] -> osm_vendor_send: [
Sep 13 09:35:00 327426 [B6F13BC0] -> osm_vendor_send: Completed Sending Request p_madw = 0x80a44a8.
Sep 13 09:35:00 327453 [B6F13BC0] -> osm_vendor_send: ]
Sep 13 09:35:00 327473 [B6F13BC0] -> __osm_vl15_poller: 1 on wire, 6 outstanding, 0 unicasts sent, 150 sent total.
Sep 13 09:35:00 327634 [B5F13AC0] -> osm_mad_pool_get: [
Sep 13 09:35:00 327755 [B5F13AC0] -> osm_vendor_get: [
Sep 13 09:35:00 327775 [B5F13AC0] -> osm_vendor_get: Acquiring UMAD for p_madw = 0x80a46c4, size = 256.
Sep 13 09:35:00 327893 [B5F13AC0] -> osm_vendor_get: Acquired UMAD 0x80dbb18, size = 256.
Sep 13 09:35:00 327914 [B5F13AC0] -> osm_vendor_get: ]
Sep 13 09:35:00 327933 [B5F13AC0] -> osm_mad_pool_get: Acquired p_madw = 0x80a46b8, p_mad = 0x80dbb50, size = 256.
Sep 13 09:35:00 328050 [B5F13AC0] -> osm_mad_pool_get: ]
Sep 13 09:35:00 328070 [B5F13AC0] -> __osm_sm_mad_ctrl_rcv_callback: [
Sep 13 09:35:00 328183 [B5F13AC0] -> __osm_sm_mad_ctrl_rcv_callback: 150 QP0 MADs received.
Sep 13 09:35:00 328362 [B5F13AC0] -> SMP dump:
base_ver................0x1
mgmt_class..............0x81
class_ver...............0x1
method..................0x81 (SubnGetResp)
D bit...................0x1
status..................0x1C00
hop_ptr.................0x0
hop_count...............0x1
trans_id................0x12c9
attr_id.................0x15 (PortInfo)
resv....................0x0
attr_mod................0xA
m_key...................0x0000000000000000
dr_slid.................0xFFFF
dr_dlid.................0xFFFF
Initial path: [0][1]
Return path: [0][C]
Reserved: [0][0][0][0][0][0][0]
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 0C 03 03 02
14 52 00 11 40 40 00 08 08 04 F2 40 00 00 00 00
00 00 00 00 00 88 00 00 00 00 00 00 00 00 00 00
Sep 13 09:35:00 328481 [B5F13AC0] -> __osm_sm_mad_ctrl_rcv_callback: ERR 3111: Error status = 0x1C00.
Sep 13 09:35:00 328655 [B5F13AC0] -> SMP dump:
base_ver................0x1
mgmt_class..............0x81
class_ver...............0x1
method..................0x81 (SubnGetResp)
D bit...................0x1
status..................0x1C00
hop_ptr.................0x0
hop_count...............0x1
trans_id................0x12c9
attr_id.................0x15 (PortInfo)
resv....................0x0
attr_mod................0xA
m_key...................0x0000000000000000
dr_slid.................0xFFFF
dr_dlid.................0xFFFF
Initial path: [0][1]
Return path: [0][C]
Reserved: [0][0][0][0][0][0][0]
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 0C 03 03 02
14 52 00 11 40 40 00 08 08 04 F2 40 00 00 00 00
00 00 00 00 00 88 00 00 00 00 00 00 00 00 00 00
Sep 13 09:35:00 336766 [B7713C40] -> osm_pi_rcv_process: [
Sep 13 09:35:00 336786 [B7713C40] -> PortInfo dump:
port number.............0xA
node_guid...............0x005442ba00003080
port_guid...............0x005442ba00003080
m_key...................0x0000000000000000
subnet_prefix...........0x0000000000000000
base_lid................0x0
master_sm_base_lid......0x0
capability_mask.........0x0
diag_code...............0x0
m_key_lease_period......0x0
local_port_num..........0xC
link_width_enabled......0x3
link_width_supported....0x3
link_width_active.......0x2
link_speed_supported....0x1
port_state..............ACTIVE
state_info2.............0x52
m_key_protect_bits......0x0
lmc.....................0x0
link_speed..............0x11
mtu_smsl................0x40
vl_cap..................0x40
vl_high_limit...........0x0
vl_arb_high_cap.........0x8
vl_arb_low_cap..........0x8
mtu_cap.................0x4
vl_stall_life...........0xF2
vl_enforce..............0x40
m_key_violations........0x0
p_key_violations........0x0
q_key_violations........0x0
guid_cap................0x0
subnet_timeout..........0x0
resp_time_value.........0x0
error_threshold.........0x88
Sep 13 09:35:00 336954 [B7713C40] -> Capabilities Mask:
Sep 13 09:35:00 336999 [B7713C40] -> osm_pi_rcv_process_set: [
Sep 13 09:35:00 337018 [B7713C40] -> osm_pi_rcv_process_set: ERR 0F10: Received Error Status for SetResp()
Sep 13 09:35:00 337133 [B7713C40] -> PortInfo dump:
port number.............0xA
node_guid...............0x005442ba00003080
port_guid...............0x005442ba00003080
m_key...................0x0000000000000000
subnet_prefix...........0x0000000000000000
base_lid................0x0
master_sm_base_lid......0x0
capability_mask.........0x0
diag_code...............0x0
m_key_lease_period......0x0
local_port_num..........0xC
link_width_enabled......0x3
link_width_supported....0x3
link_width_active.......0x2
link_speed_supported....0x1
port_state..............ACTIVE
state_info2.............0x52
m_key_protect_bits......0x0
lmc.....................0x0
link_speed..............0x11
mtu_smsl................0x40
vl_cap..................0x40
vl_high_limit...........0x0
vl_arb_high_cap.........0x8
vl_arb_low_cap..........0x8
mtu_cap.................0x4
vl_stall_life...........0xF2
vl_enforce..............0x40
m_key_violations........0x0
p_key_violations........0x0
q_key_violations........0x0
guid_cap................0x0
subnet_timeout..........0x0
resp_time_value.........0x0
error_threshold.........0x88
Sep 13 09:35:00 337176 [B7713C40] -> Capabilities Mask:
Sep 13 09:35:00 337216 [B7713C40] -> osm_pi_rcv_process_set: Received logical SetResp() for GUID = 0x5442ba00003080, port num = 10
for parent node GUID = 0x5442ba00003080 TID = 0x12c9.
Sep 13 09:35:00 337238 [B7713C40] -> osm_pi_rcv_process_set: ]
Sep 13 09:35:00 337257 [B7713C40] -> osm_pi_rcv_process: ]
Similarly for some other ports (0xC).
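The "status 7" reading can be reproduced from the logged value 0x1C00. The sketch below assumes the log prints the 16-bit status in network byte order on a little-endian host without converting it (an assumption; the log itself does not say). After swapping, bits 2 to 4 of the common MAD status field carry the invalid-field code, and code 7 means one or more fields in the attribute or attribute modifier hold an invalid value. The function name decode_smp_status is hypothetical.

```python
import struct


def decode_smp_status(logged_status, byte_swapped=True):
    """Split a 16-bit MAD status word into its common fields.

    byte_swapped=True assumes the logged number is the raw network-order
    value as read on a little-endian host (an assumption about the log).
    """
    if byte_swapped:
        # Reinterpret the big-endian bytes as a little-endian 16-bit value.
        (host,) = struct.unpack("<H", struct.pack(">H", logged_status))
    else:
        host = logged_status
    return {
        "busy": bool(host & 0x1),
        "redirect": bool(host & 0x2),
        "invalid_field_code": (host >> 2) & 0x7,
    }


fields = decode_smp_status(0x1C00)  # value from the ERR 3111 line above
print(fields)
```

If the assumption about byte order holds, the invalid-field code is 7, which matches the "status 7" rejection described in item 4.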
Thanks.
-- Hal