[openib-general] opensm errors with ehca
Troy Benjegerdes
hozer at hozed.org
Tue Nov 1 20:49:18 PST 2005
> Can you try the following opensm patch and see if this eliminates those
> timeout messages ?
>
> This patch clears the high part of the attribute modifier when not a
> switch (when obtaining the PKeyTable).
>
> -- Hal
>
> Index: osm_port_info_rcv.c
> ===================================================================
> --- osm_port_info_rcv.c (revision 3906)
> +++ osm_port_info_rcv.c (working copy)
> @@ -430,6 +430,7 @@ void osm_pkey_get_tables(
> osm_dr_path_t path;
> uint8_t port_num;
> uint16_t block_num, max_blocks;
> + uint32_t attr_mod_ho;
> osm_switch_t* p_switch;
>
> OSM_LOG_ENTER( p_log, osm_physp_has_pkey );
> @@ -455,7 +456,7 @@ void osm_pkey_get_tables(
> else
> {
> /* This is a switch, and not a management port. The maximum blocks is defined
> - on the switch info partition enforcement cap. */
> + in the switch info partition enforcement cap. */
> p_switch = osm_get_switch_by_guid(p_subn, p_node->node_info.node_guid);
>
> if (! p_switch)
> @@ -472,10 +473,14 @@ void osm_pkey_get_tables(
>
> for (block_num = 0 ; block_num < max_blocks ; block_num++)
> {
> + if (osm_node_get_type( p_node ) != IB_NODE_TYPE_SWITCH)
> + attr_mod_ho = block_num;
> + else
> + attr_mod_ho = block_num | (port_num << 16);
> status = osm_req_get( p_req,
> &path,
> IB_MAD_ATTR_P_KEY_TABLE,
> - cl_hton32(block_num | (port_num << 16) ),
> + cl_hton32(attr_mod_ho),
> CL_DISP_MSGID_NONE,
> &context );
>
The patch seems to make opensm ignore the IBM logical HCA, but I get the
same errors on the IBM logical switch. Is there a way to ignore that as well?
switchguids=0x2550000038580
Switch 63 "S-0002550000038580" # IBM Logical Switch 1 port 0
lid 21
[2] "H-0002550000038500"[1]
[1] "S-0002c90200402917"[22]
I still get:
Nov 01 22:34:08 660205 [43005960] -> umad_receiver: ERR 5409: send completed with error (method=0x1 attr=0x16 trans_id=0x13c9) -- dropping.
Nov 01 22:34:08 660213 [43005960] -> umad_receiver: ERR 5411: DR SMP hop ptr 0 hop count 2 DR SLID 0x0 DR DLID 0x0
Nov 01 22:34:08 660221 [43005960] -> __osm_sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT).
Nov 01 22:34:08 660243 [43005960] -> SMP dump:
base_ver................0x1
mgmt_class..............0x81
class_ver...............0x1
method..................0x1 (SubnGet)
D bit...................0x0
status..................0x0
hop_ptr.................0x0
hop_count...............0x2
trans_id................0x13c9
attr_id.................0x16 (P_KeyTable)
resv....................0x0
attr_mod................0x10000
m_key...................0x0000000000000000
dr_slid.................0xFFFF
dr_dlid.................0xFFFF
Initial path: [0][1][16]
Return path: [0][0][0]
Reserved: [0][0][0][0][0][0][0]
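
For reference, a minimal sketch of how the attr_mod value in the dump breaks
down, assuming the layout used in the quoted patch (block number in the low
16 bits, port number in the high 16 bits, and the high bits left clear for a
non-switch node). The helper names are hypothetical and are not opensm
functions; values are in host byte order, before the cl_hton32() conversion.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of opensm. */

/* Build the P_KeyTable attribute modifier the way the quoted patch does:
   the block number goes in the low 16 bits, and the port number goes in
   the high 16 bits only when the node is a switch. */
static inline uint32_t pkey_attr_mod(int is_switch, uint8_t port_num,
                                     uint16_t block_num)
{
    if (!is_switch)
        return block_num;
    return (uint32_t)block_num | ((uint32_t)port_num << 16);
}

/* Extract the port number (high 16 bits) from an attribute modifier. */
static inline uint16_t attr_mod_port(uint32_t attr_mod)
{
    return (uint16_t)(attr_mod >> 16);
}

/* Extract the P_Key block number (low 16 bits) from an attribute modifier. */
static inline uint16_t attr_mod_block(uint32_t attr_mod)
{
    return (uint16_t)(attr_mod & 0xFFFF);
}
```

Under that assumption, the attr_mod of 0x10000 in the dump splits into port 1,
block 0, which is what the switch branch of the patch produces for port 1.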