[ofa-general] RE: [PATCHv2] OpenSM/osm_trap_rcv.c: Better Trap 131 Handling
Hal Rosenstock
halr at voltaire.com
Tue Jul 10 09:23:51 PDT 2007
Hi Amit,
On Tue, 2007-07-10 at 11:30, Amit Krig wrote:
> Hi Hal,
>
> One comment,
> If one of the ports is not responsive for some reason, we need to move
> its peer port to DOWN and then check the OPVL,
I guess I'm still not quite following you.
The code here does not determine port responsiveness. It merely
triggers off trap 131: it recalculates OperationalVLs, resets them if
they differ, and takes the port down at the link level, which should
start it on its way back to active, hopefully now with the proper
OperationalVLs. If trap 131s keep flooding in and the babbling port
policy is enabled, the code disables the port.
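
To make that flow concrete, here is a rough standalone sketch of the
decision only. It is not the OpenSM code: the enum, the struct and
decide_trap131() are hypothetical names made up for illustration, and
only the 250-trap threshold and the order of the checks are taken from
the patch quoted below.

```c
#include <assert.h>   /* only needed if you assert on the result, as in the note below */
#include <stdbool.h>

/* Hypothetical stand-ins for the IB link states referenced in the patch. */
enum link_state {
    LINK_NO_CHANGE = 0,
    LINK_DOWN      = 1,
    LINK_INIT      = 2,
    LINK_ARMED     = 3,
    LINK_ACTIVE    = 4
};

/* What the trap 131 handler would decide to do (illustrative only). */
struct trap131_action {
    bool send_portinfo_set;          /* issue a Set(PortInfo) at all        */
    bool update_op_vls;              /* OperationalVLs need to be rewritten */
    enum link_state new_link_state;  /* link state requested in the Set     */
    bool disable_port;               /* babbling threshold reached          */
};

/* Assumed babbling threshold, copied from the "num_received >= 250" test. */
#define BABBLING_THRESHOLD 250u

struct trap131_action decide_trap131(enum link_state state,
                                     unsigned cur_opvls,
                                     unsigned calc_opvls,
                                     bool babbling_policy,
                                     unsigned num_received)
{
    struct trap131_action a = { false, false, LINK_NO_CHANGE, false };

    if (state != LINK_DOWN) {
        /* First, validate OperationalVLs against the recalculated value. */
        a.update_op_vls = (calc_opvls != cur_opvls);

        /* Then take the link down unless it is already in INIT. */
        a.new_link_state = (state != LINK_INIT) ? LINK_DOWN : LINK_NO_CHANGE;

        /* The patch issues the Set(PortInfo) in either case. */
        a.send_portinfo_set = true;
    }

    /* Independently, disable the port if it keeps babbling. */
    if (babbling_policy && num_received >= BABBLING_THRESHOLD)
        a.disable_port = true;

    return a;
}
```

For example, decide_trap131(LINK_ACTIVE, 4, 2, false, 1) asks for an
OperationalVLs update plus a link DOWN and does not disable the port,
while a port that is already DOWN gets no Set(PortInfo) from this path.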
-- Hal
>
> Amit
> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com]
> Sent: Tuesday, July 10, 2007 5:39 PM
> To: general at lists.openfabrics.org
> Cc: Suresh Shelvapille; Amit Krig; Yevgeny Kliteynik; Eitan Zahavi
> Subject: [PATCHv2] OpenSM/osm_trap_rcv.c: Better Trap 131 Handling
>
> OpenSM/osm_trap_rcv.c: Better trap 131 handling
>
> When trap 131 occurs, check operational VLs and set port state to DOWN
> if needed.
>
> I think this is what Amit was saying should be done in his emails
> yesterday on the list (modified by Suri's comment).
>
> Signed-off-by: Hal Rosenstock <halr at voltaire.com>
>
> diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c
> index f912dcd..3f60f3d 100644
> --- a/opensm/opensm/osm_trap_rcv.c
> +++ b/opensm/opensm/osm_trap_rcv.c
> @@ -550,16 +550,76 @@ __osm_trap_rcv_process_request(
> }
> else
> {
> - /* When babbling port policy option is enabled and
> - Threshold for disabling a "babbling" port is exceeded */
> + uint8_t payload[IB_SMP_DATA_SIZE];
> + ib_port_info_t* p_pi = (ib_port_info_t*)payload;
> + const ib_port_info_t* p_old_pi;
> + osm_madw_context_t context;
> +
> + p_old_pi = &p_physp->port_info;
> + memcpy( payload, p_old_pi, sizeof(ib_port_info_t) );
> +
> + if (p_ntci->g_or_v.generic.trap_num == CL_HTON16(131))
> + {
> + uint8_t port_state, cur_opvls, opvls;
> +
> + port_state = ib_port_info_get_port_state(p_old_pi);
> + if (port_state != IB_LINK_DOWN)
> + {
> + /* First, validate OperationalVLs */
> + cur_opvls = ib_port_info_get_op_vls(p_old_pi);
> + opvls = osm_physp_calc_link_op_vls(p_rcv->p_log, p_rcv->p_subn, p_physp);
> + if (opvls != cur_opvls)
> + {
> + osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> + "__osm_trap_rcv_process_request: ERR 3809: "
> + "Current OP_VLs %d New OP_VLs %d\n",
> + cur_opvls, opvls);
> + ib_port_info_set_op_vls(p_pi, opvls);
> + }
> +
> + /* Now, set port to DOWN if not already in INIT */
> + if (port_state != IB_LINK_INIT)
> + {
> + ib_port_info_set_port_state( p_pi, IB_LINK_DOWN );
> + ib_port_info_set_port_phys_state( IB_PORT_PHYS_STATE_NO_CHANGE, p_pi );
> + }
> + else
> + {
> + ib_port_info_set_port_state( p_pi, IB_LINK_NO_CHANGE );
> + ib_port_info_set_port_phys_state( IB_PORT_PHYS_STATE_NO_CHANGE, p_pi );
> + }
> +
> + /* Now, issue set of PortInfo */
> + context.pi_context.node_guid = osm_node_get_node_guid( osm_physp_get_node_ptr( p_physp ) );
> + context.pi_context.port_guid = osm_physp_get_port_guid( p_physp );
> + context.pi_context.set_method = TRUE;
> + context.pi_context.update_master_sm_base_lid = FALSE;
> + context.pi_context.light_sweep = FALSE;
> + context.pi_context.active_transition = FALSE;
> +
> + status = osm_req_set( &p_rcv->p_subn->p_osm->sm.req,
> + osm_physp_get_dr_path_ptr( p_physp ),
> + payload,
> + sizeof(payload),
> + IB_MAD_ATTR_PORT_INFO,
> + cl_hton32(osm_physp_get_port_num( p_physp )),
> + CL_DISP_MSGID_NONE,
> + &context );
> +
> + if( status != IB_SUCCESS )
> + {
> + osm_log( p_rcv->p_log, OSM_LOG_ERROR,
> + "__osm_trap_rcv_process_request: ERR 3812: "
> + "Request to set PortInfo failed\n" );
> + }
> + }
> + }
> +
> + /* When babbling port policy option is enabled and
> + Threshold for disabling a "babbling" port is exceeded */
> if ( p_rcv->p_subn->opt.babbling_port_policy &&
> num_received >= 250 )
> {
> - uint8_t payload[IB_SMP_DATA_SIZE];
> - ib_port_info_t* p_pi = (ib_port_info_t*)payload;
> - const ib_port_info_t* p_old_pi;
> - osm_madw_context_t context;
> -
> /* If trap 131, might want to disable peer port if available */
> /* but peer port has been observed not to respond to SM requests */
>
> @@ -570,9 +630,6 @@ __osm_trap_rcv_process_request(
> p_ntci->data_details.ntc_129_131.port_num
> );
>
> - p_old_pi = &p_physp->port_info;
> - memcpy( payload, p_old_pi, sizeof(ib_port_info_t) );
> -
> /* Set port to disabled/down */
> ib_port_info_set_port_state( p_pi, IB_LINK_DOWN );
> ib_port_info_set_port_phys_state( IB_PORT_PHYS_STATE_DISABLED, p_pi );
>
>
>