[ofa-general] Re: [PATCH] opensm/osm_node_info_rcv.c: create physp for the newly discovered port of the known node
Sasha Khapyorsky
sashak at voltaire.com
Wed Feb 18 10:19:55 PST 2009
On 14:41 Tue 17 Feb , Yevgeny Kliteynik wrote:
> Hi Sasha,
>
> This patch fixes bugzilla issue #1515:
>
> Topology:
>          |---------------|
>          |      SW2      |
>          |---------------|
>           |x   |y   |z   |v
>      |----|    |    |    |----|
>      |         |    |         |
>      |    |----|    |----|    |
>      |    |              |    |
>     a|   b|             c|   d|
> |---------------|   |---------------|
> |      SW1      |   |      SW3      |
> |---------------|   |---------------|
>         |                   |
>         |                   |
>    HCA with SM             HCA
>
> During the discovery:
>
> SM sends NodeInfo request to SW1
> SM sends NodeInfo request to SW2 through link a->x
> SM discovers new node SW2:
> - updates DR to SW2 to go through link a->x
> - creates physp x
> SM sends NodeInfo request to SW2 through link b->y
> SM discovers a known node SW2
> - DOES NOT create physp y
> - updates DR to SW2 to go through link b->y
>
> From now on, the DR to SW2 is going through port y, so OpenSM won't deal with
> port y any more, leaving it uninitialized (no physp object for this port).
>
> The fix is to create physp for the newly discovered port of the known
> switch node, same way as it is done for HCAs.
> I also added one log message for the case that showed the problem - when
> one of the link sides is uninitialized (no valid ports check). Perhaps
> this log message should be an error message instead?
>
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
> opensm/opensm/osm_node_info_rcv.c | 24 +++++++++++++++++++++++-
> 1 files changed, 23 insertions(+), 1 deletions(-)
>
> diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
> index c52c0d5..7da3103 100644
> --- a/opensm/opensm/osm_node_info_rcv.c
> +++ b/opensm/opensm/osm_node_info_rcv.c
> @@ -164,8 +164,12 @@ __osm_ni_rcv_set_links(IN osm_sm_t * sm,
> */
> if (!osm_node_link_has_valid_ports(p_node, port_num,
> p_neighbor_node,
> - p_ni_context->port_num))
> + p_ni_context->port_num)) {

Actually, if the port is initialized unconditionally whenever a NodeInfo
is received, this case becomes impossible, no?

If so, we should probably put a CL_ASSERT() there instead of a run-time
check.

Sasha

> + OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> + "Link at node 0x%" PRIx64 ", port %u - no valid ports\n",
> + cl_ntoh64(osm_node_get_node_guid(p_node)), port_num);
> goto _exit;
> + }
>
> if (osm_node_link_exists(p_node, port_num,
> p_neighbor_node, p_ni_context->port_num)) {
> @@ -537,8 +541,26 @@ __osm_ni_rcv_process_existing_switch(IN osm_sm_t * sm,
> IN osm_node_t * const p_node,
> IN const osm_madw_t * const p_madw)
> {
> +
> + ib_smp_t *p_smp;
> + ib_node_info_t *p_ni;
> + uint8_t port_num;
> +
> OSM_LOG_ENTER(sm->p_log);
>
> + p_smp = osm_madw_get_smp_ptr(p_madw);
> + p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp);
> + port_num = ib_node_info_get_local_port_num(p_ni);
> +
> + if (!osm_node_get_physp_ptr(p_node, port_num)) {
> + OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> + "Creating physp for node GUID:0x%"
> + PRIx64 ", port %u\n",
> + cl_ntoh64(osm_node_get_node_guid(p_node)),
> + port_num);
> + osm_node_init_physp(p_node, p_madw);
> + }
> +
> /*
> If this switch has already been probed during this sweep,
> then don't bother reprobing it.
> --
> 1.5.1.4
>
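
For readers following the discovery sequence described in the patch, here
is a minimal, self-contained sketch of the problem and the fix. It is a toy
model, not OpenSM code: every toy_* name, MAX_PORTS, and the port numbering
(x = 1, y = 2, a = 1, b = 2) are assumptions made for illustration only.
The create_for_known flag stands in for the new check added to
__osm_ni_rcv_process_existing_switch().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PORTS 8

/* Toy stand-in for a physical port object (hypothetical). */
struct toy_physp {
    bool valid;              /* true once the port object was created */
};

/* Toy stand-in for a node object (hypothetical). */
struct toy_node {
    bool known;              /* a NodeInfo was already seen for this node */
    unsigned dr_port;        /* port the current directed route enters by */
    struct toy_physp physp[MAX_PORTS];
};

/* Create the port object for `port`. */
void toy_init_physp(struct toy_node *node, unsigned port)
{
    assert(port < MAX_PORTS);
    node->physp[port].valid = true;
}

/*
 * Handle a NodeInfo response that arrived through `port` of `node`.
 * create_for_known == false models the reported behaviour: a known node
 * only gets its directed route updated. create_for_known == true models
 * the patch: a missing port object is created first.
 */
void toy_ni_rcv(struct toy_node *node, unsigned port, bool create_for_known)
{
    assert(port < MAX_PORTS);
    if (!node->known) {
        node->known = true;
        toy_init_physp(node, port);
    } else if (create_for_known && !node->physp[port].valid) {
        toy_init_physp(node, port);
    }
    node->dr_port = port;
}

/* Both ends of a link need a port object before the link can be set. */
bool toy_link_has_valid_ports(const struct toy_node *a, unsigned pa,
                              const struct toy_node *b, unsigned pb)
{
    return pa < MAX_PORTS && pb < MAX_PORTS &&
           a->physp[pa].valid && b->physp[pb].valid;
}
```

Calling toy_ni_rcv() twice on the same node, first for port 1 and then for
port 2, leaves port 2 without a port object when create_for_known is false,
so toy_link_has_valid_ports() fails for the b->y link. With
create_for_known set to true the port object always exists by the time the
link is examined, which is the situation where an assertion could replace
the run-time check, as suggested in the reply above.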