[Openib-windows] RE: [PATCH] client reregister

Fab Tillier ftillier at silverstorm.com
Mon Sep 12 13:00:28 PDT 2005


> From: Yossi Leybovich [mailto:sleybo at mellanox.co.il]
> Sent: Monday, September 12, 2005 10:20 AM
> 
> Fab
>
> Attach is patch for supporting client reregister event in the HCA , PnP
> manager and in the IPoIB I followed your recommendation to implement
> reregistration just in the level of IPoIB till you will have the full design
> to the mcast/service_record mngr in the AL
>
> Main changes:
> 1. for the HCA async event I add the support to parse the event, but I didn't
> have the chance to check that it actually work
> (I encounter problem to get async events from the HCA  , I don't get the cq
> async events as well. I will debug that later thus week) The code just add
> defines and cases and I think t safe to add it.

This code looks fine to me.  If you want, we can split this patch into three
parts - adding reregister support to IBAL, to the HCA driver, and to IPoIB.
This might help keep the discussions more focused.  It's up to you, though.

> 2. In PnP manager I trapped the port_info and set flag to signal that client
> reregister was set by the SM .

Is there any difference in client behavior between SM_CHANGE and
CLIENT_REREGISTER?  If not, can we use SM_CHANGE for both?  In either case, the
SM has lost all information about the end node.  Is the only difference that the
SM LID didn't change?  If so, rather than introducing an internal field in a
public structure, why not set the SM LID to zero in the PnP manager's cached
attributes?  This would cause the poll operation to detect the change as an
SM_CHANGE, thus collapsing the two events into one.

> 3. In the IPoIB after getting the PNP_CLIENT_REREGISTER event I remove the
> bcast entry and issue get_bcast again. After the IpoIB join the bcast group it
> register the IP address and refresh the mcast array (as it do in the
> PNP_ACTIVE).

Since neither the mcast or service registrations where flushed, refresh_mcast
and reg_addrs will do nothing.  We need to add support in IBAL to reregister
these given an existing object handle, or add code to IPoIB to first deregister
and then reregister.  I would prefer to add this to IBAL.

> Of course that wont be needed after we will implement the
> mcast/service_records reregistration mngr in the AL (BTW did you find the time
> to put some initial design draft)

An alternative to these changes in IPoIB is to do something similar to what the
Linux driver does - down the interface and then up it again.  The easiest way to
do this is to punt on NDIS and set the adapter's hung flag.  This flag will
cause NDIS to reset the adapter, which will re-register everything properly.

This is probably the simplest solution, albeit a little crude.  It will get the
work done, though, and we can refine it later.

> 4. few cosmetics changes in the TRACES and the functions name.

Thanks for doing this.  I've been updating the debug macros used throughout the
IBAL code base as I touch functions.

>
> Pls review and send me your comments.

Some more comments inline.

Thanks,

- Fab

> 
> Singned-off-by:Yossi Leybovich (sleybo at mellanox.co.il)
> Index: core/al/al_init.c
> ===================================================================
> --- core/al/al_init.c   (revision 355)
> +++ core/al/al_init.c   (working copy)
> @@ -39,7 +39,7 @@
> 
> 
> 
> -uint32_t                               g_al_dbg_lvl = AL_DBG_ERROR;
> +uint32_t                               g_al_dbg_lvl = AL_DBG_ERROR |
> AL_DBG_SMI;

The debug level should stay as AL_DBG_ERROR in SVN.  Note that you can set the
debug level via the registry, as a key under the ibal service parameters - this
value will be set when the driver loads.  This might make it easier to test
without having to recompile when changing debug levels.  You can also change the
debug level via the kernel debugger.

> Index: core/al/kernel/al_smi.c
> ===================================================================
> --- core/al/kernel/al_smi.c     (revision 355)
> +++ core/al/kernel/al_smi.c     (working copy)
> @@ -1810,8 +1810,14 @@
> 
>                         if( p_port_info )
>                         {
> +
>                                 p_spl_qp_svc->base_lid = p_port_info-
> >base_lid;
>                                 p_spl_qp_svc->lmc =
> ib_port_info_get_lmc( p_port_info );
> +                               if (p_port_info->subnet_timeout & 0x80)
> +                               {
> +                                       AL_TRACE(AL_DBG_PNP,(" setting the
> client reregister event \n"));
> +                                       p_spl_qp_svc->obj.p_ci_ca->p_pnp_attr-
> >p_port_attr->client_reregistration = 1;
> +                               }

Access to the port attributes needs to be synchronized with the PnP manager
through a call to ci_ca_lock_attr and ci_ca_unlock_attr.  Can we just set the SM
LID to zero in the PnP manager's cached attributes?  This would result in
existing clients potentially seeing a SM LID of zero, but they would receive a
SM_CHANGE event pretty quickly updating them with the new value.  This would
avoid adding an internal flag to a public structure.  Of course, this would only
work if we reuse the SM_CHANGE event rather than define a new one.
 
>                         }
>                 }
>         }
> Index: hw/mt23108/kernel/hca_verbs.c
> ===================================================================
> --- hw/mt23108/kernel/hca_verbs.c       (revision 355)
> +++ hw/mt23108/kernel/hca_verbs.c       (working copy)
> @@ -414,6 +414,10 @@
>                 hca_attr_mask |= HCA_ATTR_IS_VENDOR_CLS_SUP;
>                 hca_attr.is_vendor_cls_sup = (MT_bool)p_port_attr->cap.vend;
>         }
> +       if (modca_cmd & IB_CA_MOD_IS_CLIENT_REREGISTER_SUPPORTED) {
> +               hca_attr_mask |= HCA_ATTR_IS_CLIENT_REREGISTRATION_SUP;
> +               hca_attr.is_client_reregister_sup= (MT_bool)p_port_attr-
> >cap.client_reregister;
> +       }

Do we want to let users modify this?  It seems that making such a change would
have global effects on unsuspecting clients.  I would rather not support
disabling the client reregister flag as I don't see it providing any value
except potentially for DoS type of activities.  This falls into a larger port
capability management issue that we should tackle at some point.

>         if (modca_cmd & IB_CA_MOD_QKEY_CTR) {
>                 if (p_port_attr->qkey_ctr == 0)
>                         hca_attr.reset_qkey_counter = TRUE;
> Index: inc/iba/ib_types.h
> ===================================================================
> --- inc/iba/ib_types.h  (revision 355)
> +++ inc/iba/ib_types.h  (working copy)
> @@ -3839,38 +3839,38 @@
>  #define IB_PORT_LINK_SPEED_SUPPORTED_MASK                      0xF0
> 
> 
> -#define IB_PORT_CAP_RESV1                      (CL_NTOH32(0x00000001))
> -#define IB_PORT_CAP_IS_SM                      (CL_NTOH32(0x00000002))
> -#define IB_PORT_CAP_HAS_NOTICE         (CL_NTOH32(0x00000004))

<snip>

> +#define IB_PORT_CAP_RESV0         (CL_NTOH32(0x00000001))
> +#define IB_PORT_CAP_IS_SM         (CL_NTOH32(0x00000002))
> +#define IB_PORT_CAP_HAS_NOTICE    (CL_NTOH32(0x00000004))

<snip>

Please avoid converting tabs to spaces if you can.  The tab width throughout the
code is set to 4, in case your editor is showing things funny.

> @@ -7177,6 +7177,7 @@
>         IB_AE_WQ_ACCESS_ERROR,
>         IB_AE_PORT_ACTIVE,
>         IB_AE_PORT_DOWN,
> +       IB_AE_CLIENT_REREGISTER,
>         IB_AE_UNKNOWN           /* ALWAYS LAST ENUM VALUE */

This event should be trapped in al_ci_ca_shared.c, in the function
ci_ca_process_event_cb and processed like other port events (~line 227).

> 
>  }      ib_async_event_t;
> @@ -9279,6 +9288,10 @@
>  *
>  *      IB_CA_MOD_SYSTEM_IMAGE_GUID
>  *              Used to modify the system image GUID for the port.
> +*
> +*      IB_CA_MOD_IS_CLIENT_REREGISTER_SUPPORTED
> +*              Used to modify the system image GUID for the port.

Minor nit - the comment is wrong.  Goes away if we disallow changing the client
reregister support bit.

> +*
>  *****/
> 
> 
> Index: ulp/ipoib/kernel/ipoib_adapter.c
> ===================================================================
> --- ulp/ipoib/kernel/ipoib_adapter.c    (revision 355)
> +++ ulp/ipoib/kernel/ipoib_adapter.c    (working copy)
> @@ -500,6 +500,7 @@
>                 return IB_NOT_DONE;
>         }
> 
> +       IPOIB_TRACE(IPOIB_DBG_INFO,("p_pnp_rec->pnp_event =
> +0x%x\n",p_pnp_rec->pnp_event));
>         switch( p_pnp_rec->pnp_event )
>         {
>         case IB_PNP_PORT_ADD:
> @@ -614,7 +615,27 @@
>                 IPOIB_TRACE( IPOIB_DBG_INFO,
>                         ("IPOIB: Received unhandled PnP event %d\n",
>                         p_pnp_rec->pnp_event) );
> +               status = IB_SUCCESS;
> +               break;
> +
> +       case IB_PNP_CLIENT_REREGISTER:
> +               /* Join multicast groups and put QP in RTS. */
> +               CL_ASSERT( p_pnp_rec->context );
> +               cl_obj_lock( &p_adapter->obj );
> +               p_adapter->state = IB_PNP_CLIENT_REREGISTER;
> +               cl_obj_unlock( &p_adapter->obj );
> +               IPOIB_TRACE( IPOIB_DBG_INFO,
> +                       ("IPOIB: Received client reregister PnP event %d\n",
> +                       p_pnp_rec->pnp_event) );
> +
> +               ipoib_port_client_reregister( p_adapter->p_port);
> +
> +               status = IB_SUCCESS;
> +
>         default:
> +               IPOIB_TRACE( IPOIB_DBG_INFO,
> +                       ("IPOIB: Received unhandled PnP event 0x%x\n",
> +                       p_pnp_rec->pnp_event) );

Let's also re-register in case of IB_PNP_SM_CHANGE.

>                 status = IB_SUCCESS;
>                 break;
>         }
> @@ -855,7 +876,20 @@
>         case IB_PNP_PORT_REMOVE:
>                 ipoib_resume_oids( p_adapter );
>                 break;
> +       case IB_PNP_CLIENT_REREGISTER:
> 
> +               /* Join all programmed multicast groups. */
> +               ipoib_refresh_mcast(p_adapter, p_adapter->mcast_array,
> +p_adapter->mcast_array_size);

This won't re-join the groups because they will already have an endpoint.  You
either have to flush those endpoints just like you did with the broadcast group,
or we need to find a way to re-register the existing groups.  Flushing the
endpoints will cause packets to be dropped.

Should we be reporting a cable disconnected event from the time we detect the SM
has changed to the time everything has been restored?

> +
> +               /* Register all existing addresses. */
> +               ipoib_reg_addrs( p_adapter );

This won't do anything either since as far as IPoIB is concerned, there is still
a service registration.  I'll add a call to reregister a service to handle this.
 




More information about the ofw mailing list