[ofa-general] [PATCH V3] infiniband/core: Enable loopback ofDR SMP responses from userspace

Suresh Shelvapille suri at baymicrosystems.com
Mon Oct 15 09:19:25 PDT 2007


>From a code review it looked OK as far as a switch implementation was 
concerned.

Let me apply the patch and try once before we commit this, please give me a few days,
Thanks,
Suri

> -----Original Message-----
> From: Hal Rosenstock [mailto:hrosenstock at xsigo.com]
> Sent: Sunday, October 14, 2007 7:43 AM
> To: Steve Welch
> Cc: 'Hal Rosenstock'; rdreier at cisco.com; general at lists.openfabrics.org; suri at baymicrosystems.com;
> ralph.campbell at qlogic.com
> Subject: RE: [ofa-general] [PATCH V3] infiniband/core: Enable loopback ofDR SMP responses from
> userspace
> 
> Hi Steve,
> 
> On Sat, 2007-10-13 at 12:18 -0500, Steve Welch wrote:
> > Hi Hal,
> >
> > >
> > > Looks pretty good. A few things below and a couple of nits embedded:
> > >
> > > I think the original description was more detailed and should be added
> > > to the above:
> >
> > When I submit the next revision I will update the description to put
> > the detail back in.
> 
> Thanks.
> 
> > > Signed-off-by: Steve Welch <swelch at systemfabricworks.com>
> > >
> > > My main concern is verifying this with the various HCA drivers
> > > (Mellanox (in normal HCA mode), iPath, and eHCA) as well as switches
> > > (Suri, can you try this ?) in addition to running this on a node where
> > > OpenSM resides (Sasha, can you try this ?). How much of this have you
> > > done ? Thanks.
> > >
> >
> > Good point, I think we are good with regard to the SM and mthca.
> > I have run the code with the mthca driver loaded in non-router mode,
> > and verified proper operation (ports can be brought up, so
> > process_mad() is handing off SMP requests to the internal SMA,
> > etc.).  I've also run the SM on that host, again local ports are
> > brought up and the SM is able to bring up the attached fabric. Local
> > user space utilities like smpquery operate normally for local and
> > remote queries using both directed route and LID routed addressing.
> >
> > However, I have not run on top of the iPath or eHCA.
> 
> I don't think this currently is utilized by eHCA as all this is done in
> firmware but there is at least one known switch implementation out there
> which should IMO be reverified with this change.
> 
> >   A quick code
> > inspection of the iPath driver indicates that the desired effect
> > will not be achieved with that driver in every case.  For the
> > SM info attribute it looks OK and is handled properly currently.
> > For DR SMP's with the GET_RESPONSE method the iPath driver returns
> > IB_MAD_RESULT_FAILURE instead of IB_MAD_RESULT_SUCCESS.
> > This will cause the core mad processing to drop the SMP MAD instead
> > of attempting to pass it on to a local agent.  Of course this
> > iPath behavior exists with or without this patch.  I'm not sure
> > why the iPath driver considers this a failure, it does not
> > consume or process the MAD in that case, but the MAD has passed
> > their incoming sanity checks.  The comment in this code indicates
> > they intended to do the right thing, but are just returning the
> > wrong status (see ipath_mad.c, process_subn()).
> 
> I don't know either but that could be a separate patch. Maybe Ralph
> could comment on this.
> 
> > I just don't think this is a code path that has been exercised
> > on iPath, it requires a user space SMA sendig DR SMP's responses
> > that must be locally loopbacked.  To get consistent behavior iPath
> > will need a change, but I do not have the hardware required to
> > make and test that change.
> >
> > I'm not sure about the eHca driver, it appears to not implement
> > the process_mad() IB device function.
> 
> Right; it currently does not expose QP0. It is all done in firmware.
> 
> > > >  }
> > > > +
> > > > +/*
> > > > + * Return IB_SMI_HANDLE if the SMP should be handled by the local
> > > SMA/SM
> > > > + * via process_mad
> > > > + */
> > > > +static inline enum smi_action smi_check_local_returning_smp(struct
> > > ib_smp *smp,
> > > > +                                                  struct ib_device
> > > *device)
> > >
> > > Nit. Not sure this lines up properly.
> > >
> > The function names are a little verbose and we're pushing 80 columns, so
> > the second parameter could not line exactly with the first without exceeding
> > the limit.  I can break the first line up if that is preferred.
> 
> I agree they are verbose but I think that makes them clearer. Maybe they
> can be shortened: Just make their names is_local_outgoing/returning_smp,
> perhaps ?
> 
> -- Hal
> 
> > Thanks for you feedback,
> > Steve
> >
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general




More information about the general mailing list