[ofa-general] [PATCH V3] infiniband/core: Enable loopback of DR SMP responses from userspace

Steve Welch swelch at systemfabricworks.com
Sat Oct 13 10:18:37 PDT 2007


Hi Hal,

> 
> Looks pretty good. A few things below and a couple of nits embedded:
> 
> I think the original description was more detailed and should be added
> to the above:

When I submit the next revision I will update the description to put
the detail back in.

> Signed-off-by: Steve Welch <swelch at systemfabricworks.com>
> 
> My main concern is verifying this with the various HCA drivers
> (Mellanox (in normal HCA mode), iPath, and eHCA) as well as switches
> (Suri, can you try this ?) in addition to running this on a node where
> OpenSM resides (Sasha, can you try this ?). How much of this have you
> done ? Thanks.
> 

Good point, I think we are good with regard to the SM and mthca.
I have run the code with the mthca driver loaded in non-router mode,
and verified proper operation (ports can be brought up, so
process_mad() is handing off SMP requests to the internal SMA,
etc.).  I've also run the SM on that host, again local ports are 
brought up and the SM is able to bring up the attached fabric. Local
user space utilities like smpquery operate normally for local and
remote queries using both directed route and LID routed addressing.

However, I have not run on top of the iPath or eHCA.  A quick code
inspection of the iPath driver indicates that the desired effect
will not be achieved with that driver in every case.  For the
SM info attribute it looks OK and is handled properly currently.
For DR SMP's with the GET_RESPONSE method the iPath driver returns
IB_MAD_RESULT_FAILURE instead of IB_MAD_RESULT_SUCCESS.
This will cause the core mad processing to drop the SMP MAD instead
of attempting to pass it on to a local agent.  Of course this
iPath behavior exists with or without this patch.  I'm not sure
why the iPath driver considers this a failure, it does not
consume or process the MAD in that case, but the MAD has passed
their incoming sanity checks.  The comment in this code indicates
they intended to do the right thing, but are just returning the
wrong status (see ipath_mad.c, process_subn()).

I just don't think this is a code path that has been exercised
on iPath, it requires a user space SMA sendig DR SMP's responses
that must be locally loopbacked.  To get consistent behavior iPath
will need a change, but I do not have the hardware required to
make and test that change.

I'm not sure about the eHca driver, it appears to not implement
the process_mad() IB device function.

> >  }
> > +
> > +/*
> > + * Return IB_SMI_HANDLE if the SMP should be handled by the local
> SMA/SM
> > + * via process_mad
> > + */
> > +static inline enum smi_action smi_check_local_returning_smp(struct
> ib_smp *smp,
> > +                                                  struct ib_device
> *device)
> 
> Nit. Not sure this lines up properly.
> 
The function names are a little verbose and we're pushing 80 columns, so
the second parameter could not line exactly with the first without exceeding
the limit.  I can break the first line up if that is preferred.

Thanks for you feedback,
Steve




More information about the general mailing list