[openib-general] [PATCH] IB/ipoib: use appropriate path selector

Michael S. Tsirkin mst at mellanox.co.il
Thu Sep 14 04:14:03 PDT 2006


Quoting r. Hal Rosenstock <halr at voltaire.com>:
> Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> 
> On Thu, 2006-09-14 at 00:46, Michael S. Tsirkin wrote:
> > Quoting r. Hal Rosenstock <halr at voltaire.com>:
> > > Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> > > 
> > > On Wed, 2006-09-13 at 18:08, Michael S. Tsirkin wrote:
> > > > Quoting r. Roland Dreier <rdreier at cisco.com>:
> > > > > Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> > > > > 
> > > > >     Michael> IPoIB in linux needs 2K MTU. Therefore it must set mtu
> > > > >     Michael> selector in path record query accordingly.
> > > > > 
> > > > > Umm -- why does it need a 2K MTU?  As far as I know it should work
> > > > > fine with any MTU, assuming the SA sets the MTU of the broadcast
> > > > > multicast group correctly.
> > > > 
> > > > Hmm, you are right, it is just that existing implementations all
> > > > set that to 2K.
> > > 
> > > By default yes. It can be configured.
> > > 
> > > > But there is a silent assumption that MTU of any path is >= broadcast
> > > > multicast group MTU, and this is what I want to fix.
> > > 
> > > The spec says:
> > > "The value (for IB MTU) assigned to the broadcast-GID must not be
> > > greater than any physical link MTU spanned by the IPoIB subnet".
> > > so if the broadcast group is improperly setup not to follow this, there
> > > will be other issues.
> > 
> > Correct. IPoIB uses broadcast group MTU to get the value reported to
> > Linux. If some link has a lower MTU IPoIB can not use it.
> > 
> > > It doesn't need to be included in the PR request.
> > 
> > I disagree here. If you do not set selector, SA is free to return
> > a path with lower MTU even though physical link allows higher MTU.
> > Does it say otherwise somewhere?
> 
> No but isn't this relying on using PRs in a certain way by IPoIB
> implementations (and any other UD application) v. connected apps ?

Not really.

Tavor is faster with 1K MTU than with 2K MTU, regardless of whether the
transport is connected or UD. So, to me, it makes sense for the SM to choose
1K if Tavor is involved, unless the application requested otherwise.

If an application (again, whether connected or UD) needs a specific MTU, it
should use the MTU selector in the path query. If it does not, the SM is free
to choose any MTU supported by the link, for best performance. If one end is
Tavor, this happens to be 1K and not the maximum MTU.

So what we have here is an IPoIB bug: it requires that path MTU >= broadcast
group MTU, but does not pass this information in the query. This only happens
to work if the SM always selects the maximum link MTU for each path query.
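
To make the argument concrete, here is a minimal, self-contained sketch of
how an SM with a 1K preference might answer a path query with and without an
MTU selector. It is a toy model, not kernel or SM code: sm_pick_path_mtu(),
struct path_query and ipoib_path_usable() are hypothetical names, and the SM
policy is only an assumption. The selector values follow the PathRecord
MTUSelector encoding as I understand it (0 = greater than, 1 = less than,
2 = exactly, 3 = largest available).

```c
#include <stddef.h>

/* IB MTU encodings: 1 = 256 bytes ... 5 = 4096 bytes. */
enum mtu_enc { MTU_256 = 1, MTU_512, MTU_1024, MTU_2048, MTU_4096 };

/* PathRecord MTUSelector values (assumed encoding, see lead-in). */
enum mtu_selector { SEL_GT = 0, SEL_LT = 1, SEL_EQ = 2, SEL_BEST = 3 };

/* Hypothetical stand-in for the MTU part of a path record query. */
struct path_query {
    int has_mtu;            /* nonzero if the MTU component is specified */
    enum mtu_selector sel;  /* how to compare against mtu */
    int mtu;                /* MTU encoding the selector refers to */
};

/*
 * Toy SM policy: prefer sm_preferred (e.g. 1K when a Tavor is involved),
 * capped by the link maximum, unless the query constrains the MTU.
 * Returns the chosen MTU encoding, or 0 if no path satisfies the query.
 */
int sm_pick_path_mtu(int link_max, int sm_preferred,
                     const struct path_query *q)
{
    int pref = sm_preferred < link_max ? sm_preferred : link_max;

    if (q == NULL || !q->has_mtu)
        return pref;        /* no constraint: SM is free to choose */

    switch (q->sel) {
    case SEL_GT:
        if (link_max <= q->mtu)
            return 0;
        return pref > q->mtu ? pref : q->mtu + 1;
    case SEL_LT:
        if (q->mtu <= MTU_256)
            return 0;
        return pref < q->mtu ? pref : q->mtu - 1;
    case SEL_EQ:
        return q->mtu <= link_max ? q->mtu : 0;
    case SEL_BEST:
        return link_max;
    }
    return 0;
}

/* IPoIB's implicit requirement: path MTU >= broadcast group MTU. */
int ipoib_path_usable(int path_mtu, int bcast_mtu)
{
    return path_mtu >= bcast_mtu;
}
```

With a 2K link, a 2K broadcast group and a 1K SM preference, a query without
the MTU component gets a 1K path in this model, which IPoIB cannot use. A
query for exactly 2K, or for greater than 1K, gets a 2K path.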

Makes sense?

-- 
MST



