[openib-general] First Multicast Leave disconnects all other clients

Eitan Zahavi eitan at mellanox.co.il
Thu Dec 1 07:41:20 PST 2005


Hi Hal,

SRP uses InformInfo to get notified about new or lost ports (traps
64/65), so that new targets are recognized without periodic SA queries.
I do not know if that code has already found its way into OpenIB.
I do not think that is relevant to this discussion about missing APIs,
except maybe to the priority of implementation. But IMO, until we
provide those missing capabilities we are actually preventing SRP and
other ULPs from doing the right thing, and causing them to duplicate
"Client Reregistration" handlers and periodic queries.
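
To make the duplication point concrete, here is a rough sketch of the
kind of shared multicast tracking discussed in the quoted thread below.
It is only an illustration under stated assumptions: McastTracker,
sa_join and sa_leave are hypothetical names (not existing OpenIB
interfaces), the two callables stand in for the real SA requests, and
join states are ignored.

```python
from collections import defaultdict


class McastTracker:
    """Hypothetical per-port tracker; not an existing OpenIB interface."""

    def __init__(self, sa_join, sa_leave):
        # sa_join / sa_leave are caller-supplied stand-ins for the real
        # SA join and leave requests (assumption for illustration).
        self._sa_join = sa_join
        self._sa_leave = sa_leave
        self._members = defaultdict(set)  # mgid -> set of local client ids

    def join(self, client, mgid):
        first = not self._members[mgid]
        self._members[mgid].add(client)
        if first:
            # Only the first local member triggers a join on the wire.
            self._sa_join(mgid)

    def leave(self, client, mgid):
        members = self._members.get(mgid)
        if not members or client not in members:
            return
        members.discard(client)
        if not members:
            # Only the last local member triggers a leave on the wire.
            del self._members[mgid]
            self._sa_leave(mgid)

    def cleanup(self, client):
        # Drop every registration held by a dead process or module.
        for mgid in [g for g, m in self._members.items() if client in m]:
            self.leave(client, mgid)

    def reregister(self):
        # On ClientReregistration, replay one join per group still in use.
        for mgid in list(self._members):
            self._sa_join(mgid)
```

With this shape, the first client to leave a group no longer
disconnects the others, and the re-registration handler lives in one
place instead of in every ULP.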

The bottom line: Do you agree we are missing these APIs?
When can we get those done? By whom?

EZ

Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com]
> Sent: Thursday, December 01, 2005 8:20 AM
> To: Eitan Zahavi
> Cc: OPENIB GENERAL; Yael Kalka; Aviram Gutman; Tziporet Koren;
> Roland Dreier; sean.hefty at intel.com
> Subject: RE: [openib-general] First Multicast Leave disconnects all
> other clients
> 
> On Thu, 2005-12-01 at 01:07, Eitan Zahavi wrote:
> > > >
> > > > The bottom line:
> > > > We are missing 3 agents in the OpenIB stack:
> > > > InformInfo - handling registrations and Report dispatching
> > >
> > > These are not currently used.
> > [EZ] They are by SRP initiator.
> 
> Not the SRP initiator in OpenIB svn as far as I can tell.
> 
> > > > ServiceRecord - tracks registrations
> > >
> > > ServiceRecord is implemented in sa_query (and was used by AT/uAT
> > > but that is largely historical now)
> > >
> > > > Multicast Join/Leave - tracking registrations to multicast
> > > > groups and ref-counting
> > > >
> > > > All these agents should be able to cleanup dead client
> > > > registrations and also provide re-registration in case of SM
> > > > ClientReregistration event.
> > >
> > > In OpenIB, any Set of PortInfo (which includes ClientReregister)
> > > currently causes a (coarse) event (LID change) which causes IPoIB
> > > client to reregister its multicasts registrations with the SA.
> > >
> > > > Please see below
> > > > > >
> > > > > > It seems the IBTA intent was that the IB driver will be
> > > > > > responsible for maintaining the list of clients
> > > > > > registered to each group.
> > > > >
> > > > > Yes, the end node is responsible for tracking the
> > > > > registrations within the node and fabricating responses when
> > > > > the node does not want to leave.
> > > > > Is delete a different case though ?
> > > > [EZ] No it is not. Delete of multicast group is really the last
> > > > leave.
> > >
> > > There is an explicit delete. While it shouldn't be needed to be
> > > forced, there is always some scenario where this is useful.
> > [EZ] To my best knowledge any leave is a "delete" so there is no way
> > for any client to force other members out of a group. It can only
> > leave itself. The delete will happen when the last will leave.
> 
> Yes, you are right, other than the last full member (join state) rule.
> 
> > > > > > But the IB core does not track what clients registered
> > > > > > (through SA requests) to a particular multicast group.
> > > > > > The first client to leave the group causes the rest (of the
> > > > > > clients) to be disconnected.
> > > > >
> > > > > This is an implementation issue IMO and applies to other
> > > > > subscriptions too (not just limited to multicast).
> > > > [EZ] I agree it is an implementation issue. I hope it will get
> > > > implemented in OpenIB.
> > >
> > > It will. It's a question of priorities and timing.
> > >
> > > > > > My proposal is to provide an API for such registrations at
> > > > > > both user and kernel and track the requesting processes.
> > > > > > Cleanup is also required both by process and kernel module
> > > > > > granularity.
> > > > >
> > > > > Is the API the SA client request itself for this ? Shouldn't
> > > > > the tracking be done there (within sa_query.c) ?
> > > > [EZ] It will be hard to sniff the MADs (especially user level)
> > > > for all the registration flows.
> > >
> > > It's not the sniffing which is hard but perhaps identifying which
> > > client (and reference counting).
> > >
> > > > So I propose we should have
> > > > ib_join/ib_leave/ib_reg_svc/ib_unreg_svc/ib_reg_inform/ib_unreg_inform.
> > > > Both in user land and in kernel.
> > >
> > > I think this is TBD and the API would be discussed on this list
> > > first prior to any implementation.
> > >
> > > > > > BTW: The same API could also handle "Client Reregistration"
> > > > > > for multicast groups,
> > > > >
> > > > > Client reregistration is for all subscriptions (including
> > > > > ServiceRecords and events as well).
> > > > [EZ] Yes exactly. I believe similar problem exists for all
> > > > registrations.
> > > > >
> > > > > > such that we could avoid the need to have that code
> > > > > > duplicated by every client.
> > > > >
> > > > > I'm missing how client reregistration would help here. Can you
> > > > > elaborate ?
> > > > [EZ] It is related to the reference tracking:
> > > > If a kernel module tracks all registrations to refcount them and
> > > > perform cleanup, it could with similar effort also send the
> > > > re-registration in the event of SM change ...
> > >
> > > Sure, there are multiple ways to skin the same cat.
> > >
> > > > >
> > > > > > But this refers to yet another API that is missing: Report
> > > > > > dispatching which deserves its own mail...
> > > > >
> > > > > I'm missing the connection between reregistration and report
> > > > > dispatching.
> > > > [EZ] Sorry for not being verbose. The need for Events dispatcher
> > > > is based on the fact that only one client should respond to
> > > > Report with ReportRepress. Reports are "unsolicited" MADs coming
> > > > into the device. In umad the implementation prevents any
> > > > "multiple" client registration for receiving any "unsolicited"
> > > > MAD - only one class-agent needs to be there handling
> > > > "unsolicited" messages. This is fine - but what it means is
> > > > that when two clients wants to be notified about events they
> > > > should register with that agent and the agent should be able to
> > > > dispatch the message to all registered clients as well as send
> > > > only one response back.
> > >
> > > Wouldn't report represses be reference counted and only actually
> > > sent on the wire when all subscribed clients within the node
> > > indicated repress ?
> > [EZ] As you say there are many ways to skin a cat. I am not sure we
> > need to wait for all clients as they are located on the same node
> > and will be surely notified.
> 
> Right, it just needs to be done once whether it was actually delivered
> to any client, clients, or none at all.
> 
> -- Hal
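
The dispatching behavior agreed on at the end of the thread above (fan
the Report out to every local subscriber, answer on the wire exactly
once) could be sketched roughly as follows. This is an assumption-laden
illustration, not an existing OpenIB or umad interface:
ReportDispatcher and send_repress are hypothetical names, and the
callable stands in for building and sending the real ReportRepress MAD.

```python
class ReportDispatcher:
    """Hypothetical single class-agent that owns unsolicited Reports."""

    def __init__(self, send_repress):
        # send_repress is a caller-supplied stand-in for sending one
        # ReportRepress back to the SA (assumption for illustration).
        self._send_repress = send_repress
        self._subs = {}  # trap number -> list of client callbacks

    def register(self, trap, callback):
        self._subs.setdefault(trap, []).append(callback)

    def unregister(self, trap, callback):
        callbacks = self._subs.get(trap, [])
        if callback in callbacks:
            callbacks.remove(callback)

    def on_report(self, trap, notice):
        # Deliver to every local subscriber (possibly none)...
        for callback in list(self._subs.get(trap, [])):
            callback(notice)
        # ...then respond once, regardless of how many were notified.
        self._send_repress(trap)
```

Two clients registering for trap 64 would both see the notice, while
only one response goes back on the wire.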


