[openib-general] [RFC] Notice/InformInfo event reporting
Rimmer, Todd
trimmer at silverstorm.com
Mon Oct 16 12:50:21 PDT 2006
> From: Sean Hefty
> Sent: Monday, October 16, 2006 3:29 PM
> To: openib
> Subject: [openib-general] [RFC] Notice/InformInfo event reporting
>
> I'm beginning work on adding InformInfo/Notice event reporting to the IB
> stack, and I'd like any input on potential implementations, as well as
> intended usage.
>
> Clients use InformInfo to register for events, with registration tracked
> on a per source QP basis. Given this, possible approaches are:
>
> 1. Clients can perform their own registration using their own QPs. If
> several clients wish to register for the same event, multiple QPs would
> be used. Additional traffic would be used when reporting events. But,
> event dispatching is centralized to the SA.
>
> 2. A single registration manager can perform all registrations. This
> would require reference counting registration requests. At a high level,
> the behavior is similar to what's done for multicast join/leave. This
> limits use to a single QP, and minimizes traffic, but duplicates event
> dispatching code on every node.
>
> 2a. Using option 2, a registration manager could register to receive all
> events, then filter based on local registration requests. This would
> prevent overlapping requests to the SA, but increase the number of events
> seen at each end node.
>
> 2b. Similar to option 2a, but clients would see all events (possibly
> filtered on type only), requiring that they perform additional filtering.
>
> My current thinking is to register for all events, then require that
> clients filter unwanted events. (Security events would be filtered from
> userspace clients.)

My recommendation is option 2.

In large fabrics the SA can be a bottleneck, so it is best for each end
node to register with the SA only for the events that are of actual
interest to that node.

With regard to "duplicating dispatching code on every node": rather than
duplication, think of this as "distributing event dispatching code among
the interested nodes". Seen in those terms, option 2 stands out as the
more scalable choice.
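
For illustration only, here is a minimal sketch of what option 2 could look
like: one per-node manager that reference-counts local registrations, sends
the SA a request only on the first registration and the last unregistration
for an event, and dispatches incoming notices locally. Every name here
(FakeSA, RegistrationManager, register, unregister, on_notice, the
"gid_in_service" label) is hypothetical and is not part of any existing IB
stack API. Python is used only for brevity.

```python
from collections import defaultdict


class FakeSA:
    """Hypothetical stand-in for the SA; it only counts requests it receives."""

    def __init__(self):
        self.subscribed = set()
        self.requests = 0

    def subscribe(self, event_type):
        self.requests += 1
        self.subscribed.add(event_type)

    def unsubscribe(self, event_type):
        self.requests += 1
        self.subscribed.discard(event_type)


class RegistrationManager:
    """Single per-node manager (option 2): reference-counted registrations."""

    def __init__(self, sa):
        self.sa = sa
        # event_type -> list of local callbacks; the list length is the refcount
        self.clients = defaultdict(list)

    def register(self, event_type, callback):
        # Only the first local client for an event causes an SA request.
        if not self.clients[event_type]:
            self.sa.subscribe(event_type)
        self.clients[event_type].append(callback)

    def unregister(self, event_type, callback):
        self.clients[event_type].remove(callback)
        # Only the last local client leaving causes an SA request.
        if not self.clients[event_type]:
            self.sa.unsubscribe(event_type)

    def on_notice(self, event_type, payload):
        # Dispatching happens on the end node, not at the SA.
        for callback in list(self.clients.get(event_type, [])):
            callback(payload)


sa = FakeSA()
mgr = RegistrationManager(sa)
seen_a, seen_b = [], []
cb_a, cb_b = seen_a.append, seen_b.append

mgr.register("gid_in_service", cb_a)
mgr.register("gid_in_service", cb_b)       # no second request to the SA
requests_after_register = sa.requests
mgr.on_notice("gid_in_service", "notice 1")  # delivered to both local clients

mgr.unregister("gid_in_service", cb_a)
still_subscribed = "gid_in_service" in sa.subscribed
mgr.unregister("gid_in_service", cb_b)     # last client gone: SA is told
```

In this sketch, two local clients interested in the same event cost the SA
one registration and one notice per event, and the per-node dispatch loop is
the "distributed" dispatching code referred to above.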
Todd Rimmer