[openib-general] [RFC] OpenSM Interactive Console

Hal Rosenstock halr at voltaire.com
Thu Oct 20 07:37:31 PDT 2005


On Thu, 2005-10-20 at 09:53, Troy Benjegerdes wrote:
> > > * Topology
> > 
> > This can be done via SA queries currently.
> > 
> > > * guid/lid/IPoIB address/switch port mappings
> > 
> > The SM does not know (see) IPoIB addresses. The only thing it sees is
> > the part of the subnet address.
> > 
> > The rest can be done via SA queries currently.
> > 
> > > * link state
> > 
> > This can be done via SA query currently.
> > 
> > This argues for a higher layer API to make these queries easy.
> > 
> > > Future neat things to do:
> > > 
> > > * An interface to dynamically partition the fabric
> > 
> > Is this referring to IB partitioning ?
> 
> I think so, but IB partitioning may not actually map to what I'm
> interested in.  From the high-level (application) point of view, I want to
> ensure that communication traffic for one cluster job minimally affects 
> another job.

Does the set of end nodes overlap between jobs? This might be done using
different SLs rather than different (IB) partitions, depending on the
requirement. In any case, there is more work here than just this API.
 
> > > * Register for notifications for certain events (excessive traffic
> > > 		queueing, or error counts)
> > 
> > Not sure what you mean by excessive traffic queuing.
> 
> I guess I'd like to know whenever utilization on a single link exceeds
> 90%, 

or whatever % you would want to be notified about (with sampling/polling
at some interval, assuming there is no IB-defined event for these).

> or the queuing delay exceeds XXX nanoseconds.

I think you are talking more in the abstract here. I need to think about
this one some more as to if/how to determine something like this for IB.
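
For the link utilization case (not the queuing delay one), the polling
approach could look roughly like the sketch below. It assumes the caller
has already read successive PortXmitData values for a port at a fixed
interval; how they are read is left out. The function names, the 90%
default, and the example data rate are illustrative only. PortXmitData
counts in units of 4 octets. The sketch does not handle counters that
saturate or are reset between samples.

```python
def link_utilization(prev_xmit_data, curr_xmit_data, interval_s, data_rate_bps):
    """Fraction of link capacity used between two PortXmitData samples."""
    # PortXmitData is counted in units of 4 octets, so convert to bits.
    delta_words = curr_xmit_data - prev_xmit_data
    bits_sent = delta_words * 4 * 8
    return bits_sent / (interval_s * data_rate_bps)


def intervals_over_threshold(samples, interval_s, data_rate_bps, threshold=0.90):
    """Return (interval index, utilization) for each interval above threshold.

    `samples` is a list of PortXmitData readings taken every `interval_s`
    seconds; interval i spans samples[i] to samples[i + 1].
    """
    events = []
    for i in range(len(samples) - 1):
        util = link_utilization(samples[i], samples[i + 1],
                                interval_s, data_rate_bps)
        if util > threshold:
            events.append((i, util))
    return events


if __name__ == "__main__":
    # Assumed example: 8 Gb/s data rate, one sample per second.
    readings = [0, 125_000_000, 360_000_000]
    for index, util in intervals_over_threshold(readings, 1.0, 8e9):
        print(f"interval {index}: utilization {util:.0%}")
```

Whatever does the polling would then generate its own event for each
interval reported, since IBA does not define one.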

> > It is the event set which is of interest to me. Are there others ?
> > 
> > There are a set of events which can be subscribed to currently. The ones
> > along these lines are local link integrity threshold reached on a port,
> > excessive buffer overrun threshold reached on a port, flow control and
> > update watchdog timer expired on a switch port.
> > 
> > If you are referring to the PortCounters, these would need to be polled
> > (at some periodicity) and then an event created as there is no event for
> > this defined in IBA.
> > 
> > Higher layer APIs could help with this area too.
> 
> Some of this stuff may not necessarily belong in the OpenSM process either.
> Stuff like getting IPoIB addresses from GUIDs would be useful in a
> library, but isn't the SM's responsibility.

There are a couple of approaches I can imagine for obtaining the
mappings of GUID to IPoIB address(es).

1. Vendor-specific MADs could be implemented for this, but this is ugly.
Interaction would be required to register and unregister each IPoIB
address with the vendor specific agent for this.

2. The OpenSM node would need to be on either all IPoIB subnets or those
of "interest". It could then do the equivalent of a broadcast ping on each
IPoIB subnet and match the ARP/neighbor entries with the GUID requested.
Note that the same GUID can have multiple IP addresses on the same or
different subnets.

A RARP based approach won't work as the QPN is also part of the IPoIB
hardware address.
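
As a rough illustration of the matching step in approach 2, here is a
sketch that assumes the neighbor entries have already been collected
into a mapping of IP address to IPoIB hardware address string (20
colon-separated hex bytes). The 20-byte layout used is one flags byte,
a 3-byte QPN, then the 16-byte GID, whose low 8 bytes are the port
GUID. The function names and the sample addresses are made up for the
example.

```python
def parse_ipoib_hwaddr(text):
    """Split a 20-byte IPoIB hardware address into (qpn, prefix, guid).

    The subnet prefix and port GUID are returned as lowercase hex strings.
    """
    raw = bytes(int(part, 16) for part in text.split(":"))
    if len(raw) != 20:
        raise ValueError("IPoIB hardware address must be 20 bytes")
    qpn = int.from_bytes(raw[1:4], "big")   # byte 0 carries flags
    subnet_prefix = raw[4:12].hex()         # upper half of the GID
    port_guid = raw[12:20].hex()            # lower half of the GID
    return qpn, subnet_prefix, port_guid


def ips_for_guid(neighbors, guid_hex):
    """Return every IP address whose neighbor entry carries this GUID."""
    wanted = guid_hex.lower().replace(":", "")
    return sorted(ip for ip, hwaddr in neighbors.items()
                  if parse_ipoib_hwaddr(hwaddr)[2] == wanted)


if __name__ == "__main__":
    # Hypothetical neighbor table: the same GUID appears with two IPs.
    table = {
        "10.0.0.5": "80:00:04:04:fe:80:00:00:00:00:00:00:00:02:c9:02:00:00:12:35",
        "10.1.0.5": "80:00:04:05:fe:80:00:00:00:00:00:00:00:02:c9:02:00:00:12:35",
        "10.0.0.9": "80:00:04:04:fe:80:00:00:00:00:00:00:00:02:c9:02:00:00:99:99",
    }
    print(ips_for_guid(table, "0002c90200001235"))
```

Note that the two entries for the same GUID differ in QPN, so the GUID
alone does not determine the hardware address.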

-- Hal