[ofa-general] Re: [RFC] [PATCH 0/3] 2.6.22 or 23 ib: add path record cache

Hal Rosenstock halr at voltaire.com
Wed Apr 25 13:24:01 PDT 2007


On Mon, 2007-04-23 at 17:25, Jason Gunthorpe wrote:
> On Mon, Apr 23, 2007 at 11:20:59PM +0300, Michael S. Tsirkin wrote:
> 
> > I haven't thought this through yet. Basically, I just note that
> > caching the path until GID goes out of service isn't right - since
> > path parameters such as MTU or rate might change without GID going
> > out of service.
> > 
> > So what to do?
> 
> Has anyone thought about using replication rather than caching to
> solve this problem?

Unfortunately, IMO, the IBTA punted on database replication for SA.

>  It seems to me it would be a lot faster for some
> single process in the network to fetch and keep a copy of the entire
> SA route database, format it into a binary format and use RC RDMA to
> transfer it to every node each time it changes.

Not sure one can rely on RC RDMA. Not all SMs are built on top of ports
capable of this. I think UD is the only requirement there (switch port
0). One could have a CA-based server node as an intermediary, though.

-- Hal

> For say, 10000 nodes you could compact an any-to-any path table into
> around 20 megabytes.
> 
> The RDMA transfers would be arranged into a waterfall, source
> transfers to 8 nodes, who then each transfer to 8, etc. Choosing a
> connection topology that overlays the switch topology would give this
> scheme a huge aggregate bandwidth so the total transfer time would be
> short.
> 
> Unfortunately the SA protocol doesn't seem to have many provisions for
> cache-coherence so it seems any form of route caching is going to run
> into problems with stale data :< Replication adds a coherence
> mechanism and shifts the problem to the replication source, which,
> ideally, would ultimately be tightly connected to the SA.
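
As a rough sanity check on the figures quoted above, the sketch below works
out the per-pair budget implied by a 20 megabyte any-to-any table for 10000
nodes, and how many waterfall stages a fan-out of 8 needs to reach every
node. The function names are hypothetical, 1 megabyte is taken as 10^6 bytes,
and the model assumes each node forwards to exactly 8 new nodes once it holds
the table; none of this comes from the thread itself.

```python
def bits_per_pair(num_nodes, table_bytes):
    """Average bits available per ordered (source, destination) pair."""
    return table_bytes * 8 / (num_nodes * num_nodes)


def waterfall_stages(num_receivers, fanout):
    """Stages needed when every node in a stage forwards to `fanout` new nodes.

    Stage 1 is the source sending to `fanout` nodes, stage 2 is those nodes
    each sending to `fanout` more, and so on (assumed model, see above).
    """
    reached = 0
    stage_size = 1  # the source itself
    stages = 0
    while reached < num_receivers:
        stage_size *= fanout
        reached += stage_size
        stages += 1
    return stages


if __name__ == "__main__":
    print(bits_per_pair(10000, 20 * 10**6))   # average bits per node pair
    print(waterfall_stages(10000, 8))         # stages for fan-out 8
```

Under these assumptions the budget is under two bits per ordered pair, far
less than a full path record, so the table would presumably have to share
path attributes across many pairs. With a fan-out of 8, four stages reach
4680 nodes and a fifth covers all 10000.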

> > We could use DR SMPs to do network discovery and at least check that
> > paths are valid - it's not too much code (ibnetdiscover is just 800
> > lines) and in a sense, that's actually putting an *SA* (not just
> > cache) in each node.  Combined with GID IN/OUT notices we could get
> > away from querying path records completely.
> 
> I don't think you can find/check the SL like this, plus I doubt the
> little CPUs in the switches can handle that rate of SMPs. :<
> 
> Jason



