[ofa-general] Re: IPoIB path caching
Sean Hefty
mshefty at ichips.intel.com
Tue Jul 24 09:29:12 PDT 2007
> Linux has a quite sophisticated mechanism to maintain / cache / probe /
> invalidate / update the network stack L2 neighbour info.
Path records are not just L2 info. They contain L4, L3, and L2 info
together.
> For example, in the Voltaire gen1 stack we had an ib arp module which
> was used by both IPoIB and native IB ULPs (SDP, iSER, Lustre, etc). This
> module managed some sort of path cache, where IPoIB was always asking for
> a non-cached path and other ULPs were willing to get a cached path.
IMO, using a cached AH is no different than using a cached path. You're
simply mapping the PR data into another structure.
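
To make the cached vs. non-cached distinction concrete, here is a minimal sketch of a path cache in which one consumer can always force a fresh query while the others accept a cached record. It is an illustration only, not the Voltaire module or the patch set under discussion: `PathRecord`, `PathCache`, `query_sa`, `allow_cached`, and the TTL-based expiry are all hypothetical names and assumptions, and the record carries only a small made-up subset of fields.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class PathRecord:
    # Hypothetical subset of path record fields, for illustration only.
    dgid: str
    dlid: int
    sl: int
    mtu: int
    rate: int


class PathCache:
    """Toy path record cache with a per-lookup bypass (assumed design)."""

    def __init__(self,
                 query_sa: Callable[[str], PathRecord],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query_sa = query_sa      # stand-in for a real SA query
        self._ttl = ttl_seconds        # assumed expiry policy
        self._clock = clock
        self._entries: Dict[str, Tuple[PathRecord, float]] = {}

    def lookup(self, dgid: str, allow_cached: bool = True) -> PathRecord:
        now = self._clock()
        if allow_cached:
            hit = self._entries.get(dgid)
            if hit is not None and hit[1] > now:
                return hit[0]          # served locally, no SA traffic
        # Cache miss, expired entry, or caller insisted on a fresh path.
        record = self._query_sa(dgid)
        self._entries[dgid] = (record, now + self._ttl)
        return record
```

In this sketch, a consumer that behaves as IPoIB is described above would call `lookup(dgid, allow_cached=False)`, while the other ULPs would use the default. Note that a forced query still refreshes the entry, so the other consumers benefit from it.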
We're ignoring the real problem here, which is that a centralized SA
doesn't scale. MPI stacks have largely ignored this problem by simply
not doing path record queries. Path information is often hard-coded,
with QPN data exchanged out of band over sockets (often over Ethernet).
We've seen problems running large MPI jobs without PR caching. I know
that Silverstorm/QLogic did as well. And apparently Voltaire hit the
same type of problem, since you added a caching module. (Did Mellanox
and Topspin/Cisco create PR caches as well?) At least three companies
working on IB came up with the same solution. What is the objection to
the current patch set?
- Sean