[ofa-general] Re: [Query] ib add path record cache
Devesh Sharma
devesh28 at gmail.com
Wed May 23 07:27:55 PDT 2007
On 21 May 2007 13:52:11 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
> On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> > On 18 May 2007 06:21:05 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
> > > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
> > > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > > On 5/17/07, Sean Hefty <mshefty at ichips.intel.com> wrote:
> > > > > > > > But initially this will generate a packet for each path, while sys
> > > > > > > > admin knows that path is there and he can hard-code the entries for
> > > > > > > > it. Other thing is that why Admin will care about creating such record
> > > > > > > > while SA is itself taking care, right?
> > > > > > >
> > > > > > > In your original message you asked about adding 'dummy entries' to the
> > > > > > > cache. I agree that pre-loading the cache can be useful. What I still
> > > > > > > am not understanding is the reasoning for adding 'dummy entries'. By
> > > > > > > 'dummy entries', I've been assuming that these are invalid path records,
> > > > > > > but maybe that's not what you meant.
> > > > > > Ok, if the term "dummy entries" has created confusion then I am
> > > > > > sorry for that. What I mean is that those are valid path
> > > > > > records which the administrator knows in advance, and while
> > > > > > loading the module,
> > > > >
> > > > > How does the admin know they are valid ?
> > > > Depending on the initial application runs, some trusted PRs can be generated.
> > >
> > > What do initial application runs have to do with this ?
> > My understanding is that, once the cluster is UP, and if between Node
> > A and Node B there is only one path,
>
> So this is a feature for such one path subnets. I wonder what percentage
> of deployed subnets fits this case.
You never know. It may also be useful for debugging.
>
> > then, SA query always going to return same values in PR.
>
> If subnet topology is changed, these PRs might change. There are other
> cases where they change too.
I am not sure about that... any suggestions?
>
> > On this basis Initial application runs will generate PRs,
>
> That's what confused me before (Applications don't generate PRs but
> rather request them.) but I think I see what you mean now.
Ok
>
> > these PRs can be saved in some file, and can be loaded
> > when cache_module comes in.
> > >
> > > > >Are they somehow preconfigured at the SM ?
> > > > I am not sure whether the SM has any such provision.
> > >
> > > Not that I'm aware of.
> > Ok, So, currently no such support is there in SM?
>
> I can speak definitively for OpenSM and there is no such support. As to
> the vendor SMs, I don't think so but don't know for absolute certainty.
> Someone can correct me if I'm wrong but I wouldn't assume no response
> means correctness as some may not be listening nor want to respond as to
> "value added" vendor specific features.
Would there be any issue with OpenSM providing this?
>
> > > > Also not sure about the
> > > > role of SM in path resolving. I mean once node has initiated SA query,
> > > > whether SM has some database to reply SA or On the fly destination
> > > > node is contacted to get the requested path record?
> > >
> > > SMs can either calculate the SA PRs on the fly based on the routing
> > > algorithm in use and some other things or put them in a local database.
> > > This is up to that SM.
> > Ok
> > >
> > > Destination node is not contacted in the SA PR query process.
> > >
> > > > >Doesn't each SM have its own policy for generating valid PRs ?
> > > > Ultimately path record is in Path_Record object format, and SA cache
> > > > is going to store in a fixed manner, How generation policy matters?
> > >
> > > What if the local policy loaded does not agree with what the SM would
> > > generate for a particular PR ? One then gets a local error which will
> > > need to be tracked down. Not so easy IMO.
> > Do the SM policies for generating PRs in a subnet change dynamically, at run time?
>
> The policy doesn't change dynamically but the data to be returned in the
> SA PR response might.
>
> > If not, then depending on the local SM policy, static PRs can be
> > generated to load initially.
>
> Just as one question related to this, how would link failures be handled
> ? There are others.
It's just a matter of avoiding the initial PR query packets by loading
the cache with static PRs. Later on the cache module will function in
the normal fashion. I expect that, initially, everything will come up
in a trusted cluster.
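
To make the idea concrete, here is a rough user-space model in Python of a
cache that starts out preloaded and then behaves normally. It is only a
sketch and not the sa_cache code: the class name, the `query_sa` callback,
the `ttl` value and the dict-based records are all made up for illustration.

```python
import time


class PathCache:
    """Toy model of a path record cache keyed by destination GID."""

    def __init__(self, query_sa, ttl=60.0, clock=time.monotonic):
        self.query_sa = query_sa  # hypothetical stand-in for an SA PR query
        self.ttl = ttl            # hypothetical entry lifetime in seconds
        self.clock = clock
        self.entries = {}         # dgid -> (path_record, expiry_time)

    def preload(self, records):
        """Install admin-supplied records without contacting the SA."""
        expiry = self.clock() + self.ttl
        for dgid, pr in records.items():
            self.entries[dgid] = (pr, expiry)

    def lookup(self, dgid):
        entry = self.entries.get(dgid)
        if entry is not None and self.clock() < entry[1]:
            return entry[0]           # hit: no SA query packet is sent
        self.entries.pop(dgid, None)  # stale or missing: drop the entry
        pr = self.query_sa(dgid)      # normal path: ask the SA
        if pr is not None:
            self.entries[dgid] = (pr, self.clock() + self.ttl)
        return pr
```

In this model a preloaded entry only saves the first query. Once it times
out, the next lookup goes to the SA as usual, so a stale static PR does
not live forever.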
>
> > > > CMIIW. Also I am assuming a homogeneous cluster where certain
> > > > parameters can be assumed to be same always.
> > >
> > > and always in agreement with what the SM would return ? For example,
> > yes
> > > what happens when a link goes down and the end node is no longer
> > > reachable ?
> > If node is not reachable then, after first timeout of sa_cache, that
> > entry will be removed from cache.
>
> OK; that's another aspect to add into this feature. I don't think that
> is currently done. I think there would need to be an API added to do
> this.
Yes, this has been discussed with Sean. We can add a char_dev
interface to the existing sa_cache module implementation. The write
entry point will generate an SA_PR_response packet, and this packet
will be passed to the update_cache() function.
We also need to remove the initial schedule_update() call in the
add_one() function.
A user command is also required to read from a user file and write
to this device.
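
A possible shape for that user command, sketched in Python. Everything
specific here is an assumption: the text file format, the packed record
layout (which is NOT the real SA PathRecord MAD layout) and the device
name are invented for illustration. The only thing taken as given is that
GIDs are 128 bits and can be written like IPv6 addresses.

```python
import ipaddress
import struct

# Hypothetical packed layout for one preloaded record (network byte order):
# sgid(16) dgid(16) slid(u16) dlid(u16) pkey(u16) sl(u8) mtu(u8) rate(u8)
RECORD = struct.Struct("!16s16sHHHBBB")


def parse_line(line):
    """Turn 'sgid dgid slid dlid pkey sl mtu rate' into one packed record."""
    sgid, dgid, slid, dlid, pkey, sl, mtu, rate = line.split()
    return RECORD.pack(
        ipaddress.IPv6Address(sgid).packed,
        ipaddress.IPv6Address(dgid).packed,
        int(slid, 0), int(dlid, 0), int(pkey, 0),
        int(sl, 0), int(mtu, 0), int(rate, 0),
    )


def load_records(text):
    """Parse the file contents, skipping blank lines and '#' comments."""
    records = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            records.append(parse_line(line))
    return records


def preload(text, dev):
    """Write each record with its own write() call; return the count."""
    records = load_records(text)
    for rec in records:
        dev.write(rec)
    return len(records)
```

With a hypothetical device node such as /dev/sa_cache, the command would
open it unbuffered (`open(path, "wb", buffering=0)`) and call `preload()`
with the file contents, so that each record reaches the write entry point
as a separate write.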
>
> -- Hal
>
> > > > > >are these from a live SM and just loaded "out of band" to
> > > > > >bypass/preclude the SA PR mechanism ?
> > > > may be
> > >
> > > Even if they are, there is still the changes in the subnet issue.
> > >
> > > -- Hal
> > >
> > > > > -- Hal
> > > > >
> > > > > > Admin is loading this info in the cache with user command.
> > > > > > >
> > > > > > > > Another point I want to know is,
> > > > > > > > When local_sa_cache module will be inserted? After SM comes up or
> > > > > > > > Before SM comes up?
> > > > > > >
> > > > > > > It can occur either way. There is no restriction. The cache responds
> > > > > > > to port up and GID in/out of service events to update itself.
> > > > > > Do you mean cache module will start building cache only after Port is UP?
> > > > > > >
> > > > > > > > If Its inserted before SM is coming up (I am assuming SM is running on
> > > > > > > > some node not on switch) then First Forced schedule_update() is
> > > > > > > > wasted, and for the first application presence of cache is
> > > > > > > > meaningless. Why not to keep cache effective right from the start?
> > > > > > >
> > > > > > > Pre-loading the cache with path records doesn't guarantee that those
> > > > > > > paths are usable. If the SM has not come up, then the path records will
> > > > > > > be unusable until the SM configures the subnet, plus there's no
> > > > > > > guarantee that the remote endpoints specified by the paths are running.
> > > > > > You mean there is no guarantee that even if SM is UP and we have some
> > > > > > hard coded entries of path record corresponding to some node X, we are
> > > > > > not sure that node X has actually come up or not? In that case
> > > > > > actually that path resolving should fail if node has not come up, but
> > > > > > with the hard coding still path will be resolved?
> > > > > > >
> > > > > > > The main benefit I see to pre-loading the cache is to avoid SA storms
> > > > > > > when booting a large cluster.
> > > > > > that's true. Also cache will get valid entries only if network is
> > > > > > configured by SM otherwise every node SA will, possibly, drop SA
> > > > > > packets.
> > > > > > >
> > > > > > > - Sean
> > > > > > >