[ofa-general] [PATCH] opensm: Parallelize (Stripe) LFT sets across switches
Jason Gunthorpe
jgunthorpe at obsidianresearch.com
Wed Jul 22 17:51:59 PDT 2009
On Wed, Jul 22, 2009 at 08:28:25PM -0400, Hal Rosenstock wrote:
> > But you overload the switch the SM is connected to with processing
> > N*limit DR SMPs rather than just 'limit' SMPs. That is what concerns
> > me.
>
> As I said, the current algorithm is worse as it sends N*no limit DR
> SMPs (where no limit means any needed blocks). Not sure that VL15
> droppage due to this has been identified. So I think this improves on
> what's been deployed and seemingly works in OpenSM for quite some time
> now.
Hmm, OK I didn't realize that.
I've heard reports of VL15 droppage in real networks; maybe this is why.
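
To make the counts in this exchange concrete, here is a minimal sketch of striping LFT block sets across switches with a per-switch cap on outstanding SMPs. It is not OpenSM code: the function name stripe_lft_sets, the switch labels, and the simplification that the SM waits for a whole round of responses before sending the next round are all assumptions for illustration. It shows that with N switches and a cap of 'limit', up to N*limit DR SMPs can pass through the SM's local switch at once.

```python
from collections import deque


def stripe_lft_sets(blocks_per_switch, limit):
    """Group LFT block sets into rounds, at most `limit` per switch per round.

    blocks_per_switch: dict mapping a (hypothetical) switch name to the
    number of LFT blocks that still need to be written to it.
    Returns a list of rounds; each round is a list of (switch, block) sends
    that would be outstanding at the same time in this simplified model.
    """
    pending = {sw: deque(range(n)) for sw, n in blocks_per_switch.items()}
    rounds = []
    while any(pending.values()):
        batch = []
        for sw, queue in pending.items():
            # Take up to `limit` blocks for this switch in the current round.
            for _ in range(min(limit, len(queue))):
                batch.append((sw, queue.popleft()))
        rounds.append(batch)
    return rounds


# Three switches, three blocks each, at most two outstanding SMPs per switch.
rounds = stripe_lft_sets({"sw1": 3, "sw2": 3, "sw3": 3}, limit=2)
peak_in_flight = max(len(r) for r in rounds)
```

With these inputs the first round carries 3 * 2 = 6 sends, which is the N*limit figure discussed above. Removing the per-switch cap corresponds to the "N*no limit" case, where every needed block for every switch is sent at once.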
> > I first implemented an algorithm like this for switches based on Gamla
> > chips, and then for Anafa. If something doesn't support it, it is
> > very uncommon.
>
> I'm aware of at least two very different switches where this is the case.
Well, that's horrible - but again, I personally have a hard time
caring if using LID routing gives even a 5% reduction in setup time
with compliant devices.
I suppose if you really cared it would be easy to blacklist certain
devices.
> Understood; that's what I meant when I wrote below that it's harder
> and more expensive computationally. I think that it's also overly
> pessimistic so the number might want to be made artificially higher
> based on experience that these SMPs can be pipelined quite a bit more
> than this would allow.
That is highly device dependent - some devices have more CPU SMP
buffering than others and this affects things greatly.
Jason