[ofa-general] [PATCH] opensm: Parallelize (Stripe) LFT sets across switches

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Wed Jul 22 17:51:59 PDT 2009


On Wed, Jul 22, 2009 at 08:28:25PM -0400, Hal Rosenstock wrote:

> > But you overload the switch the SM is connected to with processing
> > N*limit DR SMPs rather than just 'limit' SMPs. That is what concerns
> > me.
> 
> As I said, the current algorithm is worse as it sends N*no limit DR
> SMPs (where no limit means any needed blocks). Not sure that VL15
> droppage due to this has been identified. So I think this improves on
> what's been deployed and seemingly works in OpenSM for quite some time
> now.

Hmm, OK I didn't realize that.

I've heard reports of VL15 droppage in real networks; maybe this is why.
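
To make the N*limit point concrete, here is a rough model of how many DR SMPs could be in flight through the SM's own switch at once. It is only a sketch, not OpenSM code: the function name and the idea of counting "blocks still to set per switch" are assumptions made for illustration.

```python
def peak_smps_through_sm_switch(blocks_per_switch, limit=None):
    """Worst-case count of DR SMPs in flight at the same time.

    Every directed-route SMP leaves through the switch the SM is
    attached to, so that switch sees the sum over all target switches.

    blocks_per_switch: LFT blocks still to be set on each switch.
    limit: per-switch cap on outstanding SMPs (None means no cap).
    """
    if limit is None:
        # No cap: every needed block may be outstanding at once.
        return sum(blocks_per_switch)
    # Striped with a cap: at most `limit` outstanding per switch.
    return sum(min(limit, blocks) for blocks in blocks_per_switch)


# Hypothetical fabric: 4 switches, 10 LFT blocks each.
uncapped = peak_smps_through_sm_switch([10] * 4)
capped = peak_smps_through_sm_switch([10] * 4, limit=2)
```

With these made-up numbers the capped case is N*limit = 4*2 SMPs, against 40 with no cap, which is the comparison Hal is making.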

> > I first implemented an algorithm like this for switches based on Gamla
> > chips, and then for Anafa. If something doesn't support it, it is
> > very uncommon.
> 
> I'm aware of at least two very different switches where this is the case.

Well, that's horrible - but again, I personally have a hard time
caring about non-compliant devices if using LID routing gives even
a 5% reduction in setup time with compliant ones.

I suppose if you really cared it would be easy to blacklist certain
devices.
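
A blacklist of that kind could look roughly like the following. This is a hypothetical sketch: the (vendor ID, device ID) key, the placeholder entry, and the function name are all assumptions, not anything taken from OpenSM.

```python
# Hypothetical blacklist of (vendor_id, device_id) pairs for switches
# that do not handle LID-routed LFT sets. The entry is a placeholder,
# not a real device.
LID_ROUTE_BLACKLIST = {
    (0x000000, 0x0001),
}


def use_lid_routing(vendor_id, device_id, blacklist=LID_ROUTE_BLACKLIST):
    """Return True if LFT sets to this switch may be LID routed.

    Blacklisted devices fall back to directed-route SMPs.
    """
    return (vendor_id, device_id) not in blacklist
```

Anything not listed is treated as compliant, so only the known-bad devices pay the cost of directed routing.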

> Understood; that's what I meant when I wrote below that it's harder
> and more expensive computationally. I think that it's also overly
> pessimistic so the number might want to be made artificially higher
> based on experience that these SMPs can be pipelined quite a bit more
> than this would allow.

That is highly device-dependent: some devices have more CPU SMP
buffering than others, and this affects things greatly.

Jason


