[ofa-general] [PATCH] opensm: Parallelize (Stripe) LFT sets across switches

Hal Rosenstock hal.rosenstock at gmail.com
Thu Jul 23 07:26:12 PDT 2009


On Wed, Jul 22, 2009 at 8:51 PM, Jason Gunthorpe <jgunthorpe at obsidianresearch.com> wrote:
> On Wed, Jul 22, 2009 at 08:28:25PM -0400, Hal Rosenstock wrote:
>
>> > But you overload the switch the SM is connected to with processing
>> > N*limit DR SMPs rather than just 'limit' SMPs. That is what concerns
>> > me.
>>
>> As I said, the current algorithm is worse as it sends N*no limit DR
>> SMPs (where no limit means any needed blocks). Not sure that VL15
>> droppage due to this has been identified. So I think this improves on
>> what's been deployed and seemingly works in OpenSM for quite some time
>> now.
>
> Hmm, OK I didn't realize that.
>
> I've heard of reports of VL15 droppage in real networks, maybe this is why..

Could be; another possible source is any tool or app that uses VL15,
such as diagnostics run concurrently (ibnetdiscover, ibdiagnet, many
other infiniband-diags, ...).

The point is that these subnets seem to recover and keep working, so
the retry must be working.

-- Hal


