[ewg] Review of OFA Logo NFS/RDMA testing

Hal Rosenstock hal at dev.mellanox.co.il
Fri Apr 11 08:05:39 PDT 2014


On 4/11/2014 10:44 AM, Richard Croucher wrote:
> The proposal did not look like it was mixing SMs. 

I was commenting on the following:
"Previously we were executing our InfiniBand tests with OpenSM as the
Master SM, then disabling OpenSM and turning on the Subnet Managers that
are included in our InfiniBand switches. This was meant to expose any
interoperability issues with, say, a Subnet Manager on an Intel switch
controlling a fabric comprised of Mellanox and Intel HCAs. We have since
scaled back and only require a device to be interoperable when using
OpenSM."

and the context of my response was changing SMs "on the fly" rather than
a subnet "restart" from scratch with homogeneous SMs.

> It was seeing whether
> different switches can be managed using different SMs. This is
> important in mixed environments.

> I've personally found that there are many more problems now when mixing
> different vendors' switches into the same subnet than there used to be.

What kind of problems ?

> InfiniBand used to be very good in this respect but I'd not risk trying
> to mix switches anymore, even when you
> turn off their embedded SMs and try to run with OpenSM.

Is there any documentation/reports/emails on the specific issues/problems ?

-- Hal

> 
> On 11/04/14 14:35, Hal Rosenstock wrote:
>> On 4/9/2014 2:53 PM, Edward Mossman wrote:
>>> Chuck,
>>>
>>> Thanks for taking the time to look through our test plan and provide
>>> suggestions. I will bring these suggestions to the next IWG meeting and
>>> we will vote on whether to include some or all in the test plan.
>>>
>>> Previously we were executing our InfiniBand tests with OpenSM as the
>>> Master SM, then disabling OpenSM and turning on the Subnet Managers that
>>> are included in our InfiniBand switches. This was meant to expose any
>>> interoperability issues with, say, a Subnet Manager on an Intel switch
>>> controlling a fabric comprised of Mellanox and Intel HCAs. We have since
>>> scaled back and only require a device to be interoperable when using
>>> OpenSM.
>> Switching between SMs is dangerous, goes beyond the spec, and is
>> discouraged by the IBTA (there is an IBTA white paper on this from the
>> IBTA MgtWG). It relies on consistent behavior across areas of the SM
>> that are governed by SM-specific policies, which can affect quite a
>> number of things in the subnet. The only thing that is supposed to
>> work is failover to the same "flavor" of SM.
>>
>> -- Hal
>>
>>> Thanks,
>>> Edward
>>>
>>> On Mon, 7 Apr 2014, Chuck Lever wrote:
>>>
>>>> Hi Edward-
>>>>
>>>> After reviewing your NFS/RDMA Logo test plan
>>>> (https://iol.unh.edu/ofatestplan), I had some thoughts.
>>>>
>>>> No "vers=" mount option is specified on your clients, thus only one
>>>> NFS version is tested.
>>>>
>>>> The default NFS version depends on the client version and server
>>>> configuration, so it is better to set the NFS version explicitly. I
>>>> recommend adding a specific "vers=" setting to your scripted mount
>>>> commands and running the cthon tests at least three times (with a
>>>> umount/mount between each run): once for vers=2, once for vers=3, and
>>>> once for vers=4 (4.0). Eventually 4.1 and 4.2 should be added when
>>>> Linux NFS/RDMA is updated to support those minor versions. For now the
>>>> two critical NFS versions are NFSv3 and NFSv4.0.
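The per-version mount loop suggested above could be sketched roughly as follows. The server name, export path, and mount point are illustrative placeholders (not from the test plan), and commands are echoed rather than executed so the sketch is safe to run as-is:

```shell
#!/bin/sh
# Sketch of an explicit-version NFS/RDMA test loop, with placeholder
# server/export/mount-point values. Drop the leading "echo"s to run
# the commands for real.
SERVER=nfs-server
EXPORT=/export
MNT=/mnt/nfstest

for VERS in 2 3 4.0; do
    # proto=rdma selects the RDMA transport on Linux clients;
    # 20049 is the IANA-assigned NFS/RDMA port.
    echo mount -t nfs -o "vers=$VERS,proto=rdma,port=20049" "$SERVER:$EXPORT" "$MNT"
    echo "# run cthon tests against $MNT, then:"
    echo umount "$MNT"
done
```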
>>>>
>>>> Since you have a broad array of hardware in your test harness, that
>>>> would be an opportunity for more extensive platform interoperability
>>>> testing. The following areas might be interesting and appropriate.
>>>>
>>>>      • 32-bit v. 64-bit
>>>>      • 4KB pages v. other page sizes
>>>>      • Little v. big endian
>>>>
>>>> Simply ensure your test matrix includes these combinations of clients
>>>> and servers. POWER systems will already give you large page sizes, for
>>>> example.
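Treating the three platform attributes above as independent axes, a minimal sketch of the resulting matrix looks like this (the attribute values are illustrative examples, not the lab's actual inventory):

```shell
#!/bin/sh
# Enumerate illustrative platform variants from the three attribute axes
# (word size, page size, endianness), then count client/server pairings.
n=0
for word in 32-bit 64-bit; do
    for page in 4KB 64KB; do
        for endian in little big; do
            n=$((n + 1))
            echo "platform $n: $word, $page pages, $endian-endian"
        done
    done
done
# Pairing every client variant with every server variant:
echo "$n platforms -> $((n * n)) client/server pairs"
```

With two values per axis this yields 8 platform variants and 64 client/server pairings, which gives a sense of why a broad hardware harness is needed to cover it.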
>>>>
>>>> I am curious about the test plan’s requirement to run your NFS/RDMA
>>>> tests using every SM you have in your lab. Can you elaborate on that
>>>> requirement?
>>>>
>>>> -- 
>>>> Chuck Lever
>>>> chuck[dot]lever[at]oracle[dot]com
>>> _______________________________________________
>>> ewg mailing list
>>> ewg at lists.openfabrics.org
>>> http://lists.openfabrics.org/mailman/listinfo/ewg



