[ofw] RE: installation/connectivity problems on hpc server

Anatoly Greenblatt anatolyg at voltaire.com
Mon Nov 16 13:52:26 PST 2009


This change landed somewhere between the OFED 1.3.1 and 1.4.1 timeframes. Until recently I used opensm with OFED 1.3.1.

The sa_query from __port_get_bcast fails.

But now I'm confused. I looked into the linux code, and the sm_key is not set there either.

This leads me to think that the difference is in some packing or alignment.
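
To illustrate why packing could matter here, below is a minimal sketch, not code from the ofw or linux trees. It assumes the usual SA MAD layout (24-byte MAD header, 12-byte RMPP header, then the SA header starting with the 8-byte sm_key); the struct and function names are hypothetical. Because the sm_key sits at byte 36, which is not a multiple of 8, a compiler that aligns 64-bit integers to 8 bytes will pad an unpacked struct and move the field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified SA MAD layouts for illustration only. */
#pragma pack(push, 1)
struct sa_mad_packed {
    uint8_t  mad_hdr[24];   /* common MAD header */
    uint8_t  rmpp_hdr[12];  /* RMPP header */
    uint64_t sm_key;        /* first field of the SA header, byte 36 */
    uint16_t attr_offset;
    uint16_t reserved;
    uint64_t comp_mask;
    uint8_t  data[200];     /* SA payload */
};
#pragma pack(pop)

/* Same fields, but the compiler is free to insert alignment padding. */
struct sa_mad_unpacked {
    uint8_t  mad_hdr[24];
    uint8_t  rmpp_hdr[12];
    uint64_t sm_key;
    uint16_t attr_offset;
    uint16_t reserved;
    uint64_t comp_mask;
    uint8_t  data[200];
};

/* Offset of sm_key when the struct matches the wire format. */
size_t packed_sm_key_offset(void)
{
    return offsetof(struct sa_mad_packed, sm_key);
}

/* Offset of sm_key when the compiler chooses the layout. */
size_t unpacked_sm_key_offset(void)
{
    return offsetof(struct sa_mad_unpacked, sm_key);
}

/* Sanity check that the packed struct is exactly one 256-byte MAD. */
void check_wire_layout(void)
{
    assert(packed_sm_key_offset() == 36);
    assert(sizeof(struct sa_mad_packed) == 256);
}
```

On ABIs that align 64-bit integers to 8 bytes, unpacked_sm_key_offset() would typically come out larger than 36, so a receiver reading the key at the wire offset would see different bytes. Whether anything like this actually happens in the drivers is only a guess.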

Thanks,
Anatoly.

-----Original Message-----
From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com] 
Sent: Monday, November 16, 2009 11:31 PM
To: Anatoly Greenblatt
Cc: Sean Hefty; Fab Tillier; Tzachi Dar; Smith, Stan; ofw at lists.openfabrics.org
Subject: Re: [ofw] RE: installation/connectivity problems on hpc server

On Mon, Nov 16, 2009 at 4:13 PM, Anatoly Greenblatt
<anatolyg at voltaire.com> wrote:
> Thanks Hal,
>
> But the difference in behavior between the linux and windows nodes is
> probably because the sm_key is never set in the windows drivers. The sm_key
> is defined in sa_hdr, which is part of the mad. If I'm not mistaken, mad
> buffers are zero-allocated.
>
> OpenSM expects OSM_DEFAULT_SM_KEY, and it worked until recently because of
> this endianness bug.

That change was made 5/22/08. That's recent in terms of Windows OpenSM :-)
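
For readers following the endianness point, here is a minimal sketch of the effect, assuming a 64-bit key with value 1. The function names are hypothetical and this is not OpenSM code. It compares the key written in network (big-endian) order with the key's in-memory bytes copied out unchanged, which is what "host byte order" amounts to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 64-bit key in network (big-endian) byte order. */
void key_to_wire(uint64_t key, uint8_t out[8])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(key >> (56 - 8 * i));
}

/* Copy the key's in-memory bytes unchanged (host byte order). */
void key_host_bytes(uint64_t key, uint8_t out[8])
{
    assert(out != NULL);
    memcpy(out, &key, 8);
}

/* Return 1 if the two encodings of `key` differ on this machine. */
int encodings_differ(uint64_t key)
{
    uint8_t wire[8], host[8];

    key_to_wire(key, wire);
    key_host_bytes(key, host);
    return memcmp(wire, host, 8) != 0;
}
```

On a big-endian machine the two encodings of 1 are identical, so nothing changes there. On a little-endian machine the host-order copy puts the 01 byte first instead of last, so a peer comparing against the network-order value would see a mismatch.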

> Sean, can you check whether the sm_key is set in the linux code
> (specifically in the ipoib default broadcast query/join path)?

Which SA query/queries is/are failing? The trust is only needed for
certain MCMemberRecord queries (not joins/leaves/etc. though), certain
ServiceRecord queries, PortInfoRecord and PKeyTableRecord queries, and
certain Sets of InformInfo.

-- Hal


> Anatoly.
>
>
> -----Original Message-----
> From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
> Sent: Monday, November 16, 2009 10:52 PM
> To: Anatoly Greenblatt
> Cc: Sean Hefty; Fab Tillier; Tzachi Dar; Smith, Stan;
> ofw at lists.openfabrics.org
> Subject: Re: [ofw] RE: installation/connectivity problems on hpc server
>
> On Mon, Nov 16, 2009 at 3:32 PM, Anatoly Greenblatt
> <anatolyg at voltaire.com> wrote:
>> I think this bug can be noticed only when opensm runs on ppc while the
>> hosts run on intel platforms, or vice versa.
>
> From opensm_release_notes:
> OpenSM Compatibility
> --------------------
> Note that OpenSM version 3.2.1 and earlier used a value of 1 in host
> byte order for the default SM_Key, so there is a compatibility issue
> with these earlier versions of OpenSM when the 3.2.2 or later version
> is running on a little endian machine. This affects SM handover as well
> as SA queries (saquery tool in infiniband-diags).
>
>> Also, I'd appreciate it if anyone could point me to the place where this
>> sa_key is filled in for the sa query (or how it is mapped).
>
> The --smkey option in saquery allows this to be specified.
>
> -- Hal
>
>> Thanks,
>> Anatoly.
>>
>> -----Original Message-----
>> From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
>> Sent: Monday, November 16, 2009 8:53 PM
>> To: Sean Hefty
>> Cc: Anatoly Greenblatt; Fab Tillier; Tzachi Dar; Smith, Stan;
>> ofw at lists.openfabrics.org
>> Subject: Re: [ofw] RE: installation/connectivity problems on hpc server
>>
>> On Mon, Nov 16, 2009 at 11:46 AM, Sean Hefty <sean.hefty at intel.com>
>> wrote:
>>>>Stan/Sean, when you perform linux/windows dapl tests do you use windows
>>>>or linux opensm and what is the value of sa_key?
>>>
>>> I always use a linux version of opensm, since I share switches with a linux
>>> cluster.  I don't know what value the sa_key is, but it's 99% likely to be
>>> whatever the default is.
>>>
>>> Do you know what exactly the sa_key maps to?
>>
>> The default sa_key depends on which version of OpenSM is being used.
>>
>> # Note that for both values above (sm_key and sa_key)
>> # OpenSM version 3.2.1 and below used the default value '1'
>> # in a host byte order, it is fixed now but you may need to
>> # change the values to interoperate with old OpenSM running
>> # on a little endian machine.
>>
>>> Is this how the user specifies the
>>> SM key, or is it something else?
>>
>> There are two separate keys: sm_key and sa_key.
>>
>> -- Hal
>>
>>> - Sean
>>>
>>> _______________________________________________
>>> ofw mailing list
>>> ofw at lists.openfabrics.org
>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
>>>
>>
>


